Russ Corbett-Detig (@russcd) Bsky

Pangenomes, but scalable.

Panmap: a phylogeny-guided framework for read alignment, genotyping, and sample placement on pangenomes. 600x smaller indexes, faster builds, and placement from 20K to 8M genomes. @amkram.bsky.social @alanbyzhang.bsky.social @russcd.bsky.social

www.biorxiv.org/content/10.6...

2 weeks ago 10 4 0 2

eDNA is possibly the coolest application! Below: read-level placement across a pan-vertebrate mitochondrial pangenome (left) where a clear cluster of read assignments supports the presence of many mammoth reads (right). Processing ~9 billion reads took just 8 minutes!

3 weeks ago 12 5 0 0

Even in complex mixtures, Panmap does a really good job of sorting out the identities of the lineages present and their relative contributions to the sample.

3 weeks ago 3 0 1 0

Panmap can also evaluate sample mixtures (e.g., what you get from wastewater or eDNA), by scoring the reads separately against every node and then estimating the abundances of lineages via EM.

3 weeks ago 1 0 1 0
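The EM step mentioned in the post above can be illustrated with a minimal sketch. This is a generic mixture-proportion EM, not Panmap's implementation: the function name `em_abundances`, the input layout (one row of per-lineage read likelihoods per read), and the convergence settings are all assumptions made for illustration.

```python
def em_abundances(likelihoods, n_iter=200, tol=1e-9):
    """Estimate lineage proportions from per-read likelihoods.

    likelihoods[r][k] is an (assumed, precomputed) score proportional to
    P(read r | lineage k). Returns one proportion per lineage, summing to 1.
    """
    n_lineages = len(likelihoods[0])
    props = [1.0 / n_lineages] * n_lineages  # start from a uniform mixture

    for _ in range(n_iter):
        counts = [0.0] * n_lineages
        # E-step: split each read fractionally across lineages.
        for row in likelihoods:
            weighted = [p * l for p, l in zip(props, row)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read is incompatible with every lineage; skip it
            for k, w in enumerate(weighted):
                counts[k] += w / total
        assigned = sum(counts)
        if assigned == 0.0:
            return props  # nothing could be assigned; keep current estimate
        # M-step: new proportions are the normalized fractional counts.
        new_props = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(props, new_props))
        props = new_props
        if delta < tol:
            break
    return props


# Hypothetical input: 6 reads that only fit lineage 0, 2 that only fit
# lineage 1, and 2 ambiguous reads that fit both equally well.
reads = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 2 + [[0.5, 0.5]] * 2
proportions = em_abundances(reads)
```

Ambiguous reads end up shared in proportion to the current abundance estimates, which is what lets the unambiguous reads inform how the ambiguous ones are split.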

Because we score placements dynamically during a single tree traversal, it is very fast. Panmap takes less than one second to place a SARS-CoV-2 sample onto a 20,000-sample phylogeny, align the reads, and call genotypes.

3 weeks ago 1 0 1 0

Panmap gets really accurate placements, even at very low sequencing depths.

3 weeks ago 1 0 1 0

There is a ton of cool stuff you can do using the index. In the first application, Panmap places a single sample onto the tree *without assembly* using the raw reads, and can then align and genotype based on this closest known relative.

3 weeks ago 2 0 1 0

The central innovation of Panmap is indexing PanMANs by producing a “syncmer annotated tree”, where seed edits are stored only where sequences on the phylogeny change (including inferred ancestors). The index therefore exploits a type of “phylogenetic compression” and is very compact.

3 weeks ago 2 0 1 0
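A toy sketch of the "store seed edits only where sequences change" idea from the post above. Everything here is a simplifying assumption rather than Panmap's actual index: closed syncmers are chosen with plain lexicographic ordering instead of a hash, the values of `k` and `s` are arbitrary, and only substitutions are handled, so coordinates never shift. The helper names `syncmers`, `seed_edits`, and `apply_edits` are hypothetical.

```python
def syncmers(seq, k=5, s=2):
    """Return {start: kmer} for closed syncmers of seq.

    A k-mer counts as a closed syncmer here if its smallest s-mer
    (lexicographic order, an assumption) is its first or last s-mer.
    """
    found = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        smers = [kmer[j:j + s] for j in range(k - s + 1)]
        smallest = min(smers)
        if smers[0] == smallest or smers[-1] == smallest:
            found[i] = kmer
    return found


def seed_edits(parent_seeds, child_seeds):
    """Return {start: (old, new)} for every seed position that differs."""
    edits = {}
    for pos in parent_seeds.keys() | child_seeds.keys():
        old, new = parent_seeds.get(pos), child_seeds.get(pos)
        if old != new:
            edits[pos] = (old, new)  # None marks an absent seed
    return edits


def apply_edits(parent_seeds, edits):
    """Rebuild a child's seed set from its parent's seeds plus the edits."""
    child = dict(parent_seeds)
    for pos, (_old, new) in edits.items():
        if new is None:
            child.pop(pos, None)
        else:
            child[pos] = new
    return child


# Hypothetical parent node and a child that differs by one substitution.
parent = "ACGTTGCATGACCGTAGCTA"
child = parent[:9] + "A" + parent[10:]
branch_edits = seed_edits(syncmers(parent), syncmers(child))
```

Only k-mers overlapping the mutated base can change, so the per-branch edit list stays small no matter how long the genome is. Walking from the root and applying each branch's edits reproduces the seed set at any node without storing it in full.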

Compressive pangenomics using mutation-annotated networks - Nature Genetics Pangenome Mutation-Annotated Network (PanMAN) is a pangenome data structure that encodes shared mutational and evolutionary histories across microbial genomes, providing both high compression and enha...

This is the result of our incredibly rewarding collaboration with Yatish Turakhia and members of his lab. In particular, and foundational to our work, we previously developed the underlying phylogenetic pangenome data structure, PanMAN.
www.nature.com/articles/s41...

3 weeks ago 2 1 1 0

Alex Kramer, Alan Zhang and friends posted our preprint today. In it, we introduce Panmap, a tool for phylogenetic placement, assembly, lineage abundance estimation, and eDNA assignment using phylogenetic pangenomes.

www.biorxiv.org/content/10.6...

3 weeks ago 28 17 1 0

Open question for anyone who develops/maintains bioinformatics tools, especially those used for public health - what do you think is the right way to fund this long term?

Grants are great for new research directions, but aren't really appropriate for most tool dev/maintenance.

1 month ago 9 6 4 0

Disclaimer: I am a terrible web developer and this extension should not be used by anyone.

1 month ago 0 0 0 0

GitHub - russcd/corePivotal Contribute to russcd/corePivotal development by creating an account on GitHub.

The chrome extension can do this for you! github.com/russcd/coreP...

1 month ago 1 0 1 0

Thanks so much, Ian! This means a lot coming from a mighty core-pivotal 17.

1 month ago 1 0 1 0

I’m already getting community feedback on the core-pivotal index. I truly appreciate it. But in the spirit of core-pivoteering, I will only implement suggestions from whoever has the highest core-pivotal index.

1 month ago 8 1 0 0

Why this matters:

As biobanks + population sequencing projects scale into the hundreds of thousands (and millions), the bottleneck isn’t just inference. It’s exploration.

Feedback welcome 🌲

1 month ago 3 0 0 0

Truly, we harvest the truffulas standing on very tall shoulders.

1 month ago 5 0 2 0

On the visuals inspiration side: a huge thanks to @theo.io and Taxonium.

Taxonium showed the community that we can interactively explore enormous phylogenetic trees in the browser at pandemic scale.

Lorax brings a similar philosophy to ARGs.

1 month ago 1 0 1 0

Under the hood, Lorax runs on the incredibly powerful data model + API from tskit, developed by Jerome Kelleher and collaborators.

Tree sequences make it possible to store and traverse genome-wide genealogies efficiently, and Lorax uses that structure directly in the backend.

1 month ago 2 0 1 0

What does Lorax do?

It lets you dynamically explore tree sequences at massive scales - zooming through local trees, inspecting mutations, querying ancestry.

Built for *huge* datasets with up to millions of samples.

1 month ago 2 0 1 0

tl;dr:

Read about Lorax: www.biorxiv.org/content/10.6...

Try Lorax: lorax.ucsc.edu/view/1kg_chr...

Install Lorax: pypi.org/project/lora...

1 month ago 2 0 1 0

@pratikkatte.bsky.social and I just released Lorax 🌲, a tool for interactive exploration of biobank-scale ancestral recombination graphs (ARGs).

If you’ve ever wanted to scroll across the ancestries of thousands of genomes… this is for you.

1 month ago 39 26 2 0

We have posted data providing real-time measurement of human neutralizing antibody landscape to seasonal influenza.

Data explain spread of subclades K (H3N2) & D.3.1.1 (H1N1), identify subclade K subvariants w reduced neutralization, & can inform choice of strains for next vaccine.

2 months ago 76 37 1 1

Addressing pandemic-wide systematic errors in the SARS-CoV-2 phylogeny - Nature Methods This Resource paper presents a global SARS-CoV-2 phylogenetic tree of 4,471,579 high-quality genomes consistently constructed by Viridian, an efficient amplicon-aware assembler.

A long time ago in a galaxy far away, there was a SARS-CoV-2 pandemic. Our paper, led by @martibartfast.bsky.social
a) correcting errors in 4.5 million genomes & their phylogeny
b) improving representation of the Global South in public data
www.nature.com/articles/s41...
(thread 1/n)

2 months ago 137 66 3 6

Efficient Estimation of Nucleotide Diversity and Divergence Using Callable Loci (and More) Abstract. The increasing scale of population genomic datasets presents computational challenges in estimating summary statistics such as nucleotide diversi

@cademirch.bsky.social @erikenbody.bsky.social TB Sackton & @russcd.bsky.social introduce Callable Loci And More (clam), a tool that leverages callable loci to accurately estimate population genetic statistics (π, dxy, and FST).

🔗 doi.org/10.1093/molbev/msaf282

#evobio #molbio #compbio

4 months ago 20 14 0 0
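To illustrate why callable loci matter for the statistics named in the post above, here is a minimal sketch of per-site nucleotide diversity (π). It is not clam's implementation: the function name `pi_per_site` and its inputs are hypothetical, and it assumes every haplotype is callable at the same set of sites, which real data rarely satisfies.

```python
def pi_per_site(alt_counts, n_haplotypes, n_callable_sites):
    """Average pairwise differences per callable site.

    alt_counts: alternate-allele count at each biallelic variant site.
    n_haplotypes: number of sampled haplotypes (assumed all callable).
    n_callable_sites: sites, variant or invariant, that could be genotyped.
    """
    n_pairs = n_haplotypes * (n_haplotypes - 1) / 2
    # At a site with c alternate alleles, c * (n - c) pairs differ.
    diffs = sum(c * (n_haplotypes - c) for c in alt_counts) / n_pairs
    return diffs / n_callable_sites


# Hypothetical window: 200 bp long, but only 100 bp are callable.
alt_counts = [1, 2]
pi_callable = pi_per_site(alt_counts, n_haplotypes=4, n_callable_sites=100)
pi_naive = pi_per_site(alt_counts, n_haplotypes=4, n_callable_sites=200)
```

Dividing by the full window length treats uncallable sites as invariant, so `pi_naive` is biased downward relative to `pi_callable`.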

Signals of Ancestry-Specific Selection in Gentle Africanized Honey Bees Abstract. Understanding the genetic basis of adaptive responses to environmental and human mediated pressures is a central concern in evolutionary biology.

Genetti & @russcd.bsky.social investigated Puerto Rico honeybees, suggesting that local pressures on bee behavior may have induced changes in alleles linked to different ancestries at loci involved in neuronal development, behavior, and mating.

🔗 doi.org/10.1093/gbe/evaf217

#genome #evolution

4 months ago 3 1 0 0

I am 100% stoked to lead this effort again this year! Come on out to beautiful Santa Cruz and show off your awesome-est science.

4 months ago 3 1 0 0

Our lab has an opening for a research technician to contribute to our efforts to understand RSV evolution & its impact on antibody countermeasures (see journals.asm.org/doi/full/10....). The tech will also help w lab management.

If interested, apply here: careers-fhcrc.icims.com/jobs/29940/job

8 months ago 30 19 0 1

Thank you! I might need to steal this wonderful artwork!

9 months ago 1 0 1 0

Super glad to see this out!

Lovely collaboration with @russcd.bsky.social @crouxevo.bsky.social and the groups of @meauxjuliette.bsky.social @plantadaptation.bsky.social

10 months ago 13 7 0 0
