Posts by Matt McGuffie

Adult male (small) and female (large) of Armillifer sp. Looks like a big and small crinkle cut french fry, also with paired hooks at the front. 

Both photos from this paper:
https://journals.plos.org/plosntds/article?id=10.1371/journal.pntd.0000320

Adult female Linguatula serrata. Looks like a wrinkly tube, a bit wider at the front and tapering to a tail. There are some hooklike bits at the front.

🚨 Don't eat uncooked snake! Your eyes & lungs can be parasitized by pentastomids, which are a CRUSTACEAN. Adults lack so many features that they weren't believed to be arthropods at all until their DNA was sequenced (and sperm morphology, but that was not widely accepted). Yet, they are!
#Crustmas 🧪

2 years ago 359 82 37 35
mim: A lightweight auxiliary index to enable fast, parallel, gzipped FASTQ parsing The FASTQ file format is the lingua franca of primary data distribution and processing across most of bioinformatics. Over time, the compression, storage, transmission, and decompression of gzip compr...

Woot! The mim preprint is live now. www.biorxiv.org/content/10.1...

Happy Thanksgiving!

Cc @curiouscoding.nl

4 months ago 25 9 1 1
Intracellular competition shapes plasmid population dynamics From populations of multicellular organisms to selfish genetic elements, conflicts between levels of biological organization are central to evolution. Plasmids are extrachromosomal, self-replicating g...

Hot off the press! Our latest paper led by @fernpizza.bsky.social, understanding how plasmids evolve inside cells. These small, self-replicating DNA circles live inside bacteria and carry antibiotic resistance genes, but also compete with one another to replicate. 1/
www.science.org/doi/10.1126/...

4 months ago 437 199 11 18
Average nucleotide identity — the backbone of modern ecological genomics - Nature Reviews Genetics In this Journal Club, Luis Orellana recalls a 2005 publication by Konstantinidis and Tiedje that introduced average nucleotide identity as a sequence-based metric to determine the relatedness between ...

The average nucleotide identity (ANI) underpins how we map microbial diversity, compare species, and connect genomes to ecology.
I wrote a short piece reflecting on the discovery and significance of this metric (and really enjoyed digging into the context and story behind it!) #microsky 🧬

5 months ago 58 22 1 2
Unannotated translation products are widespread in model E. coli Genomes contain orders of magnitude more open reading frames (ORFs) than known protein coding genes, and recent work suggests there may be unannotated proteins present in even the best studied organisms. To address this gap, we used a high throughput reverse genetic toolkit to construct precise C-terminal fusions of a reporter (and control) to >120,000 ORFs in model E. coli. We found hundreds of unannotated significant hits, and individually detected >50 novel polypeptides by western blot, including ORFs within tRNA loci. Many ORFs overlap annotated genes in the sense orientation, and we found these are likely chimeric polypeptides produced by ribosomal frameshifting. Using degron based knockdowns, we identified unannotated proteins that have putative fitness effects, and we found a novel small protein that displays phenotypes consistent with a role in the mRNA degradosome. The observation of a range of unannotated translation products should lead to better annotation and understanding of the bacterial domain of life and motivates the continued exploration of genomes broadly.

Unannotated translation products are widespread in model E. coli | bioRxiv www.biorxiv.org/content/10.1101/2025.09....

6 months ago 18 12 0 2

C. elegans is a real animal and we set out to understand how it comes to have its distinctive biogeography. Its ancestral center of diversity is in the higher elevation forests of Hawaii. Its closest relatives are spread across east Asia. Did they travel from Asia? [Preprint 🧵]

6 months ago 167 79 5 7

Heads up: ignore samtools dot org, similarly minimap2 dot com and likely others. It's owned by a known phishing site and while the binaries they offer look valid currently (but note they may be serving us different binaries to others), that could change.

Ie: it's not us (Samtools team)! Be warned

7 months ago 146 127 2 5

tgv 0.1.0 release: github.com/zeqianli/tgv
- Rich CIGAR and base visualization
- Allele frequency visualization
- VCF and BED file support
- Mouse dragging and hovering
- Filter alignment

Now 90% of what I need from IGV can be done in the terminal.

Some interesting behind-the-scenes:

7 months ago 11 7 1 0
tskit_arg_visualizer: interactive plotting of ancestral recombination graphs Summary: Ancestral recombination graphs (ARGs) are a complete representation of the genetic relationships between recombining lineages and are of central importance in population genetics. Recent brea...

Excited to share our new preprint for the tskit_arg_visualizer Python package! ARGs can sometimes feel like a black box, so
@yanwong.bsky.social and I have been developing a method to programmatically draw these graphs.

🔗 arxiv.org/abs/2508.03958

1/6

7 months ago 63 35 2 2

Oatk: a de novo assembly tool for complex plant organelle genomes. #DeNovoAssembly #OrganelleGenomes #Bioinformatics #GenomeBiology
genomebiology.biomedcentral.com/articles/10....

8 months ago 5 4 0 0
A diverse and distinct microbiome inside living trees - Nature Microbiome analyses of living trees show that a single tree can host approximately one trillion bacteria, with microbial communities distinctly partitioned between heartwood and sapwood and with minim...

#NatMicroPicks

Hidden microbial world in trees🌳

Living wood hosts trillions of bacteria, making trees complex ecosystems with major roles in forest health and function.

#PlantMicro #MicroSky

www.nature.com/articles/s41...

8 months ago 74 28 0 4

StrainR2 accurately deconvolutes strain-level abundances in synthetic microbial communities. #Metagenomics #StrainLevelAbundance #Bioinformatics
academic.oup.com/bioinformati...

8 months ago 2 2 1 0
Protein language models reveal evolutionary constraints on synonymous codon choice Evolution has shaped the genetic code, with subtle pressures leading to preferences for some synonymous codons over others. Codons are translated at different speeds by the ribosome, imposing constrai...

Protein language models reveal evolutionary constraints on synonymous codon choice
#rnasky #microsky "cotranslational localization and translational accuracy, more than cotranslational protein folding, are major drivers of selective pressure on codon choice" in yeast here 💫
doi.org/10.1101/2025...

8 months ago 3 1 0 0
Announcing taxburst, an update of the Krona software for taxonomy exploration Announcing taxburst for metagenome taxonomy!

taxburst v0.3.0 is now released - this is an update of the Krona visualization system for microbiome/metagenome taxonomy analyses. Enjoy!

8 months ago 26 17 0 0
Phyling: phylogenetic inference from annotated genomes Phyling is a fast, scalable, and user-friendly tool supporting phylogenomic reconstruction of species phylogenies directly from protein-encoded genomic data. It identifies orthologous genes by searchi...

This looks like an amazing tool
www.biorxiv.org/content/10.1...

8 months ago 44 19 3 0
Scaling down protein language modeling with MSA Pairformer Recent efforts in protein language modeling have focused on scaling single-sequence models and their training data, requiring vast compute resources that limit accessibility. Although models that use ...

Excited to share work with
Zhidian Zhang, @milot.bsky.social, @martinsteinegger.bsky.social, and @sokrypton.org
biorxiv.org/content/10.1...
TLDR: We introduce MSA Pairformer, a 111M parameter protein language model that challenges the scaling paradigm in self-supervised protein language modeling🧵

8 months ago 97 43 1 1
GitHub - lh3/longdust: Identify long STRs, VNTRs, satellite DNA and other low-complexity regions in a genome Identify long STRs, VNTRs, satellite DNA and other low-complexity regions in a genome - lh3/longdust

Fun new tool from Heng Li. Thinking maybe I can use this to help find plasmid replication gene correlated repeat regions - though he specifically mentions it's not for tandem repeat regions. Hmm. 🖥️🧬

github.com/lh3/longdust

8 months ago 7 2 0 0
Synteny-aware functional annotation of bacteriophage genomes with Phynteny Accurate genome annotation is fundamental to decoding viral diversity and understanding bacteriophage biology; yet, the majority of bacteriophage genes remain functionally uncharacterised. Bacteriopha...

🚨 New preprint 🚨

My phage annotation tool, Phynteny, finally has a preprint and a brand new version powered by a cool AI transformer architecture and protein language models! #phagesky

www.biorxiv.org/content/10.1...

8 months ago 85 42 2 1

A bit late to joining the Bluesky party, but it's great to see all the amazing scientists who are on this platform! Looking forward to connecting with all of you here (on twitter as @niranjantw ... so keeping the handle consistent).

11 months ago 5 2 2 0

AFESM: a metagenomic guide through the protein structure universe! We clustered 821M structures (AFDB&ESMatlas) into 5.12M groups; revealing biome-specific groups, only 1 new fold even after AlphaFold2 re-prediction & many novel domain combos. 🧵
🌐 afesm.foldseek.com
📄 www.biorxiv.org/content/10.1...

11 months ago 141 70 4 4
A pack of stickers with the logo of our new tool, K-mer Fast Counter a.k.a. KFC

I'll be presenting our work on hyper-k-mers at #RECOMB today at 10:40 KST!

You can get a sneak peek at the slides here: igor.martayan.org/slides-recom...

Come say hi if you'd like to chat, or just get one of these cute stickers!

11 months ago 18 4 1 0

Assemblies of long-read metagenomes suffer from diverse errors www.biorxiv.org/content/10.1101/2025.04....

11 months ago 23 11 0 1
Genomic divergence across the tree of life | PNAS Nucleotide sequence data are being harnessed to identify species, even in cases in which organisms themselves are neither in hand nor witnessed. Bu...

How #genome-wide sequence divergence maps to species status www.pnas.org/doi/10.1073/... #biodiversity #genomics

1 year ago 60 33 0 3

Uncalled4: a toolkit for nanopore signal alignment, analysis and visualization of DNA and RNA modifications.

www.nature.com/articles/s41...

1 year ago 48 26 1 1
Telomeric transposons are pervasive in linear bacterial genomes Eukaryotes have linear DNA, and their telomeres are hotspots for transposons, which in some cases took over telomere maintenance. We identified several families of independently evolved telomeric tran...

wow, telomeric transposons in bacteria with linear chromosomes! (of course this was first figured out in flies, including by Bob Levis, who i was happy to see a few days ago at the fly meeting). 🪰

www.science.org/doi/10.1126/...

www.sciencedirect.com/science/arti...

www.sciencedirect.com/science/arti...

1 year ago 62 36 0 1
Genetics, ecology and evolution of phage satellites Nature Reviews Microbiology, Published online: 27 March 2025; doi:10.1038/s41579-025-01156-z. In this Review, Penadés et al. explore the genetics, potential origins and life cycle of phage satellites, and they discuss the impact of these elements on the evolution of other mobile genetic elements and their host bacteria.

New online! Genetics, ecology and evolution of phage satellites

1 year ago 42 28 0 2
GitHub - rrwick/condaenvlist: a simple tool for listing conda environments with descriptions a simple tool for listing conda environments with descriptions - rrwick/condaenvlist

Do you (like me) create a bunch of conda environments, then later forget what they're for, when they were last updated, or which tools are in them?

If so, you might like this little project: github.com/rrwick/conda...

1 year ago 78 40 1 1
GitHub - yangao07/longcallD: A local-haplotagging-based small and structural variant caller A local-haplotagging-based small and structural variant caller - yangao07/longcallD

longcallD is a new variant caller for genomic long reads. It jointly calls phased small and structural variants. Single binary, one command line for the whole process. Comparable accuracy to mainstream callers. Great work by Yan Gao. github.com/yangao07/lon...

1 year ago 105 49 3 3
Bakta database This data repository contains the mandatory DB for Bakta. It is available in two versions: the default (db.tar.gz or) and a lightweight alternative (db-light.tar.gz). Bakta is a tool for the rapid & s...

🦠🧬🖥️ New Bakta DB v6.0 released!

After a year, it was time for a Bakta database update - and it's a huge one:
- IPS: 330.9M
- PSC: 135.3M
- PSCC: 37M

doi.org/10.5281/zeno...

👇 1/6

1 year ago 12 9 1 0

Interested in bacterial genomes?

Hundreds of thousands, even millions?

All annotated, taxonomically classified, integrated with metadata.

Easily searchable, viewable, downloadable, in sync with #AllTheBacteria.

Then BakRep is for you! Poster P-CM-102 @vaam-microbes.bsky.social #VAAM25

1 year ago 18 9 2 0