
Posts by Antonio Camargo

It’s that wonderful time of year again. A new GTDB release is out :)

6 days ago 14 4 0 1

Congratulations, @sdeorowicz.bsky.social!

6 days ago 0 0 0 0
Fast and accurate multiple-protein-sequence alignment at scale with FAMSA2 - Nature Biotechnology FAMSA2 accurately aligns millions of protein sequences at high speed.

10 years after the first FAMSA paper, its successor is now published in Nat Biotech! We believe that FAMSA2 can enable analyses of large protein collections that were previously unattainable. Thank you, Andrzej and Cedric, for the great collaboration
www.nature.com/articles/s41...

1 week ago 56 22 3 2

It doesn't really test the codes explicitly. It tests several gene models (some of which use alternative genetic codes) and picks the best one. If you are looking for new genetic codes, this won't work

2 weeks ago 1 0 0 0

It was a while ago, so I don't really remember the reasons. It could have been that I wanted something very specific... I do recall that I could not find a clear reference. Sorry for the bad feedback :/

Again, I'm not super skilled in Rust, so maybe it was something that you would consider trivial.

1 month ago 1 0 1 0

I tried to integrate it into a package a while ago, but my impression is that you didn't design it to be used as a library (could just be me being terrible at Rust), so I ended up dropping the idea

1 month ago 1 0 1 0

For conjugation, I recommend CONJscan.

For replication initiators, it's more complicated. You can try MOB-Typer (but it is a bit tough to get working) or finding the replication initiator with Pfams and then looking into the closest homologs. I'm not aware of an easy-to-run pipeline :/

1 month ago 1 0 0 0

We've been looking into ways to build a more encompassing classification system. As @titus.idyll.org mentioned, there's so much HGT that the assumptions that we use for bacterial taxonomy don't make sense anymore. But there might be a way to find a meaningful pattern within the HGT web :)

1 month ago 2 0 1 0

I'd say that the best thing you can do at the moment is characterize environmental plasmids according to their conjugation system and replication initiator family.

1 month ago 1 0 2 0

That's a good question. Right now, I believe COPLA is the best "taxonomy-like" thing that we have. But the software is tough to run and the database is biased towards plasmids from cultivated bacteria (which is unavoidable)

1 month ago 1 0 1 0
Meta-virus resource (MetaVR): expanding the frontiers of viral diversity with 24 million uncultivated virus genomes Abstract. Viruses are ubiquitous in all environments and impact host metabolism, evolution, and ecology, although our knowledge of their biodiversity is st

🦠🧪🧬🚨 New paper and database alert: the new IMG/VR release is now MetaVR! We have a new website - meta-virome.org - with quick search capabilities for the >24M viruses, >12M vOTUs, and >42M protein clusters (including >790k with predicted structures!). academic.oup.com/nar/advance-...

4 months ago 64 43 1 1

This looks really good! Congratulations!

4 months ago 2 0 0 0
Metalog Metalog is a repository of manually annotated metadata (or contextual data) for metagenomic sequencing data from across the globe.

We're very happy to release our new database Metalog metalog.embl.de ! It offers manually curated and harmonised contextual data for 110k metagenomics samples across the globe, incl. precomputed taxonomic profiles, for interactive browsing and for download 🧵 1/7

#microsky

8 months ago 73 45 3 2
A stylized infographic showing the workflow for building a global soil plasmidome resource on the left and a textured world map on the right. The workflow depicts three input data streams from metagenomic datasets and isolate plasmids, which pass through steps like quality control, clustering, functional annotation, CRISPR analysis, host assignment, and detection of gene categories such as biosynthetic clusters, antimicrobial resistance, antimicrobial peptides, and CAZymes. All outputs feed into a central SQL database. The world map shows sample locations across the globe as teal circles of varying size, highlighting regions from which many plasmids were recovered. Adapted from Fig 1A and 1B in Fiamenghi et al., doi:10.1038/s41467-025-65102-6

Soils contain an amazing diversity of functions encoded in plasmids.

The Global Soil Plasmidome Resource: 98,728 soil plasmids from 6,860 samples.

Led by @mattlabguy.bsky.social and @apcamargo.bsky.social at @jgi.doe.gov @biosci.lbl.gov @berkeleylab.lbl.gov

www.nature.com/articles/s41...

5 months ago 12 7 0 0

That's nice to hear :) Feel free to provide any feedback

5 months ago 0 0 0 0

Thanks, @acritschristoph.bsky.social!

5 months ago 1 0 0 0

@yishay.bsky.social @jimshaw.bsky.social @simrouxvirus.bsky.social @jgi.doe.gov @berkeleylab.lbl.gov

5 months ago 3 0 0 0

This project came together thanks to many amazing people (see tweet below for handles). A special thanks to Stephen Nayfach, who kicked off UHGV and helped guide it all the way through.

5 months ago 1 0 1 0
UHGV A comprehensive resource of viruses from the human gut microbiome that includes thoroughly annotated genomes, protein structures, and a novel hierarchical classification system that systematically org...

To facilitate adoption by the community, we provide online tools to allow users to explore UHGV in the browser. If you don't mind using the command line, we also provide all of the data for download :)
🌐 uhgv.jgi.doe.gov (8/8)

5 months ago 4 2 1 0

Taking advantage of the genomic diversity in UHGV, we used comparative genomics to examine in detail diversity-generating retroelements, methyltransferases, and endolysins, proposing mechanisms by which these functions enhance a phage’s capacity to infect new hosts. (7/8)

5 months ago 1 0 1 0

We then examined the genetic factors underlying broader host range and found that functions involved in phage-host interactions across multiple stages of the infection cycle shape a phage's ability to switch hosts. (6/8)

5 months ago 1 0 1 0

Using UHGV, we profiled thousands of human gut metagenomes and identified a subset of hyperprevalent phages found around the globe. Leveraging host prediction data, we found that these phages have markedly higher host ranges. (5/8)

5 months ago 1 0 1 0

Another key challenge in virome research is that most viruses lack taxonomic classification, leading to ad hoc approaches that hinder cross-study comparisons. To address this, we developed a taxonomy-like framework and a tool for assigning users' genomes to UHGV clusters. (4/8)

5 months ago 5 2 1 0

Viral proteins are difficult to annotate, making it hard to infer their biology from genomes. We developed a pipeline that integrates sequence- and structure-based methods to improve functional annotation and reveal novel protein domains. (3/8)

5 months ago 3 0 1 0

The Unified Human Gastrointestinal Virome (UHGV) includes 873,994 viral genomes recovered from the microbiomes of globally diverse populations. Its scale and genome quality make UHGV a valuable reference for future studies of human gut viromes worldwide. (2/8)

5 months ago 2 0 1 0

🚨New preprint out!
We present a foundational genomic resource of human gut microbiome viruses. It delivers high-quality, deeply curated data spanning taxonomy, predicted hosts, structures, and functions, providing a reference for gut virome research. (1/8)
www.biorxiv.org/content/10.1...

5 months ago 90 47 4 2

The old internet was a better place

5 months ago 1 0 1 0

We're thrilled to announce SeqHub, an AI-enabled platform for biological sequence analysis. SeqHub brings together sequence search, genome annotation, and data sharing in one place.

5 months ago 49 20 3 2
GTDB release 10: a complete and systematic taxonomy for 715 230 bacterial and 17 245 archaeal genomes Abstract. The Genome Taxonomy Database (GTDB; https://gtdb.ecogenomic.org) provides a phylogenetically consistent and rank normalized genome-based taxonomy

Our @narjournal.bsky.social manuscript is out! It explores the growth of the GTDB (gtdb.ecogenomic.org) since its inception, as well as updates to the website, methodology, policies, and major taxonomic and nomenclatural changes over the past three years.

academic.oup.com/nar/advance-...

5 months ago 69 46 0 2
2025 BLAST NEWS — BlastNews 0.1.1 documentation

A BLAST update adding support for compressed files and csv output with headers is a Good Friday night surprise!

blast.ncbi.nlm.nih.gov/doc/blast-ne...

7 months ago 35 14 0 3