GTDB release 11 based on RefSeq 232 (R11-RS232) is live at gtdb.ecogenomic.org. This release covers 901,341 genomes (23% increase) and has 199,923 species clusters (39% increase). Release notes at: forum.gtdb.ecogenomic.org/t/announcing.... Release statistics at: gtdb.ecogenomic.org/stats/r232.
Posts by Wei Shen 沈 伟
Modern DRAM is based on a brilliant design from IBM.
But, we're still paying for a latency penalty that's existed since the 60s!
In this video, I'm introducing my research project (Tailslayer) that immensely reduces p99.99 latency on traditional RAM!
LexicMap v0.9.0 has been released with
- a few bug fixes: CIGAR, bitscore and evalue calculation.
- new features: better support for big genomes like human.
- a new command to convert output to SAM format
github.com/shenwei356/L...
Thank you! Its original goal was to help beginners do some simple but common tasks. Before that, I shared Perl and Python scripts to help students, but they struggled to install the Perl modules. At that time, Go 1.0 was released, with a great feature of easily providing executable binaries.
Thank you, James! Glad you like it.
Can't wait to release a 10th-birthday version of SeqKit!
- 10 years
- 2 papers, 3500 citations
- 20 contributors
- 40 subcommands
- 880 commits
- 500 issues
- 685.5K Bioconda total downloads
Thank you all, dear contributors and users!
I'll keep maintaining it.
github.com/shenwei356/s...
I should have said, this takes us to about 2.8 million genomes in total. We don't have annotations etc. for the latest data yet; this will be an ongoing process.
A long time ago in a galaxy far away, there was a SARS-CoV-2 pandemic. Our paper, led by @martibartfast.bsky.social
a) correcting errors in 4.5 million genomes & their phylogeny
b) improving representation of the Global South in public data
www.nature.com/articles/s41...
(thread 1/n)
> We will change several names that have long been in use (e.g., Firmicutes, Proteobacteria) to newly formalized names (e.g., Bacillota, Pseudomonadota) that may be unfamiliar to some.
ncbiinsights.ncbi.nlm.nih.gov/2022/11/14/p...
Phold's manuscript is now available @narjournal.bsky.social thanks to @susiegriggo.bsky.social @npbhavya.bsky.social @vijinim.bsky.social @linsalrob.bsky.social @martinsteinegger.bsky.social @milot.bsky.social @eunbelivable.bsky.social & others not on bsky #phagesky academic.oup.com/nar/article/...
For those of us interested in software development, data structure design etc in science, this is a must-read. A taste of what is happening in communities letting AI agents go wild writing code, creating PRs, writing documentation : spoiler - humans get addicted, lose perspective, slop everywhere.
Phage therapy of perinephric abscess in kidney transplantation recipients caused by drug‐resistant Pseudomonas aeruginosa - Liu - 2025 - mLife - Wiley Online Library onlinelibrary.wiley.com/doi/10.1002/...
The scikit-bio paper is online in Nature Methods! Many thanks to our collaborators, community contributors and reviewers! We couldn’t have done it without you. www.nature.com/articles/s41... #Bioinformatics #OpenSource
So did I, but I'm serious this time.
Yes, finally! I bought the first Rust Book in 2019 ...
Performance on reading and writing plain FASTA/Q files
My first Rust toy tool (just for learning purposes)!!!
The performance (both time and memory) is good!
Rust is hard to learn, and I still need more practice!
github.com/shenwei356/f...
Most exciting study I have seen for ages, and Fernando the most excited speaker. Much anticipated. Highly recommended, a lot of food for thought (and quite a dense paper, lots to think about)
MVIF 44
It's Monday!
...and a new #MVIF program is out! 🤩
Free registration: cassyni.com/s/mvif-44
⭐️ Highlights:
🇺🇸 Vanessa Hale
🇰🇷 Jun Hyung Cha
⭐️ Keynote:
🇺🇸 Katherine Lemon @kathlemon.bsky.social
⭐️ Talks:
🇺🇸 Meenakshi Chakraborty
🇨🇳 Wei Shen @shenwei356.bsky.social
🇺🇸 Johanna Gutleben
📣 New preprint from us at phagefoundry.org 📣
A solid machine learning framework to predict strain-level phage-host interactions across diverse bacterial genera from genome sequences alone. Avery Noonan from the Arkin Lab led this massive effort.
www.biorxiv.org/content/10.1...
Honoured and quite blown-over to receive this award. I have been, and continue to be, very lucky - first with great mentors, and then really prodigious students, postdocs and collaborators. Working with them has been a joy.
Our method for genome size estimation from long-read overlaps is now published 🥳
academic.oup.com/bioinformati...
Thread on #GI2025 's second day! 👇🏻
Ben Langmead @benlangmead.bsky.social delivers the official opening for this year's Genome Informatics Conference #GI2025 at Cold Spring Harbor Laboratory.
List of talks and posters: meetings.cshl.edu/abstracts.as...
Cool new paper from Lorién López-Villellas, @santiagomarco.bsky.social and others!
Super cute and simple idea:
In Gotoh's affine-cost alignment, only the M matrix is needed during tracing: we can just search for a gap-length x such that M[i][j] = M[i-x][j]+o+x*e or M[i][j] = M[i][j-x]+o+x*e.
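A minimal sketch of that idea in Python, not the paper's implementation. It assumes cost minimization, a gap of length x costing o + x*e, and M[i][j] holding the best cost over all alignment states at cell (i, j). The function names and the default costs (mismatch=1, o=2, e=1) are made up for illustration. The gap matrices are used while filling M, but the traceback reads only M.

```python
INF = float("inf")

def fill_m(a, b, mismatch=1, o=2, e=1):
    """Gotoh fill. Returns only M; the gap matrices I and D are discarded."""
    n, m = len(a), len(b)
    M = [[INF] * (m + 1) for _ in range(n + 1)]
    I = [[INF] * (m + 1) for _ in range(n + 1)]  # gap run consuming a
    D = [[INF] * (m + 1) for _ in range(n + 1)]  # gap run consuming b
    M[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0:
                I[i][j] = min(M[i - 1][j] + o + e, I[i - 1][j] + e)
            if j > 0:
                D[i][j] = min(M[i][j - 1] + o + e, D[i][j - 1] + e)
            best = min(I[i][j], D[i][j])
            if i > 0 and j > 0:
                sub = 0 if a[i - 1] == b[j - 1] else mismatch
                best = min(best, M[i - 1][j - 1] + sub)
            M[i][j] = best
    return M

def traceback_from_m(M, a, b, mismatch=1, o=2, e=1):
    """Trace back using M alone by searching for a consistent gap length x."""
    i, j = len(a), len(b)
    ops = []  # 'M' = diagonal step, 'D' = consume a only, 'I' = consume b only
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if a[i - 1] == b[j - 1] else mismatch
            if M[i][j] == M[i - 1][j - 1] + sub:
                ops.append("M")
                i, j = i - 1, j - 1
                continue
        # M[i][j] = M[i-x][j] + o + x*e ?
        x = next((x for x in range(1, i + 1)
                  if M[i][j] == M[i - x][j] + o + x * e), None)
        if x is not None:
            ops.extend("D" * x)
            i -= x
            continue
        # M[i][j] = M[i][j-x] + o + x*e ?
        x = next((x for x in range(1, j + 1)
                  if M[i][j] == M[i][j - x] + o + x * e), None)
        if x is None:
            raise ValueError("M is not consistent with the given costs")
        ops.extend("I" * x)
        j -= x
    return "".join(reversed(ops))

def alignment_cost(ops, a, b, mismatch=1, o=2, e=1):
    """Re-score an op string independently, as a sanity check."""
    i = j = cost = 0
    prev = None
    for op in ops:
        if op == "M":
            cost += 0 if a[i] == b[j] else mismatch
            i, j = i + 1, j + 1
        else:
            cost += e + (o if op != prev else 0)  # open once per gap run
            if op == "D":
                i += 1
            else:
                j += 1
        prev = op
    return cost
```

Usage: `M = fill_m(a, b)` followed by `traceback_from_m(M, a, b)` gives an op string, and `alignment_cost` should re-score it to `M[len(a)][len(b)]`. The search over x makes each gap step cost time proportional to the remaining row or column length in the worst case; in exchange, nothing but M has to be kept for tracing.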
I also have serious concerns about the consolidation of roles (one person is now publisher, chief editor, and also a frequent author) as exemplified in a recent paper that was fast-tracked for publication.
Really exciting that the preprint on Barbell, a new demultiplexer, is finally out!
It's the first tool that builds on Sassy, the approximate-DNA-searching tool that @rickbitloo.bsky.social and myself developed earlier this year, specifically with this application in mind.
Around 10% of your Nanopore reads (SQK-RBK114) are incorrectly trimmed. Here is why, and how our new tool Barbell solves it:
www.biorxiv.org/content/10.1...
Want to get started? github.com/rickbeeloo/b...
1/6 Movi 2 is here: faster and more space-efficient for pangenome queries. Its fastest mode uses half the memory of Movi 1 while running ~30% faster. github.com/mohsenzakeri...
Podcast with me and @turiking.bsky.social for the @milnerevolution.bsky.social series, on plasmid evolution over the last 100 years, talking about our ( @cazares-adr.bsky.social , Nick Thomson, @sarah1alexander.bsky.social & co) recent paper www.science.org/doi/10.1126/...
youtu.be/Mzr3TD4ijs0?...
Preprint out for myloasm, our new nanopore / HiFi metagenome assembler!
Nanopore's getting accurate, but
1. Can this lead to better metagenome assemblies?
2. How, algorithmically, to leverage them?
with co-author Max Marin @mgmarin.bsky.social, supervised by Heng Li @lh3lh3.bsky.social
1 / N