the astronauts are on the dark side quick everyone hide
Posts by Austin Richardson
💾 Prokka 1.15.6 is released!
This is the last major release of Prokka. But don't be sad, because @oschwengers.bsky.social already has an excellent replacement called Bakta you can migrate to.
#bioinformatics #microbiology #genomics
github.com/tseemann/pro...
🚨New preprint out!
We present a foundational genomic resource of human gut microbiome viruses. It delivers high-quality, deeply curated data spanning taxonomy, predicted hosts, structures, and functions, providing a reference for gut virome research. (1/8)
www.biorxiv.org/content/10.1...
When you buy a cutting board from bioinformaticians
gut fauna
Tech snow day
Apple's approach to protein structure is great for accessibility - & potentially biological realism - reasons.
Eg, prediction could be achieved w/ smaller compute & the generative nature of prediction allows for multiple conformations
A summary here: genomely.substack.com/p/simplefold...
If you're wondering why we're hosting the pre-print via dropbox, it's because arXiv (and bioRxiv) did not accept it (because it is a review). It's a bit disconcerting, because a review is precisely the type of paper that would benefit a lot from pre-publication dissemination and feedback.
Closed my eyes for a sec and summoned another earthquake
they should invent a type of volatile memory that gets heavier the more data it contains
Blogged about how zstd --long fills the gap between fast and slow-but-high-ratio genome compression methods log.bede.im/2025/09/12/z...
you can just pour milk over trail mix and eat it like cereal
"You are standing in an open field west of a white house, with a boarded front door."
With 3 threads, the middle thread processes the reads starting in the middle third of the fasta file.
Little writeup on the speed of fasta parsers, at last.
Basically: both needletail and paraseq process input linearly, and thus have a limit around 4 GB/s.
By giving each thread its own slice of the input file, we're limited by RAM bandwidth instead :)
curiouscoding.nl/posts/fasta-...
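A minimal Python sketch of the slicing idea described above, not the implementation from the linked post: the file is cut into equal byte ranges, and each worker handles exactly the records whose header line starts inside its range, reading past the end of the range to finish its last record. The names `parse_slice` and `parse_fasta_parallel` are hypothetical, and Python threads will not reach the throughput discussed in the post; this only illustrates how slice boundaries can be resolved.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parse_slice(path, start, end):
    """Return (name, length) for FASTA records whose header starts in [start, end)."""
    records = []
    with open(path, "rb") as fh:
        if start > 0:
            # Step back one byte and finish that line, so we land on the
            # first line that begins at or after `start`.
            fh.seek(start - 1)
            fh.readline()
        name, length = None, 0
        while True:
            pos = fh.tell()
            line = fh.readline()
            if not line:
                break  # end of file
            if line.startswith(b">"):
                if name is not None:
                    records.append((name, length))
                    name = None
                if pos >= end:
                    break  # this header belongs to the next slice
                name, length = line[1:].strip().decode(), 0
            elif name is not None:
                length += len(line.strip())  # sequence line of our record
            elif pos >= end:
                break  # no record of ours is open and we are past our slice
        if name is not None:
            records.append((name, length))
    return records


def parse_fasta_parallel(path, threads=3):
    """Split the file into `threads` byte ranges and parse each independently."""
    size = os.path.getsize(path)
    bounds = [size * i // threads for i in range(threads + 1)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        parts = pool.map(parse_slice, [path] * threads, bounds[:-1], bounds[1:])
        return [record for part in parts for record in part]
```

Calling `parse_fasta_parallel("reads.fa", threads=3)` (with a hypothetical `reads.fa`) returns the records in file order, because each line is assigned to the slice that contains its first byte and the per-slice results are concatenated in slice order.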
Red banner from the top of PubMed, saying "Service Alert: Planned Maintenance beginning July 25th. Most services will be unavailable for 24+ hours starting 9pm EDT. Learn more about the maintenance."
I do not enjoy that we now live in a world where seeing this banner at the top of PubMed makes me nervous.
TIL the EBV genome is *included in the hg38 assembly* so that EBV reads are not erroneously mapped elsewhere to the human genome. That's certainly .... an interesting solution ... 🤯
But it enabled this extremely cool work:
This is a bad take
stevensalzberg.substack.com/p/i-know-gen...
Saying that DNA data is like your browsing data and can therefore be leaked is a false equivalence. Thing A is on fire so it's fine for thing B to be on fire, too-style argumentation.
Q: what do viruses and potatoes have in common?
A: both are "acellular root"
Handy to keep up with the ICTV's changes to virus taxonomy and species names:
taxonomy.onecodex.com/taxon/694009...
vs:
taxonomy.onecodex.com/taxon/694009...
or
taxonomy.onecodex.com/taxon/11676/...
vs
taxonomy.onecodex.com/taxon/11676/...
Are you attending #ASMicrobe this week? Stop by my talk on Friday morning (10AM) and say hello! 👋 If you can’t make it and want to meet up - just drop me a DM!
I love this meeting and connecting with so many friends and colleagues over the years has made it really a special meeting.
🌳 Taxonomy Time Machine now supports batch lookups! Quickly resolve lists of names/TaxIDs to their current NCBI taxonomy → taxonomy.onecodex.com/bulk-resolver
🚀 Pushed some updates to taxonomy.onecodex.com
- Example queries to help you get started
- Summary section for easier interpretation
- Perf. improvements
🧵 The ATCC Genome Portal hit 5,500 authenticated microbial genomes (>2,600 species)! 🎉🥳 We've sequenced, assembled, annotated 4,538 bacteria, 479 viruses, 479 fungi, and 4 protists! All NGS in-house @ ATCC under ISO, and >90% on BOTH @nanoporetech.com and #Illumina 😎 www.atcc.org/applications...
Something happened to my $PATH and now nothing works
Trisolarans: “the Sophons have succeeded in disrupting science”
Bad day for VCF files
It's clearly a DNS issue, but overall, the NCBI is the least reliable I've ever experienced in my career. And I've been in this long enough to remember the Entrez API giving you only part of the file every 50-100th time.
using github copilot to fail at github workflows aka boiling the ocean
I call it the London Smaug (Tension Tamer + espresso latte)
Join us at SNU and gain access to leading-edge hardware: m.youtube.com/watch?v=ztth...