We introduce ConforNets, a mechanism for conformational control in AlphaFold3 models
- SoTA at producing diverse conformations on every multistate benchmark (N=104)
- Novel capability: transfer state from one protein to another
Outperforms BioEmu, ConforMix and AFsample3
🧵1/8
Posts by Martin Steinegger 🇺🇦
The CASP experiment is about to start - and is still short on challenging prediction targets. Ligand complexes, RNA, …
Congratulations! However, I’m visiting the NIH next month, so it’s a month too early. 😥
10 years after the first FAMSA paper, its successor is now published in Nat Biotech! We believe that FAMSA2 can enable analyses of large protein collections that were previously unattainable. Thank you, Andrzej and Cedric, for great collaboration
www.nature.com/articles/s41...
So happy to see it published
(1/2) Interested in plant evolution? We are opening two PhD positions in my lab at QGG - Aarhus University, where you will combine comparative genomics and machine learning to better understand and improve crop traits.
#PlantSciJobs #evolution #genomics
Metabuli & Metabuli App v1.2 improve novel species classification with higher precision and recall. New light mode is 1.8× faster and requires 50% less storage while keeping precision. New RefSeq, GTDB, HRGM, and HROM databases added.
💾 github.com/steineggerla...
📄 doi.org/10.64898/2026.03.13.711249
I will probably be hiring either a PhD or a Postdoc in the near future. If you are interested in deep biochemical history or molecular mechanisms of evolution please get in touch.
45 novel protein folds in the updated AFESM (AFDB + ESMatlas) manuscript:
• 12 high-confidence folds in AFESM
• 33 by ColabFold-repredicting 2.3M low-quality domains
We show AFDB captures most domains already and ESMfold struggles with novelty
🌏 afesm.foldseek.com
📄 biorxiv.org/content/10.1...
AI is powerful, but hype gives it pseudoscience vibes.
When claims are unfalsifiable, detached from reality, or driven by tools looking for problems, we risk undermining real progress.
Overselling AI does not accelerate science: it erodes trust.
stevensalzberg.substack.com/p/ai-is-star...
Fun fact: we discussed the name mirage for evedesign.
A run-length-compressed skiplist data structure for dynamic GBWTs supports time and space efficient pangenome operations over syncmers
doi.org/10.64898/202...
My group at MIT is seeking a research scientist with a strong *experimental* background to lead and help shape the lab’s experimental infrastructure, supporting efforts to advance AI-driven enzyme discovery and characterization.
See the full JD here: acrobat.adobe.com/id/urn:aaid:...
_720 Gbp_ marine nanopore metagenome -> 328 circular prokaryotic contigs: using myloasm!
Insane work by Lui and Nielsen. Also shows how modern long read assemblies can disentangle coexisting strains and reveal ecological insights.
Uh oh, I really do not know. 😅
I think it’s a great idea. We should implement them.
Check out this awesome work from @daniil-litvinov.bsky.social: Protein complex stoichiometry prediction (both homomers and heteromers) from sequence, with some nice ablations showing what makes the difference!
I assume they did not end up with high confidence. Do you have some example ids?
Soon we will have a Foldseek multimer search, @milot.bsky.social is working on this with @sooyoung-cha.bsky.social!
Thank you for sharing. It’s pretty interesting that pdockq2 is so low! We will look at this case.
Kieran et al. trained a generative protein binder design model. The training is based on Teddymer, a dataset developed by @sooyoung-cha.bsky.social. By treating monomer domains as multimers and clustering them with Foldseek, she created a set that allowed Complexa to learn.
💾 teddymer.foldseek.com
@ecallaway.bsky.social wrote a news article on our AlphaFold complex work. Thank you for covering it.
📄 www.nature.com/articles/d41...
Yes, for now we only computed dimers.
Sensitive and scalable metagenomic classification using spaced metamers, reduced alphabets, and syncmers www.biorxiv.org/content/10.64898/2026.03...
This was a fantastic collaboration with many contributors, including @yewonhan.bsky.social, @mitsenkov.bsky.social, Nick Venanzi, Sameer Velankar, Jennifer Fleming, @milot.bsky.social and @machine.learning.bio et al.
A small number of large clusters capture most of the structural space: the top 1% of representatives cover ~25% of entries, and the top 20% cover ~82%. At the same time, clusters without a PDB multimer match are enriched among smaller clusters, suggesting much of the unexplored interaction. 4/
In the ~8M heterodimer set, prediction success is highest for pairs that are more alike: higher inter-chain sequence identity and smaller length differences both increase the rate of high-confidence models. 3/
Scale alone doesn’t determine success. Although most predictions come from eukaryotic proteomes, the highest fraction of high-confidence complexes is observed in bacteria and archaea. 2/
A key takeaway: this resource dramatically expands the known structural interactome. In nearly every proteome, the number of high-confidence complex predictions surpasses experimental multimer structures in the PDB by one to three orders of magnitude. 1/
You asked, we listened. Millions of AI-predicted protein complex structures are now available in the #AlphaFold Database.
This covers homodimers from 20 of the most studied species, including humans, as well as from pathogens on the World Health Organization’s priority pathogens list.
www.ebi.ac.uk/about/news/t...