Posts by Martin Steinegger 🇺🇦

We introduce ConforNets, a mechanism for conformational control in AlphaFold3 models

- SoTA at producing diverse conformations on every multistate benchmark (N=104)
- Novel capability: transfer state from one protein to another

Outperforms BioEmu, ConforMix and AFsample3

🧵1/8

2 days ago 39 9 1 2

The CASP experiment is about to start - and is still short on challenging prediction targets. Ligand complexes, RNA, …

2 days ago 17 16 1 1

Congratulations! However, I’m visiting the NIH next month, so it’s a month too early. 😥

3 days ago 2 0 1 0
Fast and accurate multiple-protein-sequence alignment at scale with FAMSA2 - Nature Biotechnology FAMSA2 accurately aligns millions of protein sequences at high speed.

10 years after the first FAMSA paper, its successor is now published in Nat Biotech! We believe that FAMSA2 can enable analyses of large protein collections that were previously unattainable. Thank you, Andrzej and Cedric, for the great collaboration.
www.nature.com/articles/s41...

1 week ago 56 22 3 2

So happy to see it published

1 week ago 1 0 0 0

(1/2) Interested in plant evolution? We are opening two PhD positions in my lab at QGG - Aarhus University, where you will combine comparative genomics and machine learning to better understand and improve crop traits.
#PlantSciJobs #evolution #genomics

1 week ago 7 10 1 0

Metabuli & Metabuli App v1.2 improve novel species classification with higher precision and recall. New light mode is 1.8× faster and requires 50% less storage while keeping precision. New RefSeq, GTDB, HRGM, and HROM databases added.
💾 github.com/steineggerla...
📄 doi.org/10.64898/2026.03.13.711249

2 weeks ago 30 18 1 0

I will probably be hiring either a PhD or a Postdoc in the near future. If you are interested in deep biochemical history or molecular mechanisms of evolution please get in touch.

2 weeks ago 23 23 0 1
AFESM Clusters Foldseek clustered 820M AlphaFold DB + ESMatlas structures

45 novel protein folds in the updated AFESM (AFDB + ESMatlas) manuscript:
• 12 high-confidence folds in AFESM
• 33 by ColabFold-repredicting 2.3M low-quality domains
We show AFDB captures most domains already and ESMFold struggles with novelty
🌏 afesm.foldseek.com
📄 biorxiv.org/content/10.1...

2 weeks ago 20 9 1 0
AI badly needs a dose of skepticism Some scientists are too eager to believe their own claims

AI is powerful, but hype gives it pseudoscience vibes.

When claims are unfalsifiable, detached from reality, or driven by tools looking for problems, we risk undermining real progress.

Overselling AI does not accelerate science: it erodes trust.

stevensalzberg.substack.com/p/ai-is-star...

3 weeks ago 38 10 2 1

Fun fact: we discussed the name mirage for evedesign.

3 weeks ago 1 0 0 0

A run-length-compressed skiplist data structure for dynamic GBWTs supports time and space efficient pangenome operations over syncmers
doi.org/10.64898/202...

3 weeks ago 17 5 0 0

My group at MIT is seeking a research scientist with a strong *experimental* background to lead and help shape the lab’s experimental infrastructure, supporting efforts to advance AI-driven enzyme discovery and characterization.

See the full JD here: acrobat.adobe.com/id/urn:aaid:...

4 weeks ago 16 16 1 0

_720 Gbp_ marine nanopore metagenome -> 328 circular prokaryotic contigs: using myloasm!

Insane work by Lui and Nielsen. Also shows how modern long-read assemblies can disentangle coexisting strains and reveal ecological insights.

4 weeks ago 47 13 2 0

Uh oh, I really do not know. 😅

1 month ago 2 0 1 0

I think it’s a great idea. We should implement them.

1 month ago 1 0 0 0

Check out this awesome work from @daniil-litvinov.bsky.social: Protein complex stoichiometry prediction (both homomers and heteromers) from sequence, with some nice ablations showing what makes the difference!

1 month ago 13 3 0 0

I assume they did not end up with high confidence. Do you have some example IDs?

1 month ago 0 0 1 0

Soon we will have a Foldseek multimer search, @milot.bsky.social is working on this with @sooyoung-cha.bsky.social!

1 month ago 3 0 1 0

Thank you for sharing. It’s pretty interesting that pDockQ2 is so low! We will look at this case.

1 month ago 0 0 1 0

Kieran et al. trained a generative protein binder design model. The training is based on Teddymer, a dataset developed by @sooyoung-cha.bsky.social. By treating monomer domains as multimers and clustering them with Foldseek, she created a set that allowed Complexa to learn.
💾 teddymer.foldseek.com

1 month ago 29 4 0 0
AlphaFold hits ‘next level’: the AI database now includes protein pairing The database of 200 million protein-structure predictions now includes homodimers, adding new biological relevance.

@ecallaway.bsky.social wrote a news article on our AlphaFold complex work. Thank you for covering it.

📄 www.nature.com/articles/d41...

1 month ago 19 6 0 0

Yes, for now we have only computed dimers.

1 month ago 1 0 0 0

Sensitive and scalable metagenomic classification using spaced metamers, reduced alphabets, and syncmers www.biorxiv.org/content/10.64898/2026.03...

1 month ago 13 8 0 1

This was a fantastic collaboration with many contributors, including @yewonhan.bsky.social, @mitsenkov.bsky.social, Nick Venanzi, Sameer Velankar, Jennifer Fleming, @milot.bsky.social, @machine.learning.bio, et al.

1 month ago 6 0 0 0

A small number of large clusters capture most of the structural space: the top 1% of representatives cover ~25% of entries, and the top 20% cover ~82%. At the same time, clusters without a PDB multimer match are enriched among smaller clusters, suggesting much of the unexplored interaction. 4/

1 month ago 4 0 1 0

In the ~8M heterodimer set, prediction success is highest for pairs that are more alike: higher inter-chain sequence identity and smaller length differences both increase the rate of high-confidence models. 3/

1 month ago 4 0 1 0

Scale alone doesn’t determine success. Although most predictions come from eukaryotic proteomes, the highest fraction of high-confidence complexes is observed in bacteria and archaea. 2/

1 month ago 10 3 2 0

A key takeaway: this resource dramatically expands the known structural interactome. In nearly every proteome, the number of high-confidence complex predictions surpasses experimental multimer structures in the PDB by one to three orders of magnitude. 1/

1 month ago 5 0 1 0

You asked, we listened. Millions of AI-predicted protein complex structures are now available in the #AlphaFold Database.

This spans homodimers from 20 of the most studied species, including humans, as well as the World Health Organization’s priority pathogens list.

www.ebi.ac.uk/about/news/t...

1 month ago 157 86 7 4