10 years after the first FAMSA paper, its successor is now published in Nat Biotech! We believe that FAMSA2 can enable analyses of large protein collections that were previously unattainable. Thank you, Andrzej and Cedric, for the great collaboration.
www.nature.com/articles/s41...
Posts by Milot Mirdita
Metabuli & Metabuli App v1.2 improve novel species classification with higher precision and recall. New light mode is 1.8× faster and requires 50% less storage while keeping precision. New RefSeq, GTDB, HRGM, and HROM databases added.
💾 github.com/steineggerla...
📄 doi.org/10.64898/2026.03.13.711249
Whenever I presented Phold, I was frequently asked "can you do the same beyond phages?" We ( @oschwengers.bsky.social @linsalrob.bsky.social @binomicalabs.org et al) finally did it with Baktfold github.com/gbouras13/ba... www.biorxiv.org/content/10.6...
45 novel protein folds in the updated AFESM (AFDB + ESMatlas) manuscript:
• 12 high-confidence folds in AFESM
• 33 by ColabFold-repredicting 2.3M low-quality domains
We show AFDB captures most domains already and ESMfold struggles with novelty
🌏 afesm.foldseek.com
📄 biorxiv.org/content/10.1...
Baktfold: Sensitive protein functional annotation across the microbial tree of life using structural information www.biorxiv.org/content/10.64898/2026.03...
The web application is available at:
https://brigx.genomicx.org/
I would be interested to hear feedback regarding bugs or any unexpected behaviour, any suggestions for the user interface, or any features you feel would be useful.
How much protein diversity can Life on Earth actually generate?
With DIAMOND DeepClust, we show how billions of proteins across the tree of life can be clustered at low-identity for downstream analytics tasks.
📚Paper: www.nature.com/articles/s41...
💻Code: github.com/bbuchfink/di...
My group at MIT is seeking a research scientist with a strong *experimental* background to lead and help shape the lab’s experimental infrastructure, supporting efforts to advance AI-driven enzyme discovery and characterization.
See the full JD here: acrobat.adobe.com/id/urn:aaid:...
Meet evedesign: open-source AI, accessible protein design
✅Combine models for multiobjective optimization
✅Integrate experimental data
✅ Run on your own infrastructure
📄Paper: www.biorxiv.org/content/10.6...
💻Code: github.com/evedesignbio
🌐Webserver: evedesign.bio
Collaborate: hello@evedesign.bio
The MMCA always has interesting exhibitions and the area between it and Anguk has a really nice vibe (even ignoring the major tourist hotspots of the palace and Bukchon village).
A bit more offbeat: there are a few archery cafes, if you want to try a new sport.
Cherry blossom season is starting very soon (in ~1 week in the south, ~2 in Seoul). Yeouido (around the National Assembly) is pretty good for 🌸 but will also get quite busy.
Taking a walk along Cheonggyecheon stream or along the Han river is always nice (e.g., Nodeul island, Banpo bridge).
@ecallaway.bsky.social wrote a news article on our AlphaFold complex work. Thank you for covering it.
📄 www.nature.com/articles/d41...
The AlphaFold database has entered the era of complexes. Together with NVIDIA, DeepMind and EBI, we use ColabFold, OpenFold and MMseqs2-GPU to predict ~31 million complexes (homo- & heterodimers), resulting in 1.8 million high-quality predictions.
📄 research.nvidia.com/labs/dbr/ass...
🌐 alphafold.ebi.ac.uk
You asked, we listened. Millions of AI-predicted protein complex structures are now available in the #AlphaFold Database.
This spans homodimers from 20 of the most studied species, including humans, as well as the World Health Organization’s priority pathogens list.
www.ebi.ac.uk/about/news/t...
Efficient protein structure prediction from compact computers to datacenters with OpenFold-TRT www.biorxiv.org/content/10.64898/2026.03...
ProteinTTT is now easy to run on Hugging Face Spaces and Google Colab. We’ll also be presenting the paper at ICLR 2026 🇧🇷
🤗 Hugging Face Space: huggingface.co/spaces/pimen...
⚙️ Google Colab: colab.research.google.com/drive/1l_h7c...
🧵👇
Two-panel calibration plot (two benchmark dimer datasets) comparing predicted interchain contact-probability bins (x-axis) with the observed fraction of native interfacial contacts (y-axis). Points follow the diagonal, indicating close agreement between predicted probabilities and true interface-contact fractions.
My first manuscript in MPI colours! With @tothpetroczylab.bsky.social, we show that AlphaFold PAE-derived contact probabilities are well calibrated to the fraction of true interface contacts across experimentally determined protein dimers.
www.biorxiv.org/content/10.6...
Can't wait to release a 10th-birthday version of SeqKit!
- 10 years
- 2 papers, 3500 citations
- 20 contributors
- 40 subcommands
- 880 commits
- 500 issues
- 685.5K Bioconda total downloads
Thank you all, dear contributors and users!
I'll keep maintaining it.
github.com/shenwei356/s...
At the 132nd Internat. Titisee Conference on Biology 2.0: The AI Revolution in Biology & Medicine
From sequence→function models 🧬
to protein & generative structure models 🧪
to AI of cell states & perturbations 🧫
Great science, great friends, beautiful lake. Thanks @BIFonds!
New version of our preprint on bioRxiv about bioRxiv up. Now that’s what I call a revision – 6 years after the first version!
It has new data about our progress and highlights from a massive user survey. 1/n
www.biorxiv.org/content/10.1...
Can we simulate realistic evolutionary trajectories and “replay the tape of life”? In this work, we propose a flexible, generalizable deep learning framework for modeling how the entire protein sequence evolves over time while capturing complex interactions across sites. 1/n
doi.org/10.64898/202...
Our new review on genome annotation just appeared in @naturerevgenet.bsky.social, with a particular focus on the human genome, with Hayden Ji and Mihaela Pertea: rdcu.be/e4mI1
Introducing The Structural History of Eukarya (SHE): The first proteome-scale phylogeny constructed entirely from 3D structure.
We computed 300 trillion alignments across 1,542 species to map the tree of life. 🧵👇 (1/5)
Please spread the word:
We invite applications to a two-week Computational Biology workshop in Singapore, June 14-27.
This NSF-funded workshop brings together 16-20 US grad students with international peers.
Apply by March 21: compbioasia.net
🧵 Details below:
Distance-Restraint-Guided Diffusion Models for Sampling Protein Conformational Changes and Ligand Dissociation Pathways
Tatsuki Hori, Yoshitaka Moriwaki, Ryuichiro Ishitani
www.biorxiv.org/content/10.6...
Our new preprint is out.
FoldMason is out now in @science.org. It generates accurate multiple structure alignments for thousands of protein structures in seconds. Great work by Cameron L. M. Gilchrist and @milot.bsky.social.
📄 www.science.org/doi/10.1126/...
🌐 search.foldseek.com/foldmason
💾 github.com/steineggerla...
Can ever-increasing sequence databases improve phylogenetic reconstruction of a gene family? Our new preprint introduces AmpliPhy, a pipeline that automates homolog enrichment to improve gene tree inference, built on a robust phylogenomic benchmark scheme. 🧵1/n
📃 doi.org/10.64898/2026.01.26.701724
Milot’s venture into establishing his own lab is incredibly exciting. I highly recommend joining Milot on his mission to advance molecular biology through open-source bioinformatics.
My time in @martinsteinegger.bsky.social's group is ending, but I’m staying in Korea to build a lab at Sungkyunkwan University School of Medicine. If you or someone you know is interested in molecular machine learning and open-source bioinformatics, please reach out. I am hiring!
mirdita.org