Added a JAX translation of the excellent Proteina-Complexa (from NVIDIA, @kdidi.bsky.social, @karstenkreis.bsky.social) to mosaic. You can do beam search with any mosaic loss (e.g. protenix + mpnn), and JAX will generate efficient GPU/TPU code.
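Here's a minimal sketch of what beam search over a composite loss can look like in JAX; `propose` and `composite_loss` below are placeholders, not mosaic's actual API.

```python
# Minimal beam-search sketch in JAX. `propose` and `composite_loss` are
# hypothetical stand-ins; mosaic's real interface may differ.
from functools import partial
import jax
import jax.numpy as jnp

def propose(key, beam, branch=4):
    # Hypothetical proposal move: perturb each beam member into `branch` children.
    noise = jax.random.normal(key, (beam.shape[0], branch) + beam.shape[1:])
    return (beam[:, None] + 0.1 * noise).reshape(-1, beam.shape[-1])

def composite_loss(x):
    # Stand-in for a weighted sum of mosaic losses (e.g. protenix + mpnn terms).
    return jnp.sum(x ** 2, axis=-1)

@partial(jax.jit, static_argnames="beam_width")
def beam_step(key, beam, beam_width=8):
    cand = propose(key, beam)                  # expand every beam member
    scores = jax.vmap(composite_loss)(cand)    # score all candidates in parallel
    keep = jnp.argsort(scores)[:beam_width]    # keep the lowest-loss candidates
    return cand[keep]

key = jax.random.PRNGKey(0)
beam = jax.random.normal(key, (8, 16))         # toy "design" states
for _ in range(10):
    key, sub = jax.random.split(key)
    beam = beam_step(sub, beam)
```

Because the step is `jit`-compiled and candidates are scored with `vmap`, XLA can fuse the whole expand-score-select loop into efficient GPU/TPU code.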
My lab was lucky to be able to help test Proteina-Complexa binders experimentally. We were blown away by the results. Congrats to our colleagues at NVIDIA!
Efficient protein structure prediction from compact computers to datacenters with OpenFold-TRT www.biorxiv.org/content/10.64898/2026.03...
Also check out the great post from @karstenkreis.bsky.social about our work! bsky.app/profile/kars...
That's it from me! Happy to hear what people think once they try it out (code, weights, etc. all available under a permissive license!) 19/19
The whole Proteina effort started more than 2 years ago when Arash Vahdat believed in me and my two partners in crime, Tomas Geffner and @karstenkreis.bsky.social, to start this crazy protein thing at NVIDIA. Unreal to see how far we have come since then as a field. 18/n
Case study 5/5: first fully de novo carbohydrate binders
We targeted the blood-group B antigen and achieved a 21% hit rate.
Massive shoutout to my long-term enzyme collaborator @mattpenner.bsky.social at Cambridge for powering this through; excited about what codesign will unlock for enzymes! 17/n
Case study 4/5: Nipah virus
In the Adaptyv binder competition setting, Proteina-Complexa produced a nanomolar binder (56 nM) to NiV-G, targeting a recessed receptor-binding site. We also show how to go beyond partial diffusion and perform joint sequence-structure redesign. 16/n
Case study 3/5: Targeting kinases (PAK1 + CK1δ) with peptide binders
Across mini-protein and short-peptide regimes (<31 aa peptides and 49–74 aa mini-binders), we observed strong hit rates on difficult target sites. 15/n
Case study 2/5: ActRIIA, a target for GLP-1 associated muscle wasting
We designed de novo binders that block myostatin signaling in cells. The tightest binder reached KD = 36 nM, with functional downstream inhibition. 14/n
Case study 1/5: PDGFR
We achieved a 63.5% hit rate, with top binders reaching double-digit picomolar affinity (best reported at 93.6 pM). Very cool collab with @novonordisk.bsky.social 13/n
As part of this campaign, we also ran the biggest head-to-head wet-lab comparison of design methods ever. Proteina-Complexa outperformed baselines in this setting. My personal highlight: we are the only method where codesign shines, outperforming MPNN for the first time! 12/n
Massive campaign: ~1M designed candidates screened across 127 diverse/challenging targets with all-to-all binding readouts. We solved more than 2/3 of the targets given a very limited compute budget, often with quite challenging hotspots and crops. 11/n
That's the method side (come to ICLR in Rio to hear more at our Oral!). But in silico only goes so far, so now to the wet-lab side!
This was a major collaboration with @manifoldbio.bsky.social, @vivabiotech, @novonordisk.bsky.social, @CambridgeUniversity, @DukeUniversity, @lmu.de. 10/n
Quantitatively, these inference-time scaling strategies outperform both prior hallucination-based methods (under normalized compute budgets) and other generative models, setting a new SOTA in in silico binder design. 9/n
One of my highlights: because search is reward-guided, we can optimize for biophysical objectives during generation - including interface hydrogen-bond terms that promote denser interaction networks. With machine-learned force fields (MLFFs) improving rapidly, this becomes more and more powerful! 8/n
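As a toy illustration, a reward of this kind might just be a weighted combination of precomputed per-design metrics; the metric names and weights below are illustrative, not the paper's actual objective.

```python
# Toy composite reward over precomputed per-design metrics. The names and
# weights are illustrative assumptions, not the paper's actual objective.
import jax.numpy as jnp

def reward(fold_confidence, interface_energy, n_interface_hbonds,
           w_fold=1.0, w_energy=0.5, w_hbond=0.2):
    # Higher folding confidence and more interface hydrogen bonds are better;
    # lower (more negative) interaction energy is better.
    return (w_fold * fold_confidence
            - w_energy * interface_energy
            + w_hbond * n_interface_hbonds)

scores = reward(fold_confidence=jnp.array([0.91, 0.85]),
                interface_energy=jnp.array([-12.3, -8.7]),
                n_interface_hbonds=jnp.array([6.0, 3.0]))
```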
Data was also key: with the @martinsteinegger.bsky.social and @sooyoung-cha.bsky.social labs, we introduce Teddymer, a large synthetic binder-target pretraining resource built from domain-domain interactions derived from AFDB monomers (details + links on the project page). 7/n
In practice, this means scaling inference-time compute using strategies like beam search, MCTS, and Feynman–Kac steering to improve candidate quality and physical plausibility. Force-field metrics, interaction energies, folding scores: you choose your reward! 6/n
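To make the Feynman–Kac idea concrete, here is a hedged sequential-importance-resampling sketch in JAX: particles evolve under the generative model and get resampled in proportion to exp(lambda * reward). `denoise_step` and `reward` are stand-ins for the model's transition kernel and your chosen metric, not the paper's implementation.

```python
# Feynman-Kac steering as sequential importance resampling (sketch).
import jax
import jax.numpy as jnp

def denoise_step(key, x, t):
    # Placeholder reverse-diffusion update.
    return x + 0.05 * jax.random.normal(key, x.shape)

def reward(x):
    # Placeholder reward (e.g. folding score, interaction energy).
    return -jnp.sum(x ** 2, axis=-1)

def fk_steering(key, x, n_steps=50, lam=2.0):
    for t in range(n_steps):
        key, k_step, k_resample = jax.random.split(key, 3)
        x = denoise_step(k_step, x, t)
        logw = lam * reward(x)                        # potential per particle
        probs = jax.nn.softmax(logw)                  # normalized weights
        idx = jax.random.choice(k_resample, x.shape[0],
                                shape=(x.shape[0],), p=probs)
        x = x[idx]                                    # multinomial resampling
    return x

key = jax.random.PRNGKey(0)
particles = jax.random.normal(key, (64, 16))          # toy particle ensemble
steered = fk_steering(key, particles)
```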
We then combine the strengths of generative models with optimisation methods like hallucination. We call the inference recipe latent generative search: combining a learned generative prior with test-time search to steer toward better binders. 5/n
Core design choices (conceptual sketch below):
- Joint sequence-structure generation in a partially latent atomistic framework (building on La-Proteina).
- No discrete sequence tokenization loop.
- No mandatory post hoc inverse-folding redesign step: the first codesign model that outperforms MPNN! 4/n
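For intuition only, here is a conceptual JAX sketch of such codesign: one state carries coordinates and continuous per-residue sequence logits, and a single placeholder denoiser updates both together. This illustrates the idea; it is not the actual La-Proteina / Proteina-Complexa architecture.

```python
# Conceptual joint sequence-structure codesign sketch (not the real model).
import jax
import jax.numpy as jnp
from typing import NamedTuple

class DesignState(NamedTuple):
    coords: jnp.ndarray      # (n_res, 3) atom positions
    seq_logits: jnp.ndarray  # (n_res, 20) continuous amino-acid logits

def denoiser(state, t):
    # Placeholder network: both modalities are denoised in one pass, so there
    # is no discrete tokenization loop and no post hoc inverse-folding step.
    return DesignState(coords=0.99 * state.coords,
                       seq_logits=0.99 * state.seq_logits)

def sample(key, n_res=60, n_steps=100):
    k1, k2 = jax.random.split(key)
    state = DesignState(coords=jax.random.normal(k1, (n_res, 3)),
                        seq_logits=jax.random.normal(k2, (n_res, 20)))
    for t in range(n_steps):
        state = denoiser(state, t)
    return state.coords, jnp.argmax(state.seq_logits, axis=-1)

coords, sequence = sample(jax.random.PRNGKey(0))
```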
Proteina-Complexa is a generative binder design framework that supports diverse targets: single-chain proteins, multi-chain complexes, and small-molecule binding contexts. 3/n
Two papers, one story:
1) ICLR 2026 Oral: method + core modeling advances
2) Experimental validation: large-scale wet-lab evidence across many targets and campaigns
Let's start with the method paper. 2/n
📢 We're launching Proteina-Complexa, and after the Jensen keynote mention, we definitely had to post this thread now ;)
Atomistic binder design with generative pretraining + test-time compute, plus large-scale wet-lab validation.
Project page: research.nvidia.com/labs/genair/...
🧵 1/n
Too many REPA / RAE / representation alignment papers lately?
I was lost too, so I wrote a blog post that organizes the space into phases and zooms in on what actually matters for general/molecular ML.
Curious what folks think - link below!
Blog: kdidi.netlify.app/blog/ml/2025...
GPU-accelerated MMseqs2 offers tremendous speedup for homology retrieval, protein structure prediction with ColabFold, and protein structure search with Foldseek. @martinsteinegger.bsky.social @milot.bsky.social @machine.learning.bio
www.nature.com/articles/s41...
MMseqs2-GPU sets new standards in single-query search speed, allows near-instant search of big databases, scales to multiple GPUs, and stays fast even when databases exceed GPU memory. It enables ColabFold MSA generation in seconds and sub-second Foldseek search against AFDB50. 1/n
Paper: www.nature.com/articles/s41...
Code: mmseqs.com
For more details read the thread by the man himself: bsky.app/profile/ncor...
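If you want to try it, a minimal Python driver might look like the sketch below; the `makepaddeddb` module and `--gpu` flag follow the MMseqs2-GPU release docs as I understand them, so check `mmseqs search -h` for the exact options in your installed version.

```python
# Minimal sketch of a GPU-accelerated MMseqs2 search driven from Python.
# Flag/module names are taken from the MMseqs2-GPU release docs as I
# understand them; verify against `mmseqs search -h` for your version.
import subprocess

def run(cmd):
    subprocess.run(cmd, check=True)

run(["mmseqs", "createdb", "targets.fasta", "targetDB"])
run(["mmseqs", "makepaddeddb", "targetDB", "targetDB_pad"])  # GPU-friendly padded DB
run(["mmseqs", "createdb", "query.fasta", "queryDB"])
run(["mmseqs", "search", "queryDB", "targetDB_pad", "resultDB", "tmp", "--gpu", "1"])
run(["mmseqs", "convertalis", "queryDB", "targetDB_pad", "resultDB", "result.m8"])
```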
An incredible project to witness, led by the most incredible dream team @ncorley.bsky.social, @simonmathis.bsky.social and Rohith Krishna with an amazing team inside and outside the Baker lab. Check it out, let us know what you think, and contribute to the codebase! 6/6
The preprint shows how atomworks leads to better reference conformers (and better predictions!), enables advanced features in RF3 like chirality-aware training or ligand templating, and narrows the performance gap to closed-source models. 5/6
`atomworks.ml`, on the other hand, offers advanced dataset featurization and sampling for deep learning workflows, all operating on the canonical AtomArray object from @biotite_python, so that all transforms are traceable and generalizable between models. 4/6
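As a hedged illustration of that pattern, here is real biotite code plus a hypothetical transform in the AtomArray-to-AtomArray style the post describes; `atomworks.ml`'s actual transform API may differ.

```python
# Illustration of an AtomArray -> AtomArray transform in the style described
# above. The biotite calls are real; `ca_only` is a hypothetical example
# transform, not part of atomworks.ml's actual API.
import biotite.structure as struc
import biotite.structure.io.pdb as pdb

def ca_only(atoms: struc.AtomArray) -> struc.AtomArray:
    # Keep only C-alpha atoms, so downstream featurization stays traceable
    # and model-agnostic.
    return atoms[atoms.atom_name == "CA"]

atoms = pdb.PDBFile.read("example.pdb").get_structure(model=1)
ca_atoms = ca_only(atoms)
print(ca_atoms.array_length(), "C-alpha atoms")
```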