Karsten Kreis (@karstenkreis) Bsky

mosaic/examples/proteina.py at main · escalante-bio/mosaic (composite-objective protein design)

Added a JAX translation of the excellent Proteina-Complexa (from NVIDIA, @kdidi.bsky.social, @karstenkreis.bsky.social) to mosaic. You can do beam search with any mosaic loss (e.g. protenix + mpnn), and JAX will generate efficient GPU/TPU code.

3 weeks ago

🔸 Find all details, links to the papers, model weights and code, and a link to the Teddymer dataset on our project page.

🔥 Project page: research.nvidia.com/labs/genair/...

Finally, please check Kieran's (@kdidi.bsky.social) great thread on Proteina-Complexa, too: bsky.app/profile/kdid...

1 month ago

🔸 Grateful to our amazing academic and industry partners for this exciting collaboration!

@manifoldbio.bsky.social, @novonordisk.bsky.social, Viva Biotech, Duke University (Soderling Lab), Cambridge University (Hollfelder Lab), @lmu.de (Khmelinskaia Group), @SeoulNatlUni (Steinegger Lab)

(19/n)

1 month ago

🔸 This was an effort by a brilliant team at NVIDIA & partners.

NVIDIA shoutouts: bsky.app/profile/kdid..., Danny Reidenbach, Zuobai Zhang, Guoqing Zhou, Zhonglin Cao, Tomas Geffner, Micha Livne, @machine.learning.bio, Emine Kucukbenli, @arashv.bsky.social

A privilege working with this team.

(18/n)

1 month ago

🔸 A key highlight: We achieved de novo design of carbohydrate binders. We targeted the Blood Group B antigen, reaching a 21% hit rate.

To the best of our knowledge, no prior computational methods have achieved de novo design against these challenging polar targets.

(17/n)

1 month ago

🔸 Targeting Viruses: As part of the recent Adaptyv binder competition, we used Proteina-Complexa to design a nanomolar binder (56 nM) against the Nipah virus G protein, successfully targeting its recessed receptor-binding site (see visualization of the experimental hit).

(16/n)

1 month ago

🔸 Kinase targets and varying binder size:

Spanning two distinct size regimes, we designed peptide (<31 amino acids) and miniprotein binders (49–74 amino acids) for kinase targets like CK1δ and PAK1. Proteina-Complexa achieved 40–50% hit rates for these difficult motifs.

(15/n)

1 month ago

🔸 Activin Binders:

Next, we designed de novo binders for the Activin receptor type IIA (ActRIIA) that block myostatin signaling in cells. Our tightest binder showed a KD of 36 nM and functional inhibition in downstream experiments.

(14/n)

1 month ago

🔸 Aside from the massive screen, we conducted case studies of different targets in separate experiments.

For instance, we generated de novo binders for PDGFR: We achieved a very high 63.5% hit rate, with top candidates reaching double-digit picomolar affinity (93.6 pM).

(13/n)

1 month ago

🔸 As part of this large screen, we also conducted a large-scale systematic wet lab comparison to contemporary binder design methods. Proteina-Complexa outperforms the baselines; see chart below.

In particular, its self-generated sequences work well, with no need for sequence re-design.

(12/n)

1 month ago

🔸 First, we performed a high-throughput massive-scale screen across 127 diverse and challenging targets. We screened around 1 million candidates in total, measuring all-to-all binding.

86 targets yielded experimentally validated hits, with 74 of those being specific.

(11/n)

1 month ago
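
For readers who want the screen numbers above as rates, here is a small worked calculation. It uses only the three counts stated in the post (127 targets, 86 with validated hits, 74 with specific hits); the variable names are illustrative and the derived percentages are not quoted from the paper.

```python
# Counts taken from the post above; variable names are illustrative only.
targets_screened = 127
targets_with_hits = 86
targets_with_specific_hits = 74

# Fraction of all screened targets with at least one validated hit.
hit_fraction = targets_with_hits / targets_screened
# Fraction of the hit-yielding targets whose hits were specific.
specific_of_hits = targets_with_specific_hits / targets_with_hits
# Fraction of all screened targets with specific hits.
specific_of_all = targets_with_specific_hits / targets_screened

print(f"targets with hits:          {hit_fraction:.1%}")      # 67.7%
print(f"specific among hit targets: {specific_of_hits:.1%}")  # 86.0%
print(f"specific among all targets: {specific_of_all:.1%}")   # 58.3%
```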

🔸 Now to the experimental results. We conducted a massive campaign in collaboration with @ManifoldBio, @viva_biotech, @novonordisk, @Cambridge_Uni, @DukeU, @LMU_Muenchen.

🧬 "Latent Generative Search unlocks de novo Design of Untapped Biomolecular Interactions at Scale."

(10/n)

1 month ago

🔸 Quantitatively, our inference-time scaling strategies (we use, for instance, MCTS, Feynman-Kac Steering and Beam Search) outperform previous hallucination methods under normalized compute budgets, setting a new state-of-the-art in in-silico binder design.

(9/n)

1 month ago

🔸 Beyond regular binding, our model excels at atomistic motif scaffolding for enzyme design. On the AME benchmark, Proteina-Complexa significantly outperforms RFDiffusion2, faithfully reconstructing complex active site geometries.

(8/n)

1 month ago

🔸 We can explicitly optimize for biophysical properties during generation. In particular, Proteina-Complexa leverages interface hydrogen bond optimization, which steers the generative search toward candidates with denser, more stable interaction networks.

(7/n)

1 month ago

🔸 To overcome the scarcity of multimer data, we introduce Teddymer: a synthetic dataset of 0.5M clustered binder-target pairs, constructed by splitting AFDB monomers into structural domains, simulating realistic protein-protein interactions.

Link to data on project page.

(6/n)

1 month ago
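
To make the domain-splitting idea above concrete, here is a minimal sketch that turns one monomer with per-residue domain labels into candidate domain pairs that touch each other. It is not the Teddymer pipeline: the function name, the 8 Å C-alpha cutoff, and the minimum contact count are assumptions chosen for illustration, and the clustering and filtering steps mentioned in the post are not shown.

```python
from itertools import combinations
from math import dist

def domain_pairs_in_contact(ca_coords, domain_ids, cutoff=8.0, min_contacts=2):
    """Return (domain_a, domain_b, n_contacts) for domains of one chain that touch.

    ca_coords:  list of (x, y, z) C-alpha positions, one per residue.
    domain_ids: list of domain labels, one per residue (from any domain segmentation).
    cutoff and min_contacts are hypothetical thresholds, not values from the paper.
    """
    # Group residue indices by their domain label.
    domains = {}
    for idx, dom in enumerate(domain_ids):
        domains.setdefault(dom, []).append(idx)

    pairs = []
    for a, b in combinations(sorted(domains), 2):
        # Count residue pairs across the two domains that lie within the cutoff.
        contacts = sum(
            1
            for i in domains[a]
            for j in domains[b]
            if dist(ca_coords[i], ca_coords[j]) <= cutoff
        )
        if contacts >= min_contacts:
            pairs.append((a, b, contacts))
    return pairs

# Toy chain: domains A and B pack against each other, domain C is far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0),      # domain A
          (0.0, 5.0, 0.0), (3.8, 5.0, 0.0),      # domain B
          (100.0, 0.0, 0.0), (103.8, 0.0, 0.0)]  # domain C
labels = ["A", "A", "B", "B", "C", "C"]
pairs = domain_pairs_in_contact(coords, labels)
print(pairs)  # [('A', 'B', 4)]
```

Each returned pair could then be treated as a synthetic two-chain example, with one domain playing the binder and the other the target.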

🔸 "Latent Generative Search": We unify generative modeling with test-time optimization for high-performance protein design. Scaling compute at inference via strategies like beam search and MCTS, we can steer the model toward higher-quality, physically realistic binders.

(5/n)

1 month ago
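
The beam-search flavor of inference-time scaling mentioned above can be sketched generically as follows. This is a toy illustration, not the Proteina-Complexa or mosaic implementation: `beam_search`, `toy_propose`, and `toy_score` are hypothetical names, the "generator" is Gaussian noise on a short vector, and the "reward" is a negative squared distance standing in for a real design objective.

```python
import random

def beam_search(init_state, propose, score, n_steps, beam_width, branch, rng):
    """Keep the `beam_width` highest-scoring partial candidates at every step."""
    beams = [init_state]
    for step in range(n_steps):
        # Expand every surviving beam into `branch` stochastic continuations.
        candidates = [
            propose(state, step, rng) for state in beams for _ in range(branch)
        ]
        # Rank all continuations by reward and keep only the best ones.
        candidates.sort(key=score, reverse=True)
        beams = candidates[:beam_width]
    return beams  # sorted from best to worst

# Hypothetical stand-ins for a generative step and a design reward.
TARGET = [1.0, -2.0, 0.5]

def toy_propose(state, step, rng):
    # One noisy "generation" step; a real model would take a denoising step here.
    return [x + rng.gauss(0.0, 0.3) for x in state]

def toy_score(state):
    # Higher is better: negative squared distance to the toy target.
    return -sum((x - t) ** 2 for x, t in zip(state, TARGET))

rng = random.Random(0)
beams = beam_search([0.0, 0.0, 0.0], toy_propose, toy_score,
                    n_steps=20, beam_width=4, branch=8, rng=rng)
best = beams[0]
print(best, toy_score(best))
```

Raising `branch` or `beam_width` spends more compute per step in exchange for a wider search, which is the trade-off the post refers to as scaling compute at inference.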

🔸 Technical Core: We use La-Proteina's partially latent flow matching framework, which co-designs protein sequence and atomistic structure.

No discrete amino acid tokens.

No separate inverse folding (no sequence re-design).

Fully end-to-end atomistic generation.

(4/n)

1 month ago
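
As background for the flow matching mentioned above, here is a minimal sketch of the sampling side of generic flow matching on a toy vector. It is not La-Proteina's partially latent model: the function names are hypothetical, and the velocity uses a known endpoint `x1`, whereas a trained model would predict the velocity with a neural network that never sees `x1`.

```python
import random

def conditional_velocity(x, x1, t):
    # For the straight-line path x_t = (1 - t) * x0 + t * x1, the velocity that
    # carries the current point x to the endpoint x1 by time 1 is (x1 - x) / (1 - t).
    return [(b - a) / (1.0 - t) for a, b in zip(x, x1)]

def euler_sample(x0, x1, n_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t stays strictly below 1, so the division above is safe
        v = conditional_velocity(x, x1, t)
        x = [a + dt * vi for a, vi in zip(x, v)]
    return x

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(3)]  # sample from the prior
data_point = [1.0, -2.0, 0.5]                    # toy "data" endpoint
sample = euler_sample(noise, data_point, n_steps=10)
print(sample)  # ends at the data point, up to floating-point rounding
```

Because the state here is a continuous vector transported by an ODE, no discrete tokens are involved, which is the property the post highlights.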

🔸 Proteina-Complexa is a new protein binder design framework leveraging generative pretraining and test-time compute scaling.

It can generate in silico binder candidates for diverse targets, including single and multi-chain proteins and small molecule ligand targets.

(3/n)

1 month ago

🔸 We present two papers covering Complexa's core method development and a large-scale experimental validation effort.

Let’s dive into the ICLR 2026 Oral paper on the method first: "Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute"

(2/n)

1 month ago

📢📢 Proteina-Complexa 📢📢

Atomistic Binder Design with Generative Pretraining and Test-Time Compute + Experimental Validation at Scale

⭐️ Project page (research.nvidia.com/labs/genair/...) for:

📜 Method paper (ICLR 2026 Oral)
🧬 Wet lab paper
🛠️ Code & Models
📁 Data

🧵 Thread

(1/n)

1 month ago

Partially-latent flow matching enables sequence-structure codesign of large proteins and functional motif scaffolding.

@kdidi.bsky.social @machine.learning.bio @karstenkreis.bsky.social @arashv.bsky.social

arxiv.org/html/2507.09...

9 months ago

3⃣ Efficient Molecular Conformer Generation with SO(3) Averaged Flow-Matching and Reflow
openreview.net/forum?id=1B1...

⭐️ I'm also on a panel on synthetic data (synthetic-data-iclr.github.io)!

I'm excited to discuss research and to meet new and old friends and collaborators! 🎉

(5/n)

11 months ago

@gembioworkshop.bsky.social papers:

1⃣ EquiJump: Protein Dynamics Simulation via SO(3)-Equivariant Stochastic Interpolants
arxiv.org/abs/2410.09667 (oral)
(screenshot below)

2⃣ Hierarchical Protein Backbone Generation with Latent and Structure Diffusion
arxiv.org/abs/2504.09374

(4/n)

11 months ago

3⃣ Energy-Based Diffusion Language Models for Text Generation
arxiv.org/abs/2410.21357
Posters 2

4⃣ Truncated Consistency Models
arxiv.org/abs/2410.14895
Posters 4
(screenshot below)

(3/n)

11 months ago

Main track:

1⃣ Proteina: Scaling Flow-based Protein Structure Generative Models
research.nvidia.com/labs/genair/...
Orals 3B, posters 4
(video below)

2⃣ ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
arxiv.org/abs/2503.05025
Orals 2C, posters 3

(2/n)

11 months ago

🔥 I'm at ICLR'25 in Singapore this week - happy to chat!

📜 With wonderful co-authors, I'm co-presenting 4 main conference papers and 3 @gembioworkshop.bsky.social papers (gembio.ai), and I contribute to a panel (synthetic-data-iclr.github.io).

🧵 Overview in thread.

(1/n)

11 months ago

🔥 ProtComposer (ICLR'25 Oral) is a Swiss Army knife:

(i) Manually create new protein structure layouts? ✅
(ii) Generation with favorable designability/diversity/novelty trade-offs? ✅
(iii) Spatially edit given proteins? ✅

Very original work by the amazing @hannes-stark.bsky.social and Bowen Jing!🔥

1 year ago

Proteina: Scaling Flow-based Protein Structure Generative Models

🔸 Check out our project page (research.nvidia.com/labs/genair/...), our paper (arxiv.org/abs/2503.00710), and our code (github.com/NVIDIA-Digit...).

🔥 We released 8 sets of weights, for all experiments, for you to play with! 🔥

Enjoy! And see you at ICLR'25! 😀

(11/11)

1 year ago

🔸 Proteina is a fantastic collaboration with wonderful colleagues at NVIDIA:

🔥 Tomas Geffner*, @kdidi.bsky.social*, Zuobai Zhang*, Danny Reidenbach, Zhonglin Cao, @jyim.bsky.social, Mario Geiger, @machine.learning.bio, Emine Kucukbenli, @arashv.bsky.social, @karstenkreis.bsky.social* 🔥

(10/n)

1 year ago
