Posts by Nick Boyd

Target Policy Optimization In RL, given a prompt, we sample a group of completions from a model and score them. Two questions follow: which completions should gain probability mass, and how should the parameters move to realize...

I haven't been following the RL literature closely but I find it hard to believe this hasn't been tried before. I want it to work; anyone tried it?

1 day ago 3 1 0 0
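
The setup in the linked preview (sample a group of completions, score them, decide which should gain probability mass) can be sketched as follows. This is a minimal illustration of one common answer to the first question, standardizing scores within the group as GRPO-style methods do. It is not a description of Target Policy Optimization, whose details are cut off in the preview, and `group_advantages` and the example scores are hypothetical.

```python
import statistics


def group_advantages(scores):
    """Standardize scores within one group of completions for a single prompt.

    Assumption: positive values mark completions that should gain probability
    mass, negative values mark completions that should lose it.
    """
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    if std == 0.0:
        # Every completion scored the same, so there is no preference signal.
        return [0.0 for _ in scores]
    return [(s - mean) / std for s in scores]


# Hypothetical scores for four sampled completions of one prompt.
scores = [0.2, 0.9, 0.6, 0.3]
advantages = group_advantages(scores)

# Indices of completions scored above the group mean.
gain = [i for i, a in enumerate(advantages) if a > 0]
print(gain)
```

How the parameters should then move (the second question in the preview) is left open here.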
mosaic/examples/protenij_vhh.py at main · escalante-bio/mosaic: composite-objective protein design.

it's unfortunate that they used the AF3 training date cutoff again. hallucination example here: github.com/escalante-bi...

1 week ago 1 0 0 0
GitHub - bytedance/Protenix: Toward High-Accuracy Open-Source Biomolecular Structure Prediction.

added Protenix V2 to mosaic -- I wonder if improved ab-antigen prediction translates to better hallucinated designs.

1 week ago 7 0 1 0

I think we (as a field) are getting closer to fast, cheap de novo binders but still lots of work to do

2 weeks ago 1 0 0 0

We thought designing VHH binders with a $50 computational budget was just out of reach -- turns out this was right. For many applications high computational cost isn't a barrier, but if we want these things to be like PCR primers we need to improve cost and reliability.

2 weeks ago 1 0 1 0
Boolean Biotech

@btnaughton.bsky.social's VHH competition results are out! There were two entrants: me and Brian. We both lost

2 weeks ago 6 2 1 0

I really like this model, it's readable, fast, and generates very good binders. Also, it can be used for much more than beam search. For instance, in this notebook we demonstrate side-chain packing and inverse folding.

3 weeks ago 2 1 0 0
mosaic/examples/proteina.py at main · escalante-bio/mosaic: composite-objective protein design.

Added a JAX translation of the excellent Proteina-Complexa (from nvidia, @kdidi.bsky.social, @karstenkreis.bsky.social) to mosaic. You can do beam search with any mosaic loss (e.g. protenix + mpnn) and JAX will generate efficient GPU/TPU code.

3 weeks ago 13 6 1 0

as noted in the README I don't endorse this project. Also, the first time you launch it, it will take a few minutes to build a container, download model weights, and JIT compile design + ranking functions. then it should be fast

Start with the boltzgen RL weights; much faster than hallucination

1 month ago 1 0 0 0

Do you have modal.com credits you need to light on fire? Do you want to feel like a hacker while vibe designing protein minibinders? Try running `uvx --from 'mosaic-tui @ git+https://github.com/escalante-bio/mosaic-tui' mosaic --pdb 1ubq` from your terminal. No need to install anything.

1 month ago 9 1 1 0
SKILL.md GitHub Gist: instantly share code, notes, and snippets.

Still needs serious handholding but Claude Code works really, really well with the method I've previously used to convert torch projects to JAX. This skill is itself vibed, so there are some minor errors...

1 month ago 0 0 0 0

Translated @moalquraishi.bsky.social's OpenFold3 (OF3p2 for those in the know) into JAX. Fully open models + data are rad. You can use this now in github.com/escalante-bi... for ranking or binder design

1 month ago 23 3 1 0
GitHub - diff-use/sampleworks: Framework for modified sampling from biomolecular generative models

this same modification might substantially improve guidance for folding models (e.g. for @diffuseproject.bsky.social's sampleworks): my hunch is guidance eventually fails as you can't push a vanilla AF3 structure module beyond structures consistent with the trunk's embedding.

1 month ago 2 0 0 0
From SeedProteo (https://arxiv.org/abs/2512.24192). Similar modification in PPIFlow (https://github.com/Mingchenchen/PPIFlow/tree/main)

Two of my favorite recent binder design papers (PPIFlow and SeedProteo) make the same modification to the AF3 architecture: instead of using a fully-amortized, triangle-layer-free diffusion module, pass noisy coordinates into the main trunk. Obviously computationally expensive, but seems to work

1 month ago 8 2 1 0

This has been up for a while but I haven’t really publicized it. Introducing ciMIST: sparse, self-consistent network models of local and global protein conformational entropy, learned from molecular dynamics. This helps with analyzing MD and connecting to experiments

www.biorxiv.org/content/10.1...

1 month ago 32 10 2 2

On a technical level it's pretty cool that a sequence-only reward function works for RL of a structure generating model. Clearly better reward functions + RL algorithms are possible, but post-training certainly seems promising for these models

1 month ago 1 0 0 0
Relative performance of base model vs. RL'd model on held-out target

This appears to generalize outside of the training structures. Some interesting and potentially disturbing trends in the RL-generated structures: they're almost all pure helix bundles with a huge enrichment of A's and E's -- it's possible that's the source of the cofolding confidence improvements.

1 month ago 2 0 1 0
Teaching generative models to hallucinate There are currently two main approaches to computational protein binder design: optimization (exemplified by BindCraft) and generative models (e.g. BoltzGen). A design campaign using either method l...

New post: blog.escalante.bio/teaching-gen.... You can massively improve in silico metrics for BoltzGen using standard post-training techniques (with a structure model as your reward function). If this holds up (no wetlab testing yet!) you could get binders in seconds rather than hours...

1 month ago 18 5 1 0

Really good intro to some of the tools you might need in a protein binder design effort

1 month ago 4 0 0 0

sometimes I wonder if Claude Code really does make me more productive. sure, it implemented this much faster than I could have, but I probably would have had the sense not to...

1 month ago 21 1 4 0
mosaic/examples/protenij_vhh.py at main · escalante-bio/mosaic: composite-objective protein design.

I've been testing this model a bit for design: github.com/escalante-bi... . Seems to work very well in general. For VHH could probably use higher PLM weight or something to better constrain the CDRs; for globular binders results look good.

1 month ago 2 0 0 0
PDBx/mmCIF Data Dictionary: Dictionary Index mmcif_pdbx.dic

every time I have to revisit the mmcif spec I want to cry: mmcif.wwpdb.org/dictionaries...

1 month ago 1 0 0 0

I wonder if an architecture where the entire model participated in diffusion (including triangle layers) would work better for these applications. Obvious computational efficiency reasons not to do this, but it sometimes seems the trunk has completely made up its mind before diffusion...

2 months ago 1 0 0 0
Representative results for a single target

this is from a hyperparameter sweep with 10 benchmark targets using roughly this code: gist.github.com/nboyd/8e4f32.... I haven't actually run BindCraft; could be mosaic/protenix-specific. Also, these binders *are* different even if they have similar iptms; in vitro results might be worse

2 months ago 2 0 0 0

sad this .gif doesn't constitute reproducible scientific truth

2 months ago 0 0 1 0

if you don't like the huge extended helices or alpha solenoid proteins you're getting from hallucination-based protein design methods (bindcraft, mosaic, etc), increasing the scale of the initial sequence noise (typically Gumbel) increases funkiness without hurting final metrics like ipTM

2 months ago 21 2 1 0
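
The noise-scale knob mentioned in the post above can be illustrated with a small sketch. This is plain Python rather than mosaic's or BindCraft's actual initialization code. It assumes "scale" means multiplying standard Gumbel noise by a scalar before it is used as the initial per-position logits over the 20 amino acids; `init_logits`, `noise_scale`, and `mean_max_prob` are hypothetical names.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sample_gumbel(rng):
    # Standard Gumbel(0, 1) sample via the inverse CDF: -log(-log(U)).
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def init_logits(length, noise_scale, seed=0):
    # One row of 20 logits per sequence position, all drawn from scaled Gumbel noise.
    rng = random.Random(seed)
    return [[noise_scale * sample_gumbel(rng) for _ in AMINO_ACIDS]
            for _ in range(length)]


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def mean_max_prob(logits):
    # Average probability of the most likely residue at each position.
    return sum(max(softmax(row)) for row in logits) / len(logits)


small = init_logits(length=50, noise_scale=0.1)
large = init_logits(length=50, noise_scale=5.0)
print(mean_max_prob(small), mean_max_prob(large))
```

With the same seed, the larger scale gives a starting distribution that commits more strongly to a random residue at each position, while the smaller scale starts close to uniform. Whether that is the mechanism behind the more varied folds is not something this sketch shows.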

Excellent, comprehensive rundown of the state of bio lab automation by @owlposting1.bsky.social

In retrospect, it's an important topic that has had almost zero discussion over the years!

A fun surprise to see some decade-old(!) work show up in there too.

www.owlposting.com/p/heuristics...

2 months ago 6 2 1 0

Naturally I had to add this to mosaic. Here's a VHH designed using `protenix_base_20250630_v1.0.0`. Example notebook here: github.com/escalante-bi...

2 months ago 4 1 1 0
GitHub - bytedance/Protenix: Toward High-Accuracy Open-Source Biomolecular Structure Prediction.

Protenix v1.0 is out with some very impressive performance numbers (exceeding AF3 performance on protein-protein complexes)

2 months ago 8 3 1 1

Obviously these models aren't perfect and are trained on finite data, the data generating distribution doesn't really exist, there are better ways to control generative models, etc etc etc. This is still often a surprisingly illuminating way to think about these models.

2 months ago 1 0 0 0