It's finally done (enough for a preprint)! Today, in collaboration with so many folks at Meta (shout-out Daniel Levine and Muhammed Shuaibi, who put in superhuman levels of work), Berkeley, Stanford, NYU, and more, I'm proud to announce the Open Molecules 2025 (OMol25) dataset!
#CompChem #ML 🧪 ⚗️
Posts by Nathan C. Frey
The phrase "pathologically disorganized" is used to describe a neural network trained to predict protein structure from sequence embeddings
A post by @ncfrey.bsky.social and @amyxlu.bsky.social on repurposing ESMFold for protein design, featuring one of my favorite phrases in the field ncfrey.substack.com/p/hit-the-vi...
🔥 Benchmark Alert! MotifBench sets a new standard for evaluating protein design methods in motif scaffolding.
Why does this matter? Reproducibility & fair comparison have been lacking—until now.
Paper: arxiv.org/abs/2502.12479 | Repo: github.com/blt2114/Moti...
A thread ⬇️
This was a massive, 3+ effort with 60+ collaborators, and we're still only scratching the surface of what's possible in biology, drug design, and ML. Incredibly proud of this team and honored to be a part of this research and to share it with the world.
We @prescientdesign.bsky.social Genentech pre-printed our "Lab-in-the-loop for therapeutic antibody design." We built a general ML system to accelerate molecule design for challenging, therapeutically relevant targets.
www.biorxiv.org/content/10.1...
CDS PhD student @angie-chen.bsky.social presents LLOME, using LLMs to optimize synthetic sequences with potential applications for drug design.
Co-led by @activelearner.bsky.social & @ncfrey.bsky.social and others at @prescientdesign.bsky.social
nyudatascience.medium.com/language-mod...
New blog post with Aya Ismail on our recent work, which introduces a fundamentally new way to build foundation models that are interpretable by design for scientific discovery.
tinyurl.com/cb-plm-blog
My team @prescientdesign.bsky.social is hiring! 🎉
Join me, @stephenra.com, @keunwoochoi.bsky.social, @kyunghyuncho.bsky.social, and the Large Molecule Drug Discovery AI/ML and LLM teams to work on basic research and applications of LLMs to drug discovery.
Link to apply: tinyurl.com/prescient-lmdd
Join us in NYC as a graduate student intern at Prescient Design, Genentech this summer to work on fundamental research in 3D generative models, with applications to protein design!
Apply directly, and please share with anyone who may be interested!
roche.wd3.myworkdayjobs.com/en-US/ROG-A2...
Incredible work led by @amyxlu.bsky.social introducing PLAID, an all-atom co-generation method for proteins that requires only sequence inputs for training data! Read Amy's thread below, with links to the preprint, code, and model weights!
👇
get in early on the best podcast for bio, chem, and ML. fill the void in your life of entertaining, technical content made by experts for experts (and enthusiasts!).
A common question nowadays: Which is better, diffusion or flow matching? 🤔
Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.
Overview of the BioM3 framework for protein design via natural language prompts.
Wet-lab validation
BioM3: a model that generates proteins conditioned on text prompts, with wet-lab validation!
www.biorxiv.org/content/10.1...
A weekend project from a while back -- this little package (with no dependencies) allows you to interact with pymol remotely.
I use it a lot for my protein design workflows together with @biotite.bsky.social.
Just `pip install pymol-remote`
The first list filled up, so here's a second list of AI for Science researchers on bluesky.
Let me know if I missed you / if you'd like to join!
bsky.app/starter-pack...
I'm making a list of AI for Science researchers on bluesky — let me know if I missed you / if you'd like to join!
go.bsky.app/AcP9Lix
🧪 For people interested in AI & enzymes (enzyme engineering, design, discovery, ...), I'm assembling a starter pack for us.
DM if you'd like to be included!
go.bsky.app/MhfaQBh
bsky.app/profile/kevi...
Two BioML starter packs now:
Pack 1: go.bsky.app/2VWBcCd
Pack 2: go.bsky.app/Bw84Hmc
DM if you want to be included (or nominate people who should be!)
In a gratuitous attempt to acquire more followers myself 😁, I've made a start on a "starter pack". Hopefully as more people from 🐦 make it over to 🦋, we can extend this a bit. Suggestions welcome!
I've noticed not all accounts seem to be eligible to be added. Anyone know what's up with that? 🤔
I tried to make a bioml starter pack. DM if you want me to add or remove you.
go.bsky.app/2VWBcCd
I’ve started an AI in healthcare starter pack! Let me know who is missing. go.bsky.app/7PeNwep
Made a biotech starter pack because I want to meme with y'all on this site instead of the old one
go.bsky.app/TbKxUEk 🧪🧬💻
👋
Hello BlueSky! 👋 I'm Nathan, a scientist at Prescient Design • Genentech. I lead an incredibly talented team of researchers working to transform drug discovery through computation, AI/ML, engineering, and data-centric thinking.
Find out more about our team and our work on ncfrey.github.io