Excited to speak at the
@stanforddata.bsky.social Sustainability Data Science conference today!
I’ll be presenting ClimateX, our work on LLM confidence calibration in the climate science domain.
📄 Paper: arxiv.org/abs/2311.17107
💻 Code: github.com/rlacombe/Cli...
Posts by Romain Lacombe
That’s a bit of relief at least! 🐚🦪
Poster time! Fantastic workshop on experimental design in AI for Science today.
Paper: arxiv.org/abs/2503.17368
Workshop: edai4science.github.io
Nobel laureate talk time!
With David Baker at the Stanford workshop on Experimental Design in AI for Science today.
Pierre Clostermann during WWII
Thanks to Hannah @wastyk.bsky.social & Will at InterfaceBio for advice on sactipeptides, and to the authors of AlphaFold 2 & 3, ESMFold, Boltz-1 (@gcorso.bsky.social), and more!
📝 Link to paper: biorxiv.org/content/10.1...
💻 Link to code: github.com/rlacombe/Sac...
Excited to be presenting this work at the
@stanfordbiosci.bsky.social workshop on Experimental Design in AI for Science next week!
More info on the event here: edai4science.github.io
These results highlight the limits of the (extremely successful) evolutionary approach to protein structure prediction.
We will likely need physics-informed models to reach accurate 3D structure predictions on rare PTMs and out-of-domain proteins beyond evolutionary priors.
No model achieved crosslink distances near the experimentally observed 1.8Å sulfur-to-α-carbon bond length, with average performance at 10-12Å.
Most models incorrectly predicted disulfide bonds instead of thioether crosslinks, highlighting their reliance on evolutionary priors.
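For anyone curious what checking a crosslink distance looks like in practice, here is a minimal sketch. It assumes the coordinates (in Å) of the cysteine sulfur and the acceptor α-carbon have already been extracted from a predicted structure. The function names, the 0.5 Å tolerance, and the example coordinates are hypothetical, chosen for illustration only, and are not taken from the paper's code.

```python
import math

# Experimentally observed sulfur-to-alpha-carbon bond length quoted in the thread (angstroms).
OBSERVED_BOND_LENGTH = 1.8


def crosslink_distance(sulfur_xyz, alpha_carbon_xyz):
    """Euclidean distance, in angstroms, between two 3D points."""
    return math.dist(sulfur_xyz, alpha_carbon_xyz)


def is_plausible_crosslink(sulfur_xyz, alpha_carbon_xyz, tolerance=0.5):
    """True if the distance is within `tolerance` of the observed bond length.

    The 0.5 angstrom default is an arbitrary illustrative choice.
    """
    distance = crosslink_distance(sulfur_xyz, alpha_carbon_xyz)
    return abs(distance - OBSERVED_BOND_LENGTH) <= tolerance


# Hypothetical coordinates, not taken from any real structure.
sulfur = (0.0, 0.0, 0.0)
near_carbon = (1.8, 0.0, 0.0)   # at bonding distance
far_carbon = (6.0, 8.0, 0.0)    # 10 angstroms away, like the typical predictions
```

With these toy points, `is_plausible_crosslink(sulfur, near_carbon)` passes and `is_plausible_crosslink(sulfur, far_carbon)` does not, which is the kind of gap the thread describes.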
We evaluate 6 leading protein structure predictors—AlphaFold 2 and 3, Boltz-1, ESMFold, OmegaFold, and RoseTTAFold 2—on the 10 known sactipeptides.
All models exhibit limited performance, with an average GDT-TS of only 11.5% for known sactipeptides and 12.6% for unknown ones.
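For context on the metric: GDT-TS is commonly defined as the average, over cutoffs of 1, 2, 4, and 8 Å, of the percentage of α-carbons that land within that cutoff of their reference position. The sketch below is a simplified version that assumes the two structures are already superimposed and list the same residues in the same order. Full implementations search over many superpositions, which this skips, and the function name is hypothetical.

```python
import math

GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # standard GDT-TS distance cutoffs, in angstroms


def gdt_ts(predicted_ca, reference_ca):
    """Simplified GDT-TS (0 to 100) for two pre-superimposed lists of C-alpha coordinates."""
    if len(predicted_ca) != len(reference_ca) or not reference_ca:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Per-residue deviation between predicted and reference C-alpha positions.
    distances = [math.dist(p, r) for p, r in zip(predicted_ca, reference_ca)]
    # Fraction of residues within each cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in GDT_TS_CUTOFFS
    ]
    return 100.0 * sum(fractions) / len(GDT_TS_CUTOFFS)
```

A perfect match scores 100, so an average around 11.5% means very few α-carbons fall even within the loosest cutoffs.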
This helps us probe how deep learning models generalize beyond evolutionary priors.
We introduce a new, zero-shot benchmark for structure models: measuring how the 3D conformations predicted for sactipeptides match the geometry imposed by their known thioether crosslinks.
These non-canonical crosslinks are challenging and underrepresented in structural datasets. Crucially, *only 5* of the 10 known sactipeptides have a resolved 3D conformation with a PDB entry. *But* their 2D crosslink structure is known, which constrains possible 3D geometries!
Evolution-based protein structure prediction models have revolutionized structural biology but struggle with rare post-translational modifications (PTMs). We evaluate them on sactipeptides, a rare class of 10 known peptides with unique sulfur-to-α-carbon thioether crosslinks.
🔔🔔 New bioML paper alert! 🔔🔔
We evaluate evolutionary protein structure prediction models using the geometric structure of non-canonical crosslinks from a rare class of post-translationally modified peptides.
A thread. 🧵
"But you're so much more handsome than that!"
– my beloved wife, when shown a Ghiblified selfie of ourselves.
Right answer!! I'm swooning! 🥰🥰🥰
Pre-ordered! 🛍️🤞
This is much deeper than it sounds at first. Absolute nugget of wisdom.
Oooh that looks beautiful! When can we pre-order?
Anecdotally, @dcrainmaker.com has a treasure trove of GPS accuracy tests in land and marine conditions for most watches and trackers on the market.
This is an excellent talk by @francesarnold.bsky.social at ACS this week. Worth a read!
cen.acs.org/people/award...
LLMs are Fahrenheit 451 for digitized human knowledge.
There’s something comforting knowing that the ghost of every written word will live on, for as long as we can preserve open weights, and the GPUs to bring them to life.
Chemical engineering rules everything around us. ⚗️
AI 🤖, bio 🧬, chips 💻, energy 🔋, fertilizers 🚜, materials ⛏️, space 🛰️ ...
All the technologies shaping the world sit downstream from chemistry fundamentals. 👩‍🔬
Hold on to your periodic table and let's go! 🚀
Few things are more heart-wrenching than a walk by a wild, pristine, remote beach at the confines of civilization…
…only to find billions of tiny speckles of microplastics all over the sand, which we *know* will end up in our bodies by way of the food chain.
Very sad.
This guy is *such* a jerk. Ugh! Only good thing is it’s out in the open for anyone to see.
1 failure per 1000 GPU•days is the going hardware MTF rate for large training runs these days.
#NeurIPS2024
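To get a feel for that rate, here is a back-of-the-envelope sketch. It assumes failures are independent and arrive at a constant rate (a Poisson model, which is a simplification of mine and not something stated in the talk). The function names are hypothetical.

```python
import math

GPU_DAYS_PER_FAILURE = 1000  # the rate quoted above: 1 failure per 1000 GPU-days


def expected_failures(num_gpus, days):
    """Expected number of hardware failures over a run of `days` on `num_gpus` GPUs."""
    return num_gpus * days / GPU_DAYS_PER_FAILURE


def prob_at_least_one_failure(num_gpus, days):
    """Probability of at least one failure, under the assumed Poisson model."""
    return 1.0 - math.exp(-expected_failures(num_gpus, days))
```

Under these assumptions, a hypothetical 10,000-GPU run lasting 30 days would expect `expected_failures(10_000, 30)` = 300 failures, roughly ten per day.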
Craziest talk of the day so far: you can recognize which room someone stands in with embeddings of firing rate maps of their place cell neurons!
#NeurIPS2024
Deep learning is chemical engineering for information: layers are process units, diffusion is transport, entropy is the loss function…
Don’t believe me? Then why is the NeurIPS opening talk literally starting with the Haber-Bosch process!?
Made it to #NeurIPS2024!
Here until Sunday, ping me for coffee, a walk and talk, or some Tim Hortons maple glazed donuts! 🇨🇦
I used to add little incandescent light bulbs in my notes, for highlights.💡
But it’s time to decarbonize. 🔋
So now I add little LEDs! ⚡️
We can just fix things.