Matthieu Schapira (@mattschap) Bsky

What if, instead of trying to predict properties of every molecule, we focus on simply ranking them? After all, when running Bayesian optimization (BO) for drug/materials discovery, what matters is picking the best candidates first.

Paper: doi.org/10.1063/5.02...
Code: github.com/gkwt/rbo
[1/5]
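
To make the ranking idea concrete, here is a minimal sketch of a pairwise ranking loss next to a top-k pick. It is an illustration only, not the code from the linked repository: the names `pairwise_ranking_loss` and `top_k` and the toy numbers are assumptions made up for this example. The point it shows is that scores on the wrong scale still select the right candidates as long as their order is right.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def pairwise_ranking_loss(scores, measured):
    """RankNet-style loss: penalize pairs whose score order contradicts the measured order."""
    total, n_pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if measured[i] > measured[j]:
                # Candidate i is truly better, so its score should exceed that of j.
                total += softplus(scores[j] - scores[i])
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0

def top_k(scores, k):
    """Indices of the k highest-scoring candidates."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Toy data (hypothetical): measured property values for four candidates.
measured = [0.1, 0.9, 0.5, 0.3]
well_ranked = [-2.0, 5.0, 1.0, 0.0]   # wrong scale, correct order
badly_ranked = [5.0, -2.0, 0.0, 1.0]  # reversed order

loss_good = pairwise_ranking_loss(well_ranked, measured)
loss_bad = pairwise_ranking_loss(badly_ranked, measured)
picked = top_k(well_ranked, 2)
```

A regression loss would penalize `well_ranked` heavily for its scale, while the pairwise loss only cares that better candidates score higher.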

7 months ago

CACHE 7 is launched with support from the @gatesfoundation.bsky.social and unpublished data from Damian Young at @bcmhouston.bsky.social, Tim Willson at @thesgc.bsky.social, and Neelagandan Kamaria at InSTEM. The challenge: design selective PGK2 inhibitors. We'll test them experimentally.
bit.ly/4lnVYOs

8 months ago

New Practical Cheminformatics Post
patwalters.github.io/Three-Papers...

8 months ago

New @chemrxiv.bsky.social preprint!

RoboChem-Flex is a powerful, low-cost (<5k EUR), modular self-driving lab for chemical synthesis

We showcase 6 studies (photochemistry, biocatalysis, cross coupling, ee ...), all optimized with different configurations & ML

🔗 chemrxiv.org/engage/chemr...

9 months ago

1/4
🚀 Announcing the 2025 Protein Engineering Tournament.

This year’s challenge: design PETase enzymes, which degrade the type of plastic in bottles. Can AI-guided protein design help solve the climate crisis? Let’s find out! ⬇️

#AIforBiology #ClimateTech #ProteinEngineering #OpenScience

9 months ago

Identification of nanomolar adenosine A2A receptor ligands using reinforcement learning and structure-based drug design - Nature Communications
Here the authors combine a deep generative model with structure-based drug design and prospectively validate functionally active, nanomolar, A2A adenosine receptor ligands and solve their crystal stru...

Closing the loop on #GenAI for #GPCR #SBDD

www.nature.com/articles/s41...

tinyurl.com/yeyhfr7j

9 months ago

Now in JCIM: pubs.acs.org/doi/full/10....

10 months ago

🚀 After two+ years of intense research, we’re thrilled to introduce Skala — a scalable deep learning density functional that hits chemical accuracy on atomization energies and matches hybrid-level accuracy on main group chemistry — all at the cost of semi-local DFT ⚛️🔥🧪🧬

10 months ago

CACHE4 results are out! All previously known CBLB ligands shared the same scaffold. Congrats to Keunwan Park, who successfully designed a chemically novel series, and to the experimental team at @thesgc.bsky.social, and thanks to @conscience-network.bsky.social for greasing the wheels! bit.ly/4mYNe3r

10 months ago

This week's cover of @rsc.org @chemicalscience.rsc.org AIMNet2: a neural network potential to meet your neutral, charged, organic, and elemental-organic needs. pubs.rsc.org/en/content/a... #compchem #chemsky

10 months ago

Excited to unveil Boltz-2, our new model capable not only of predicting structures but also binding affinities! Boltz-2 is the first AI model to approach the performance of FEP simulations while being more than 1000x faster! All open-sourced under MIT license! A thread… 🤗🚀

10 months ago

#compchem #machinelearning If you want to know more about #FeNNix-Bio1, the first foundation model able to perform accurate, long-timescale condensed-phase molecular simulations of biological systems at quantum accuracy, join me at the upcoming live presentations:
www.linkedin.com/feed/update/...

10 months ago

Our new preprint PharmacoForge: Pharmacophore Generation with Diffusion Models is out now! PharmacoForge quickly generates pharmacophores for a given protein pocket that identify key binding features and find useful compounds in a pharmacophore search. Check it out! 🧪 doi.org/10.26434/che...

10 months ago

The Open Molecules 2025 dataset is out! With >100M gold-standard ωB97M-V/def2-TZVPD calcs of biomolecules, electrolytes, metal complexes, and small molecules, OMol is by far the largest, most diverse, and highest quality molecular DFT dataset for training MLIPs ever made 1/N

11 months ago

Check out AceFF 1.1

huggingface.co/Acellera/Ace...

11 months ago

A Foundation Model for Accurate Atomistic Simulations in Drug Design
Neural network potentials now offer robust alternatives to electronic structure and empirical force fields computations for the on-the-fly production of the potential energy surfaces required in atomi...

#compchem New preprint: "A Foundation Model for Accurate Atomistic Simulations in Drug Design"

FeNNix-Bio1, a foundation #machinelearning model for biosimulations

doi.org/10.26434/che...
#compchemsky #biosky

Great work by T. Plé & the teams @lct-umr7616.bsky.social & @qubit-pharma.bsky.social

11 months ago

👋 🤖 Meet El Agente, an autonomous AI for performing computational chemistry, made by the Matter Lab @uoft.bsky.social. This #LLM-powered multi-agent system, which makes computational chemistry more accessible, will soon be available worldwide. Sign up for the launch: acceleration.utoronto.ca/news/meet-el...

11 months ago

First DREAM Target 2035 Drug Discovery Challenge
'First DREAM Target 2035 Drug Discovery Challenge' (Synapse ID: syn65660836) is a project on Synapse. Synapse is a platform for supporting scientific collaborations centered around shared biomedic...

@thesgc.bsky.social is generating large/open screening data and inviting data scientists to train their ML models via DREAM challenges:
1- train your model on DEL data
2- retrospectively predict 138 ASMS true positives
3- predict new hits. We will test them and publish together.
bit.ly/3YXVKoT

11 months ago

"De novo prediction of protein structural dynamics"

I'll be presenting an overview of the field tomorrow at a workshop. Link to a PDF copy of the presentation: delalamo.xyz/assets/post_...

11 months ago

Encode protein structures as a series of discrete tokens, train a language model, and sample protein structural conformations given the sequence.

arxiv.org/abs/2410.18403

11 months ago

AlphaFold is amazing but gives you static structures 🧊

In a fantastic team effort, @mcagiada.bsky.social and @emilthomasen.bsky.social developed AF2χ to generate conformational ensembles representing side-chain dynamics using AF2 💃

Code: github.com/KULL-Centre/...
Colab: github.com/matteo-cagia...

1 year ago

A depiction of the active learning cycle: sample-train-predict-repeat

New preprint: Finding Drug Candidate Hits With a Hundred Samples: Ultra-low Data Screening With Active Learning doi.org/10.26434/che... #compchem
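
The sample-train-predict-repeat cycle can be sketched in a few lines. This is a generic greedy loop under stated assumptions, not the preprint's protocol: the function `active_learning`, the 1-nearest-neighbour stand-in model, the integer "candidates", and the toy `oracle` are all hypothetical placeholders for a real library, model, and assay.

```python
import random

def active_learning(pool, oracle, n_init=5, batch=5, rounds=3, seed=0):
    """Greedy active learning over a pool of numeric candidates."""
    rng = random.Random(seed)
    labeled = {}  # candidate -> measured value

    # Sample: measure a small random starting batch.
    for x in rng.sample(pool, n_init):
        labeled[x] = oracle(x)

    for _ in range(rounds):
        # Train: the stand-in model is a 1-nearest-neighbour lookup
        # over everything measured so far.
        def predict(x):
            nearest = min(labeled, key=lambda z: abs(z - x))
            return labeled[nearest]

        # Predict: rank every unmeasured candidate by predicted value.
        unlabeled = [x for x in pool if x not in labeled]
        ranked = sorted(unlabeled, key=predict, reverse=True)

        # Repeat: measure the top predictions and fold them into the training set.
        for x in ranked[:batch]:
            labeled[x] = oracle(x)

    return labeled

# Toy setup (hypothetical): 100 candidates, activity peaks at candidate 70.
pool = list(range(100))
oracle = lambda x: -(x - 70) ** 2
results = active_learning(pool, oracle)
best = max(results, key=results.get)
```

With these defaults only 20 of the 100 candidates are ever measured; swapping the greedy pick for an uncertainty-aware acquisition function is the usual next step.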

1 year ago

AI drug development’s data problem
The future of drug discovery may be artificial intelligence (AI), but its present is not. AI is in its infancy in the field. To help AI mature, developers need nonproprietary, open, large, high-qualit...

The future of AI-drug discovery hinges on large, high-quality, standards-based datasets. No country or firm can build this alone. We need to construct datasets together in the open: www.science.org/doi/10.1126/... @tridentpct.bsky.social @conscience-network.bsky.social @mcgilluniversity.bsky.social

1 year ago

Run BioEmu in Colab - just click "Runtime → Run all"! Our notebook uses ColabFold to generate MSAs, BioEmu to predict trajectories, and Foldseek to cluster conformations.
Thanks @jjimenezluna.bsky.social for the help!
🌐 colab.research.google.com/github/sokry...
📄 www.biorxiv.org/content/10.1...

1 year ago

AlphaFold is running out of data — so drug firms are building their own version
Thousands of 3D protein structures locked up in big-pharma vaults will be used to create a new AI tool that won’t be open to academics.

AlphaFold, the revolutionary, Nobel prize-winning tool for predicting protein structures, has a problem: it’s running low on data

https://go.nature.com/3FJRyTd

1 year ago

Conscience Symposium on Open Drug Discovery - Submission and registration are open! - Conscience
Registration is now open for the second annual Conscience Symposium on Open Drug Discovery! Join us at the Society for Arts and Technology in Montreal on April 7-8, 2025, for two days of insightful ta...

🚀One week left to register for our Symposium on Open Drug Discovery! Join us in Montreal April 7-8 for an exciting program showcasing how open science and AI are driving drug discovery. Some sessions are already sold out, so register now! Register by April 2nd: conscience.ca/symposium2025

1 year ago

The QCML dataset, Quantum chemistry reference data from 33.5M DFT and 14.7B semi-empirical calculations - Scientific Data

This is a remarkable paper! A gigantic dataset of highly precise, highly accurate first-principles data. This builds on years of work on @fhi-aims.bsky.social - enabling dispersion-corrected hybrid DFT that covers a huge swath of chemical space. Congrats to the authors!

doi.org/10.1038/s415...

1 year ago

🚀 100 scientists. 31 countries. And we’re just getting started.

#MAINFRAME is uniting global experts to drive AI-powered hit finding. ML models trained on real experimental data. Predictive tools tested in the lab.

🔗 Now is the time to join: aircheck.ai/mainframe

1 year ago

The Need for Continuing Blinded Pose- and Activity Prediction Benchmarks
Computational tools for structure-based drug design (SBDD) are widely used in drug discovery and can provide valuable insights to advance projects in an efficient and cost-effective manner. However, d...

New Perspective on Community Benchmarking in Structure-Based Drug Design (SBDD)!

#SBDD predictions need reliable benchmarks - diverse targets, high-quality affinity & structural data, and blinded validation. Let’s make it happen!

🔗 Read more: doi.org/10.1021/acs....

#DrugDiscovery #CompChem

1 year ago

The code & camera-ready version of our #ICLR2025 paper on "Multi-domain Distribution Learning for De Novo Drug Design" are now available

📚 Paper: openreview.net/forum?id=g3V...

💻 Code: github.com/LPDI-EPFL/Dr...

(1/4)

1 year ago
