Hi Elias. I know how you feel! This whole project was triggered by a failed attempt using 60 days of GPU compute... Just run IARA, and you will see a sea of "blue" surfaces. Let me know how it goes and if you encounter any problems.
Posts by Leonardo Almeida-Souza
13/13 🧵
IARA facilitates binder design for everyone. Grab your protein of interest and, within seconds, check if a binder design campaign is viable. Available via CLI, PyMOL/Chimera plugin, and Google Colab!
github.com/leodeals/IARA
Special thanks: @helsinki.fi @hilife-helsinki.bsky.social @csc.fi
12/13 🧵
"Leo, I use RFdiffusion or BoltzGen for my binder design. Is IARA useful?" Yes! The principles BindCraft uses are shared among these generative models, so IARA can be used alongside those tools too.
11/13 🧵
How good is IARA? Well, it rocks! It correctly predicts where BindCraft produces binders in synthetic proteins (unseen in training), on mammalian surface proteins, and on proteins with unnatural “dark folds” (Harteveld, Cell Systems 2024).
10/13 🧵
The resulting model is IARA (Interface Analysis and Recognition Architecture), named after a Brazilian folkloric figure from the Amazon River—the "mãe das águas". Fun fact: my dad’s side has roots in tribes from that region (Tupi-Guarani).
9/13 🧵
Residues on the interaction surfaces were labeled, proteins were converted to graphs, and a model was trained using an attention GNN (GATv2).
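To make the "proteins were converted to graphs" step concrete, here is a minimal sketch, not IARA's actual code. It assumes one node per residue, one coordinate per residue (for example the Cα atom), and an edge between any two residues closer than a hypothetical 8 Å cutoff; the thread does not say how edges were defined. The function name `residue_graph` is made up for illustration.

```python
import math


def residue_graph(coords, labels, cutoff=8.0):
    """Build a residue-level graph from per-residue coordinates.

    coords: list of (x, y, z) tuples, one per residue (assumed: C-alpha atoms).
    labels: list of 0/1 flags, 1 meaning the residue is on the interaction surface.
    cutoff: distance threshold in angstroms (assumed value, not from the thread).

    Returns (edge_index, labels), where edge_index is [sources, targets] and
    every undirected contact is stored in both directions.
    """
    if len(coords) != len(labels):
        raise ValueError("coords and labels must have the same length")

    sources, targets = [], []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # Store both directions so message passing is symmetric.
                sources += [i, j]
                targets += [j, i]
    return [sources, targets], list(labels)


if __name__ == "__main__":
    coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
    edge_index, y = residue_graph(coords, labels=[1, 1, 0])
    print(edge_index, y)
```

The two-row source/target layout is the edge format that graph libraries such as PyTorch Geometric expect, so a structure like this could be handed to an attention layer such as GATv2 together with the per-residue features.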
8/13 🧵
For each target-binder complex, I extracted 7 features: hydrophobicity, residue charge, patch charge (10Å radius), residue surface exposure, side chain surface accessibility, local residue density, and local geometry.
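As an illustration of one of these features, the patch charge could be computed roughly as below. This is a sketch under stated assumptions, not the IARA implementation: it assumes one coordinate per residue, simple formal charges at neutral pH (K and R as +1, D and E as -1, everything else 0), and that a residue counts towards its own patch. Only the 10 Å radius comes from the thread; `RESIDUE_CHARGE` and `patch_charges` are hypothetical names.

```python
import math

# Assumed formal charges at neutral pH (illustrative values only).
RESIDUE_CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def patch_charges(residues, radius=10.0):
    """Sum the charges of all residues within `radius` angstroms of each residue.

    residues: list of (one_letter_code, (x, y, z)) tuples.
    Returns one patch charge per residue, in input order.
    """
    charges = [RESIDUE_CHARGE.get(aa, 0.0) for aa, _ in residues]
    patches = []
    for _, centre in residues:
        total = 0.0
        for (_, xyz), charge in zip(residues, charges):
            # The residue itself is at distance 0, so it is included.
            if math.dist(centre, xyz) <= radius:
                total += charge
        patches.append(total)
    return patches


if __name__ == "__main__":
    demo = [("K", (0.0, 0.0, 0.0)), ("D", (5.0, 0.0, 0.0)), ("E", (20.0, 0.0, 0.0))]
    print(patch_charges(demo))
```

In the demo, the lysine and aspartate sit 5 Å apart and cancel each other out, while the glutamate at 20 Å only sees its own charge.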
7/13 🧵
With the best folded RFdiffusion designs, I started the long journey to assemble a dataset. 55,000 hours of GPU compute later, I gathered around 1,000 successful runs and started preparing the data for training.
6/13 🧵
(Keep in mind, I was doing this project on my own in my spare time, between meetings, lectures, and running a 10-person research group!)
5/13 🧵
To avoid these issues (which would require heavy manual curation), I used an RFdiffusion -> ProteinMPNN -> AlphaFold pipeline to generate random proteins with specified sizes (150-250 AAs) to be used as targets.
4/13 🧵
To focus on BindCraft's logic and avoid biases, targets couldn't be part of complexes, or have PTMs, transmembrane domains, signal peptides, or amphipathic helices. They also had to be <350 AAs to avoid GPU OOM errors.
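The exclusion criteria above can be written down as a simple filter. This is only a sketch: the `Target` record and its field names are hypothetical, and it assumes the annotations (complex membership, PTMs, and so on) have already been looked up elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical record describing a candidate target protein."""
    name: str
    length: int  # number of amino acids
    in_complex: bool = False
    has_ptm: bool = False
    has_tm_domain: bool = False
    has_signal_peptide: bool = False
    has_amphipathic_helix: bool = False


def is_eligible(target, max_length=350):
    """Return True if the target passes every criterion listed in the post."""
    excluded = (
        target.in_complex
        or target.has_ptm
        or target.has_tm_domain
        or target.has_signal_peptide
        or target.has_amphipathic_helix
    )
    # Strictly shorter than max_length, matching "<350 AAs".
    return target.length < max_length and not excluded


if __name__ == "__main__":
    candidates = [Target("small_clean", 200), Target("too_long", 420),
                  Target("glycosylated", 180, has_ptm=True)]
    print([t.name for t in candidates if is_eligible(t)])
```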
3/13 🧵
I wondered if an ML model could learn the rationale behind BindCraft's binder design. For that, I needed to select a large number of targets to build a dataset of BindCraft runs.
2/13 🧵
This research was inspired by how variable results can be after thousands of GPU hours using BindCraft. While for some targets I could generate hundreds of binders fast, for others (looking at you, AP2 µ2 subunit 😤), I had none after hundreds of hours of compute.
1/13 🧵
‼️Preprint alert.
Are you interested in synthetic protein binders? Do you need to generate them?
I present to you IARA: a fast, slim ML model that tells you which protein surfaces are good for binder design.
www.biorxiv.org/content/10.6...
You will get an email from me in the next few days! Either with questions or saying how perfectly it worked! 🤞
I think this is the most useful tool for the democratization of scripts for biological research in a long time.
Excited to share my first PhD project: LabConstrictor 📒🐍
Have you ever created a Jupyter notebook with all your love 🫶, only for others to be unable to install it 🥲? LabConstrictor comes to solve this!
Check out how it works in the preprint 🔗 arxiv.org/abs/2603.107... or follow this thread ⬇️
This is super awesome. Very useful for making tools available to the wider community in the long term. I am about to release a tool myself and I will try to compile it with LabConstrictor straight away.
Beautifully done.
Today, our animation synthesizing decades of research on actin-mediated endocytosis in budding yeast was published:
journals.biologists.com/jcs/article/...
The result of a fantastic Iwasa-Drubin lab collaboration.
@margotriggi.bsky.social @jiwasa.bsky.social
movie.biologists.com/video/10.124...
Great tool. After all, just because your manuscript is a preprint does not mean it should not be beautifully formatted.
Great opportunity for a Research Support Officer to join @cellbiol-mrclmb.bsky.social, supporting @deriverylab.bsky.social in their studies of the molecular mechanisms behind receptor sorting in endosomes.
More info: www.nature.com/naturecareer...
Apply by 10 JUL
#ScienceJobs
Are you looking for a postdoc?
I will be looking for 2-4 people after the summer to join the lab!!
Do not hesitate to reach out and spread the word!!
Thanks AP.
Wow. A grant and a paper rejection on the same day... I need a beer. And tomorrow, head down and keep going. It is going to be ok.
Thanks!
Yep. I've been there.
The second is far more silly. When I finished figure 1, there was an awkward gap between the graphs, so I found something to fill it. But anyway, I am sure they will not be in the final version of the paper... they are very likely the first for the chop when space restrictions and revisions kick in.
Fair point. I sometimes wonder if we go too far. The reasons for the pics are two: first, most people will only open the paper, scan the figures, and move on. The more visual info I give them, the better they may remember the paper in the future and come back for a proper read.
Please let us know if you have any comments on our manuscript or on the technique. We are happy to share the things we have produced and/or to collaborate and help with your protein of interest.
#Share #OpenScience