I'm excited to announce some major updates to our ProteinEBM paper with Chenxi Ou @sokrypton.org!
Posts by Sergey Ovchinnikov
Our paper with @sokrypton.org using AlphaFold2 to predict small-molecule binding sites in proteins is now out in Nature Methods! 🧵
rdcu.be/e7SnX
www.nature.com/articles/s41...
New preprint🚨
Imagine (re)designing a protein via inverse folding. AF2 predicts a structure for the designed sequence with pLDDT 94 & you get 1.8 Å RMSD to the input. Perfect design?
What if I told u that the structure has 4 solvent-exposed Trp and 3 Pro where a Gly should be?
Why to be wary🧵👇
As a bonus, here's a video of ProteinEBM folding up the fast-folder NTL9, rendered in stunning 2D by py2Dmol from @sokrypton.org! We hope models like ProteinEBM can serve as a step toward solving the "real" protein folding problem.
An energy-based model of protein conformational space can be used to predict structure from sequence, sample from the conformational landscape, rank structures, and predict mutation effects.
@sokrypton.org
www.biorxiv.org/content/10.6...
I'm super excited to announce the first preprint of my PhD, together with Chenxi Ou and @sokrypton.org!
ML has revolutionized protein modeling, but crucial challenges remain. For example, we can't reliably predict complicated protein structures without MSAs, which limits what we can design.
An interesting study from @aidenkzj.bsky.social, @abulnaga.bsky.social and @sokrypton.org that builds on our Progres model to find pairs of proteins with circular permutations:
www.biorxiv.org/content/10.1...
Thrilled to share that the final piece of my PhD work is now on bioRxiv! biorxiv.org/content/10.1... With support from @nvidia and the @NSF, we used AlphaFold to screen 1.6M+ protein pairs, revealing thousands of potential novel PPIs. All data can be viewed at predictomes.org/hp
Adding support for interactive MSA viewing, including coloring by entropy! Auto download from AFDB (for both uniprot and pdb entries). (4/4)
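For readers wondering what "coloring by entropy" involves, here is a minimal sketch of per-column Shannon entropy for an alignment. It is not py2Dmol's code: the function name `column_entropy`, the list-of-strings input, and the choice to ignore gaps are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(msa):
    """Return the Shannon entropy (in bits) of every alignment column.

    `msa` is a list of equal-length aligned sequences (hypothetical input
    format). Gap characters ('-') are ignored, which is an assumption;
    a viewer may treat gaps differently.
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    entropies = []
    for i in range(length):
        # Count residues in column i, skipping gaps.
        counts = Counter(seq[i] for seq in msa if seq[i] != "-")
        total = sum(counts.values())
        if total == 0:
            # Column made only of gaps: define its entropy as zero.
            entropies.append(0.0)
            continue
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
        entropies.append(entropy)
    return entropies
```

Fully conserved columns score 0 and variable columns score higher, so the values can be mapped straight onto a color scale.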
Save vectorized SVG for infinitely ♾ zoomable figures. (including support for contacts).
See example:
biorxiv.org/content/10.1...
Zhidian Zhang @yoakiyama.bsky.social @yehlincho.bsky.social @jajoosam.bsky.social
(3/4)
Adding record button to save animation. (2/4)
A few py2Dmol updates 🧬
py2dmol.solab.org
Integration with AlphaFoldDB (will auto fetch results). Drag and drop results from AF3-server or ColabFold for interactive experience! (1/4)
Is 3D dragging you down? Wish you could instead use the 2D ColabFold representation for all your work? 🤓
Introducing: py2Dmol 🧬
(feedback, suggestions, requests are welcome)
Working on the protein-hunter-chai google colab notebook. 😈
@yehlincho.bsky.social
Will it bind? A little worried about all the "TTTTTTT" 🧐 But looks cool 😎
Thrilled to announce our new preprint, “Protein Hunter: Exploiting Structure Hallucination within Diffusion for Protein Design,” in collaboration with @Griffin, @GBhardwaj8 and @sokrypton.org
🧬Code and notebooks will be released by the end of this week.
🎧Golden- Kpop Demon Hunters
Looks like someone has already tried to replace me with an AI agent 🫣
hmmm... any ideas why mpnn would make things worse for af2, but make things about the same as af2 when used with boltz?
Exciting to see our protein binder design pipeline BindCraft published in its final form in @Nature ! This has been an amazing collaborative effort with Lennart, Christian, @sokrypton.org, Bruno and many other amazing lab members and collaborators.
www.nature.com/articles/s41...
Now that OpenCRISPR is in nature and rekindled the 'what's-a-novel-sequence' debate, I'm happy to share an app to check this, which I built for fun some time ago.
fuerstlab.shinyapps.io/SeqNovelty/
quick 🧵
Yeah, we would expect the pseudo-likelihood to be maximized for best paired MSA!
We'll add an option to add custom MSA inputs to the notebook (later tonight). 😎
MMseqs2 v18 is out
- SIMD FW/BW alignment (preprint soon!)
- Sub. Mat. λ calculator by Eric Dawson
- Faster ARM SW by Alexander Nesterovskiy
- MSA-Pairformer’s proximity-based pairing for multimer prediction (www.biorxiv.org/content/10.1...; avail. in ColabFold API)
💾 github.com/soedinglab/M... & 🐍
It was expected. The surprising part for me was that AlphaFold doesn't seem to care....
See study here from @lindseyguan.bsky.social
www.biorxiv.org/content/10.1...
This makes me wonder if the reason it doesn't seem to care is because it was trained on and evaluated on poorly paired MSAs. 🤔
2Y69 is technically "prokaryotic" as it's from the mitochondria. 🧐
We find this to be true across a number of targets: the method used to pair and filter sequences makes a big difference. We find this to be important when trying to disentangle paralogs from orthologs (4/4).
The big difference is in the pairing. The MMseqs2 server pairs sequences based on species, while our old HHblits MSAs were paired based on genome proximity (number of genes apart). Working w/ @milot.bsky.social and @martinsteinegger.bsky.social we implemented the proximity filtering in the server (3/4)
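To make the idea of pairing by genome proximity concrete, here is a rough sketch. It is not the MMseqs2 server's implementation: the function `pair_by_proximity`, the `(genome_id, gene_index, sequence)` tuple format, and the `max_gene_distance` threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pair_by_proximity(hits_a, hits_b, max_gene_distance=20):
    """Pair homologs of protein A and protein B found in the same genome.

    Each hit is a (genome_id, gene_index, sequence) tuple, where gene_index
    is the position of the gene along the genome (hypothetical format).
    For every genome, keep the single A/B pair whose genes are closest,
    and only if they are at most `max_gene_distance` genes apart.
    """
    # Group the B hits by genome for quick lookup.
    b_by_genome = defaultdict(list)
    for genome, index_b, seq_b in hits_b:
        b_by_genome[genome].append((index_b, seq_b))

    best = {}  # genome_id -> (distance, seq_a, seq_b)
    for genome, index_a, seq_a in hits_a:
        for index_b, seq_b in b_by_genome.get(genome, []):
            distance = abs(index_a - index_b)
            if distance > max_gene_distance:
                continue  # too far apart on the genome
            if genome not in best or distance < best[genome][0]:
                best[genome] = (distance, seq_a, seq_b)

    # Drop the distance and return one paired row per genome.
    return {genome: (a, b) for genome, (_, a, b) in best.items()}
```

Species-based pairing would accept any A/B hits from the same organism. The distance filter is what separates neighboring genes from distant paralogs in the same genome.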
Side story: While working on the Google Colab notebook for MSA pairformer, we encountered a problem: the MMseqs2 ColabFold MSA did not show any contacts at protein interfaces, while our old HHblits alignments showed clear contacts 🫥... (2/4)
Excited to re-share work from
@yoakiyama.bsky.social and Zhidian Zhang on MSA pairformer. (1/4)