@lorenzopantolini.bsky.social and I are headed to @iclr-conf.bsky.social in Rio soon, with talks about this work at the @gembioworkshop.bsky.social and LMRL workshops. Reach out to chat about representation learning for de novo protein design! 🫖
Posts by Lorenzo Pantolini
I'm excited to share *Stoic*, a method for fast and accurate protein complex stoichiometry prediction directly from sequence. Preprint: www.biorxiv.org/content/10.6... 🧵👇(1/10)
Stoic: Fast and accurate protein stoichiometry prediction (preprint header with authors and affiliations)
Meet Stoic from @daniil-litvinov.bsky.social and @ninjani.bsky.social: embeddings to predict stoichiometry of protein complexes from sequence fast and accurately 🧬🧩💻🤩
www.biorxiv.org/content/10.6...
The Critical Assessment of Structure Prediction (CASP) experiment is calling for prediction targets: Immune Complexes, Organic Ligand-Protein Complexes, Nucleic Acids and Complexes, Conformational Ensembles, Difficult Protein Structures and Complexes. Rule of Thumb: If AlphaFold3 can generate a high-quality model, it is likely not a CASP-grade challenge. If it struggles, we want it.
Is #AI hitting a plateau in structure prediction? Help us find out at CASP17! 🧪🧬
Calling for Targets: Immune complexes, protein-ligand complexes, RNA/DNA, conformational ensembles, membrane proteins, viral origins, and large complexes.
The Rule of Thumb: If AF3 can’t model it, we want it.
Remote homology and protein design: two sides of the same coin. Instead of finding remote homologs, we used TEA to design completely de novo proteins, folding into desired TEA sequences.
I always love working with Jay, and “speed-running” this proof of concept was no exception.
🚀 New paper in @natmethods.nature.com!
We present OpenStructure's powerful scoring capabilities, used to assess predictions in CAMEO and CASP.
Read the full study here:
🔗 doi.org/10.1038/s415...
#StructuralBiology #Bioinformatics #OpenStructure #CASP #CAMEO #ProteinStructure
I’m presenting this work at the EMBO Computational Structural Biology Workshop in Heidelberg #EMBOComp3D this week, and @workshopmlsb.bsky.social in Copenhagen over the weekend. Let’s connect!
Huge thanks to my co-authors: @lauraengist.bsky.social, @ievapudz.bsky.social, @martinsteinegger.bsky.social, @torstenschwede.bsky.social, and especially @ninjani.bsky.social! Couldn't have done this without the whole team, including the Swiss-Model development team and the rest of the Schwede group.
Try it out yourself! github.com/PickyBinders/tea. A web-service for search is coming soon at alphabet.scicore.unibas.ch.
Ultimately, TEA brings deep learning representation to protein sequence bioinformatics algorithms, such as profiles, phylogenetic trees, motif finding, multiple sequence alignments, and more, all while maintaining the speed and low resource consumption of amino acid sequences. (6/n)
We used TEA to connect >1.5 million singletons in AFDB Clusters: proteins that slipped past structure-based clustering approaches due to disordered or repetitive regions, or simply because of low-confidence structure predictions. (5/n)
TEA sequences come with a built-in confidence metric in the form of Shannon entropy, which we saw correlates with pLDDT, and can be used to filter out uncertain predictions. (4/n)
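For anyone curious what entropy-based filtering could look like, here is a minimal sketch. It assumes each position of a sequence comes with a probability distribution over alphabet letters; the function names, the input layout, and the threshold are all hypothetical and are not the TEA API.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of one probability distribution."""
    # Zero-probability letters contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mean_entropy(per_position_probs):
    """Average per-position entropy of one sequence (lower = more confident)."""
    entropies = [shannon_entropy(p) for p in per_position_probs]
    return sum(entropies) / len(entropies)


def filter_confident(sequences, max_mean_entropy):
    """Keep only sequences whose mean entropy is at or below the cutoff.

    `sequences` maps a name to a list of per-position distributions.
    The cutoff is an illustrative assumption, not a recommended value.
    """
    return {
        name: probs
        for name, probs in sequences.items()
        if mean_entropy(probs) <= max_mean_entropy
    }


if __name__ == "__main__":
    toy = {
        "confident": [[1.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]],
        "uncertain": [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]],
    }
    print(sorted(filter_confident(toy, max_mean_entropy=1.0)))
```

A sharply peaked distribution gives entropy near zero, while a uniform one over four letters gives 2 bits, so the cutoff decides how much per-position uncertainty is tolerated.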
Running MMseqs2 with TEA gives extremely fast and highly sensitive results, similar to structural searches, even on unseen folds! Check out our ablations to see how we ended up with the final architecture. (3/n)
By using a contrastive objective, we trained an alphabet enriched with structural information, without the need for the actual structure. This approach ensures that remote homologs expressed with TEA maintain high sequence identity. (2/n)
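As a rough illustration of what a contrastive objective does, here is a generic InfoNCE-style loss for a single anchor embedding. This is a textbook formulation chosen for illustration, not necessarily the loss used to train TEA; the function names, the cosine similarity, and the temperature value are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: -log softmax of the positive's similarity.

    The loss is small when the anchor is closer to the positive than to
    any negative, and large otherwise.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    related = [1.0, 0.0]      # e.g. an embedding of a homologous residue
    unrelated = [-1.0, 0.0]   # e.g. an embedding of an unrelated residue
    print(info_nce(anchor, related, [unrelated]))
    print(info_nce(anchor, unrelated, [related]))
```

Minimising a loss of this kind pulls embeddings of related pairs together and pushes unrelated ones apart, which is the general mechanism that lets related inputs end up with similar representations.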