Could we accelerate the discovery of the next GLP-1R agonist? 🚀 Here, we introduce PepTune, a multi-objective guided discrete diffusion model that generates target-specific peptides, while optimizing their therapeutic properties! 🪐
📜: arxiv.org/abs/2412.17780
💻: huggingface.co/ChatterjeeLa...
So excited to host the 2nd GEM Workshop at ICLR 2025! 🎉 We have amazing speakers/panelists 🧑‍🔬, money for new AI+Experiment collabs 🤑, and we're partnering with @naturebiotech.bsky.social to get the best papers into review! 📜 Definitely submit your new work and see you in Singapore!! 🇸🇬
So excited to have Christian (@machine.learning.bio) join us at Duke!! 💙 We're building such an amazing AIxBio community with @rohitsingh8080.bsky.social, @alextong.bsky.social, Phil Romero, and others. ESPECIALLY in all things bio-based language models! 💻 🧬 Come join us in Durham! 😈
🚨 Current graduate students! If you're interested in developing and leveraging generative language models for therapeutics design, please apply to FutureHouse's postdoctoral fellowship and indicate my lab as an option! 😃 $125k salary and access to all of their amazing resources! 🌟
Surreal! 🤩 With co-founders Martin and Dina, we started Gameto in 2020 with just a silly graph theory algorithm I developed to predict TFs that could differentiate ovarian cells. 💻➡️🧫 Now, little Mia is here with the tech that has grown out of that work. 🐣 So proud!! 🥰
www.forbes.com/sites/alexyo...
Any AIxBio folks at NeurIPS and want to meet up with me and the lab? So many of our best collaborations have come from meetings at NeurIPS, ICML, and ICLR!! 🌟
We are so grateful to #EndAxD for funding our research leveraging generative language models to design peptide-guided degraders of dysregulated GFAP! 🙏 Please share and consider giving to this wonderful, grassroots organization. 💫 endaxd.org
#EndAxD Instagram Post: www.instagram.com/p/DC7sV2GPst...
Yes, definitely. A learned tokenizer is always more complex. The nice thing about ESM-2 is that it uses per-residue tokenization rather than BPE, SentencePiece, or another subword scheme, which lets us get good residue-level embeddings. :)
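For anyone curious, here's a minimal sketch of what per-residue tokenization buys you, assuming the Hugging Face transformers port of ESM-2 (the checkpoint choice is just illustrative):

```python
# Minimal sketch: ESM-2 tokenizes one token per amino acid residue,
# so sequence positions map 1:1 onto embeddings (plus special tokens).
# Checkpoint is illustrative; any ESM-2 size works the same way.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
model = AutoModel.from_pretrained("facebook/esm2_t6_8M_UR50D")
model.eval()

seq = "MKTAYIAKQR"
inputs = tokenizer(seq, return_tensors="pt")
# One token per residue, plus the <cls> and <eos> specials.
assert inputs["input_ids"].shape[1] == len(seq) + 2

with torch.no_grad():
    out = model(**inputs)
# Residue-level embeddings: drop the two special-token positions.
residue_embeddings = out.last_hidden_state[0, 1:-1]  # (len(seq), hidden_dim)
```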
I worry that during pre-training, the token embeddings ended up with quite expressive representations themselves. Using a special token would work, but you would need to really contextualize its representation, just as the <mask> token's is. Otherwise, I could imagine a drop-off in performance.
Try out Fred's (my PhD student) reimplementation of ESM-2 with FlashAttention, achieving up to 60% memory savings and 70% faster inference! 🚀 No need to change your ESM code — it’s API-compatible! github.com/pengzhangzhi...
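Since the repo link above is truncated, here's a sketch of a standard fair-esm inference call; per the post, code like this should run unchanged against the FlashAttention reimplementation (the exact import path for the new package is whatever the repo documents):

```python
# Standard fair-esm usage; the reimplementation is stated to be
# API-compatible, so this should work as-is with it swapped in.
import torch
import esm

model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model.eval()

data = [("protein1", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")]
labels, strs, tokens = batch_converter(data)

with torch.no_grad():
    results = model(tokens, repr_layers=[33])
embeddings = results["representations"][33]  # per-residue embeddings
```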
Yes we run most of the inference pipelines on A100s and H100s. Haven’t had a problem — A6000s have been fine as well.
Ooh such a good idea!! I’ll try it! :)
Great points! I actually never liked it either and most of the time, it’s hard to effectively debug with everyone watching. 😅
Alright new BlueSky friends, need some advice! 💡 I’m teaching my Generative Models (pLMs, graph models, diffusion, etc.) class at Duke next semester, and want to mix it up! Question: should I do theory on the board ✏️+ live coding 🧑🏾‍💻, or pre-prepared slides 🖥️ with annotated code snippets?
Of course!! Will do! The biggest test will be when we down select generated molecules based on Boltz-1 metrics and we’ll see if they work in the wet lab. 🧫
New RFDiffusion-for-peptides (RFpeptides) paper from @gauravbhardwaj.bsky.social and team at @uwproteindesign.bsky.social! 🌟 Beautiful binding data on 4 highly-structured targets (pLDDT > 90)! 🙌🏾 Not too confident this would work on highly disordered targets, though. 🤔
www.biorxiv.org/content/10.1...
Yeah same. The ByteDance one, Protenix, is quite good, and the engineering from them is always clean!
Yeah nothing easy about it! And the throughput is so low that it’s hard to get a good look at the hit rate of the algorithms without doing a mini display assay. Ahh such is life! 😅
We usually do some hacky ELISAs via biotinylation of the analyte and then SPR the best ones. A horridly cumbersome set of experiments. 😣
Ugh so true!! And as a lab that does peptides, the fact that it’s so slow and expensive to synthesize an 18mer is insanity. 🤦🏾‍♂️ Only alternative is to His-tag purify, which also sucks. And don’t get me started with Kd analysis…still no reliable high-throughput binding affinity measurement. 😣
Agreed!! We’re using the AF3 models to validate our language model-based binder designs to structured targets (and metals, DNA, etc) prior to experimental testing, as a sort of a hint on performance. But of course, the true test is in the lab for us!! 🧫
I’m curious to see how all of the new AF3 mimics perform. 🧐 My lab’s been installing them on our servers, and faster inference and ease-of-use are key for us. Boltz-1 has an early lead, but nothing beats a good frozen pLM with a structure trunk! 😅 Bc accuracy to the PDB isn’t the best metric. 🤷🏾‍♂️
Hi new followers! 🥰 You may know me from Twitter as the sequence-first, pLM guy — hope you will continue to follow my lab’s work! 🥹 While you’re here, check out my lab’s new preprint on delivering pLM-generated degraders via LNPs to degrade cytosolic β-catenin in vivo! www.biorxiv.org/content/10.1...
A strategy that seems to be useful is using heterodimeric PDBs of single proteins and cutting interfaces — there’s a bit more conformational flexibility captured, and our LMs have done better with this noisier data.
We’ve worked to create a similar dataset with minimal leakage, but to do interface prediction from pLM residue embeddings. It’s super tough and we’ve yet to find a good train/test cluster-based split that would achieve this.
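For context, a cluster-based split just means assigning whole sequence clusters, not individual sequences, to train or test. A hypothetical minimal sketch, assuming you've already computed cluster IDs with something like MMseqs2 (all names here are illustrative, not our actual pipeline):

```python
# Hypothetical sketch of a leakage-limiting split: group sequences by a
# precomputed cluster ID, then assign entire clusters to train or test so
# near-duplicate sequences never straddle the split.
import random
from collections import defaultdict

def cluster_split(seq_to_cluster: dict[str, str], test_frac: float = 0.2, seed: int = 0):
    clusters = defaultdict(list)
    for seq_id, cluster_id in seq_to_cluster.items():
        clusters[cluster_id].append(seq_id)
    cluster_ids = sorted(clusters)
    random.Random(seed).shuffle(cluster_ids)
    n_test = max(1, int(len(cluster_ids) * test_frac))
    test = [s for c in cluster_ids[:n_test] for s in clusters[c]]
    train = [s for c in cluster_ids[n_test:] for s in clusters[c]]
    return train, test
```

Of course, for interface prediction the hard part is that leakage can come through either partner of the pair, which is why a clean split is so tough.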
Which paper is this from? I'm not certain the latent spaces are compatible here to create useful protein representations.
SaLT&PepPr is published in Communications Biology! Here, we fine-tune the ESM-2 pLM to identify peptidic binding sites on target-interacting partner sequences. We fuse these "guide" peptides to E3 ubiquitin ligases to degrade disease-causing proteins! Take a read! :) www.nature.com/articles/s42...
Happy to share our early work on generating binding peptides conditioned ONLY on the target sequence! 🌟 PepMLM masks cognate peptides at the end of target protein sequences, and tasks ESM-2 to fully reconstruct the binder region. 😷 arxiv.org/abs/2310.03842
“PepMLM: Target Sequence-Conditioned Generation of Peptide Binders via Masked Language Modeling” 🧶🧬
Fine-tunes the ESM-2 network to achieve “target-conditioned de novo binder design from sequence alone”
arxiv.org/abs/2310.03842
huggingface.co/TianlaiChen/...
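A minimal sketch of the PepMLM-style setup described above: append a fully masked peptide region to the end of the target sequence and let an ESM-2 masked-LM head propose residues for the binder positions. The base checkpoint and greedy argmax decoding here are illustrative assumptions, not the paper's trained weights or exact sampling recipe — the actual fine-tuned model is on the Hugging Face page linked above.

```python
# Sketch: mask a cognate-peptide region at the end of the target sequence
# and reconstruct it with an ESM-2 masked-LM head. Base (non-fine-tuned)
# checkpoint and greedy decoding are stand-ins for illustration only.
import torch
from transformers import AutoTokenizer, EsmForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
model = EsmForMaskedLM.from_pretrained("facebook/esm2_t6_8M_UR50D")
model.eval()

target = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # toy target sequence
peptide_len = 8
masked = target + tokenizer.mask_token * peptide_len

inputs = tokenizer(masked, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Fill each masked binder position with its argmax residue (greedy).
mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
pred_ids = logits[0, mask_positions].argmax(dim=-1)
binder = tokenizer.decode(pred_ids).replace(" ", "")
print(binder)
```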