
Posts by Burstein lab

How do you kill a MRSA superbug armed with 15 different anti-phage defense systems? You make a smarter phage. Check out our latest preprint on overcoming bacterial immunity using defense-guided engineering to build durable therapeutic phage cocktails! Led by Sarah Voss. doi.org/10.64898/202...

3 weeks ago 81 43 3 1

🚀 Our results highlight a promising direction for making protein language models more efficient and scalable. Read all about it! www.biorxiv.org/content/10.6... (5/5)

2 months ago 0 0 0 0

⚡ Reduced alphabets yield shorter inputs and major runtime gains, while maintaining comparable, and sometimes improved, predictive performance. (4/5)

2 months ago 0 0 1 0

🔤 By combining Byte Pair Encoding (BPE) with reduced amino acid alphabets based on residue properties, we train new pLMs and evaluate them across diverse biological tasks, like solubility, enzyme, PPI, and stability prediction. (3/5)

2 months ago 2 0 1 0
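
To make the idea in (3/5) concrete, here is a minimal sketch of BPE on top of a reduced amino acid alphabet. Everything in it is an assumption for illustration: the 8-group mapping, the group symbols, and the helper names `reduce_sequence`, `merge_pair`, and `learn_bpe` are hypothetical and are not taken from the preprint.

```python
from collections import Counter

# Hypothetical 8-letter reduction by residue property (illustrative only,
# not the alphabet used in the preprint).
GROUPS = {
    "AVLIMC": "h",  # hydrophobic
    "FWY": "a",     # aromatic
    "ST": "o",      # hydroxyl-containing
    "NQ": "n",      # amide
    "DE": "-",      # negatively charged
    "KRH": "+",     # positively charged
    "G": "g",       # glycine
    "P": "p",       # proline
}
REDUCE = {aa: symbol for group, symbol in GROUPS.items() for aa in group}


def reduce_sequence(seq):
    """Map each residue to its group symbol; unknown residues become 'x'."""
    return "".join(REDUCE.get(aa, "x") for aa in seq)


def merge_pair(tokens, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged token."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def learn_bpe(sequences, num_merges):
    """Greedy BPE: repeatedly merge the most frequent adjacent token pair."""
    corpus = [list(seq) for seq in sequences]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for tokens in corpus:
            pairs.update(zip(tokens, tokens[1:]))
        if not pairs:
            break
        # Ties are broken by the pair itself so the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        corpus = [merge_pair(tokens, best) for tokens in corpus]
    return merges, corpus


proteins = ["MKVLAAGIV", "MKILAAGLV"]
reduced = [reduce_sequence(p) for p in proteins]
merges, encoded = learn_bpe(reduced, num_merges=1)
print(reduced, merges, encoded)
```

In this toy input the two proteins differ only by swaps between hydrophobic residues, so they collapse to the same reduced string, and the merged tokens make each input shorter than the original sequence.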

⚖️ Unlike natural languages, proteins aren’t clearly separated into “words”, making tokenization tricky. Short tokens create long sentences, while long tokens lead to a sparse vocabulary that is hard to learn. But reducing the alphabet size might help! (2/5)

2 months ago 1 0 1 0

🧬 New preprint alert!
Protein language models have transformed biology, but what about the tokens they read?
In our new preprint, 👑@EllaRannon👑 studies how tokenization choices shape pLM performance and efficiency. 🧵 (1/5)
www.biorxiv.org/content/10.6...

2 months ago 9 1 1 0
B-PPI-DB: A Benchmarking Dataset for Bacterial Protein-Protein Interactions
B-PPI-DB is a benchmarking dataset designed for the training and evaluation of bacterial protein-protein interaction (PPI) prediction models. The dataset is derived from the STRING database (version 1...

4/4 To allow standardized benchmarking for future tool development, we also released B-PPI-DB, a curated bacterial PPI database derived from STRING: doi.org/10.5281/zeno...

We hope B-PPI is just the first of many efficient bacterial PPI predictors!

3 months ago 0 0 0 0

3/4 B-PPI outperforms other rapid methods for bacterial PPI prediction without the high cost of structural folding and generalizes to unseen interactions with minimal fine-tuning.

3 months ago 0 0 1 0

2/4 B-PPI is a cross-attention model for bacterial PPI prediction at scale. Given a protein pair, it uses ProstT5, a structure-aware protein language model, to generate embeddings for both proteins, and outputs the probability that they interact.

3 months ago 0 0 1 0
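
For readers unfamiliar with the term, the sketch below shows generic scaled dot-product cross-attention between two sets of per-residue embeddings, followed by pooling and a sigmoid. It is not B-PPI's architecture: the absence of learned projections, the mean pooling, the linear scoring layer, and the names `cross_attention`, `mean_pool`, and `interaction_probability` are all assumptions made for illustration, and the toy vectors stand in for real ProstT5 embeddings.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_attention(queries, keys_values):
    """Each query row attends over the rows of `keys_values`.

    Learned query/key/value projections are omitted (identity), which is
    an illustrative simplification.
    """
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys_values])
        attended.append([
            sum(w * v[j] for w, v in zip(weights, keys_values))
            for j in range(len(q))
        ])
    return attended


def mean_pool(rows):
    """Average a list of equal-length vectors into one vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def interaction_probability(emb_a, emb_b, weights, bias=0.0):
    """Toy scoring head: attend in both directions, pool, then apply a sigmoid."""
    a_sees_b = mean_pool(cross_attention(emb_a, emb_b))
    b_sees_a = mean_pool(cross_attention(emb_b, emb_a))
    logit = dot(weights, a_sees_b + b_sees_a) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy 2-dimensional "embeddings": protein A has 2 residues, protein B has 3.
emb_a = [[1.0, 0.0], [0.0, 1.0]]
emb_b = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
print(interaction_probability(emb_a, emb_b, weights=[0.1, 0.1, 0.1, 0.1]))
```

In a trained model the projections and scoring weights would be learned from labeled interacting and non-interacting pairs; here they are fixed so the example runs on its own.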

1/4 Ever wanted to predict bacterial protein-protein interactions (PPI) on a large scale?
We wanted to, but realized that no existing algorithm is both rapid and optimized for bacterial protein analysis.
This led our ⭐️Chen Agassy⭐️ to develop B-PPI: doi.org/10.64898/202...

3 months ago 5 3 1 0
Leveraging Natural Language Processing to Unravel the Mystery of Life: A Review of NLP Approaches in Genomics, Transcriptomics, and Proteomics
Natural Language Processing (NLP) has transformed various fields beyond linguistics by applying techniques originally developed for human language to the analysis of biological sequences. This review ...

6/6 🔮 What's next for NLP in biology? We discuss future directions as well. Join us in exploring the future of this exciting field! arxiv.org/abs/2506.02212

10 months ago 1 0 0 0

5/6 💡 Discover how NLP is being applied to:
• Protein structure prediction 🏗️
• Taxonomic classification 🌳
• Mutational effect prediction 🔀
• Gene expression prediction 📈
And much more!

10 months ago 2 0 1 0

4/6 🧩 Tokenization challenges? We've got that covered too! Explore different approaches to breaking down biological sequences and their impact on model performance.

10 months ago 1 0 1 0

3/6 📚 We break down the evolution of NLP models in biology, from classic word2vec to cutting-edge transformers and hyena operators. Understand their strengths, limitations, and exciting applications!

10 months ago 0 0 1 0

2/6 🔬 We dive deep into how NLP techniques are revolutionizing the analysis of biological 'languages':
• DNA 🧬
• RNA 🧬
• Proteins 💪
• Entire genomes 🔍
Learn how these methods are unlocking new insights in genomics!

10 months ago 1 0 1 0

1/6 🧬📊 Curious about the buzz around NLP in biology? Feeling overwhelmed by the rapid developments? We've got you covered! Our review on NLP applications in genomics, by the wonderful Ella Rannon, is now out as a preprint! #NLP #Bioinformatics arxiv.org/abs/2506.02212

10 months ago 47 15 1 1

Hello Bluesky! Burstein lab just joined - are we too late for the party 👀 ??

10 months ago 16 2 1 0