Posts by Koushik

Can we simulate realistic evolutionary trajectories and “replay the tape of life”? In this work, we propose a flexible, generalizable deep learning framework for modeling how the entire protein sequence evolves over time while capturing complex interactions across sites. 1/n
doi.org/10.64898/202...

1 month ago 83 35 3 1

@jengreitz.bsky.social & my lab want to co-hire a computational biologist/biostatistician with project management expertise to help map the regulatory code of the human genome and discover genetic mechanisms of disease.

Details below
careersearch.stanford.edu/jobs/computa...

Plz RT

8 months ago 54 59 1 1

In 1965, Margaret Dayhoff published the Atlas of Protein Sequence and Structure, which collated the 65 proteins whose amino acid sequences were then known.

Inspired by that Atlas, today we are releasing the Dayhoff Atlas of protein sequence data and protein language models.

8 months ago 66 28 3 3

please add me too. I work on ML for Plant Biology

9 months ago 1 0 0 0

An assessment of DNA language models concludes:
◼️ They do not offer compelling gains over baseline models

◼️ Their performance is inconsistent, and they require much more compute.

arxiv.org/abs/2412.05430

9 months ago 51 19 1 3

Our structural core gene pipeline Unicore is now published at GBE
📄 doi.org/10.1093/gbe/...

Please also check out @dongwookkim.bsky.social’s
🧵 bsky.app/profile/dong...

10 months ago 44 19 2 0
"A cacao tree with fruit pods in various stages of ripening. Taken on the Big Island (Hawaii) in the botanical gardens."
"Chocolate is created from the cocoa bean. A cacao tree with fruit pods in various stages of ripening."
Photo by Medicaster, Wikimedia

The only reason you love chocolate is because of FUNGUS.

Cacao seeds contain high amounts of polyphenols, making them intensely bitter & unpleasant. There are two natural fungi that do the heavy lifting in turning them into chocolate.

Let's do a quick tour of the process of chocolate making.

10 months ago 493 125 13 17

Three BioML starter packs now!

Pack 1: go.bsky.app/2VWBcCd
Pack 2: go.bsky.app/Bw84Hmc
Pack 3: go.bsky.app/NAKYUok

DM if you want to be included (or nominate people who should be!)

1 year ago 149 58 16 6

AFESM: a metagenomic guide through the protein structure universe! We clustered 821M structures (AFDB&ESMatlas) into 5.12M groups; revealing biome-specific groups, only 1 new fold even after AlphaFold2 re-prediction & many novel domain combos. 🧵
🌐 afesm.foldseek.com
📄 www.biorxiv.org/content/10.1...

11 months ago 141 70 4 4
Leveraging genomic deep learning models for non-coding variant effect prediction
The majority of genetic variants identified in genome-wide association studies of complex traits are non-coding, and characterizing their function remains an important challenge in human genetics. Gen...

Super excited to share our review on genomic deep learning models for non-coding variant effect prediction, with Ayesha Bajwa and Nilah Ioannidis. We’d like this review to be a useful resource, and welcome any feedback, comments, or questions! 1/4

arxiv.org/abs/2411.11158

1 year ago 34 13 1 1
Overview of SAE methodology and representative SAE features revealed through automated activation pattern analysis

Using mechanistic interpretability to steer generations

SAE feature analysis and visualizations reveal features with diverse and consistent activation patterns

Mechanistic interpretability on a protein language model

www.biorxiv.org/content/10.1...

1 year ago 48 15 1 0

Two BioML starter packs now:

Pack 1: go.bsky.app/2VWBcCd
Pack 2: go.bsky.app/Bw84Hmc

DM if you want to be included (or nominate people who should be!)

1 year ago 119 56 10 11
Uncertainty-aware genomic deep learning with knowledge distillation
Deep neural networks (DNNs) have advanced predictive modeling for regulatory genomics, but challenges remain in ensuring the reliability of their predictions and understanding the key factors behind t...

DEGU distills an ensemble of models into a single model, retaining the ensemble’s predictive performance while providing uncertainty estimates, i.e., both epistemic (model) and aleatoric (data) uncertainty.

Led by @zrcjessica

Paper: www.biorxiv.org/content/10.1...

2/n

1 year ago 12 2 1 1
Ultrafast classical phylogenetic method beats large protein...
Amino acid substitution rate matrices are fundamental to statistical phylogenetics and evolutionary biology. Estimating them typically requires reconstructed trees for massive amounts of aligned...

Large protein language models can learn complex epistatic interactions, but how much does that help with predicting variant effects? In this NeurIPS article, we show that classical independent-sites phylogenetic models can outperform pLMs on this task.
1/7
openreview.net/forum?id=H7m...

1 year ago 90 44 2 2

Thrilled to announce Boltz-1, the first open-source and commercially available model to achieve AlphaFold3-level accuracy on biomolecular structure prediction! An exciting collaboration with Jeremy, Saro, and an amazing team at MIT and Genesis Therapeutics. A thread!

1 year ago 610 204 18 25

I tried to make a bioml starter pack. DM if you want me to add or remove you?

go.bsky.app/2VWBcCd

1 year ago 89 39 29 6