This work is a four-year journey that would not have been possible without my amazing partners @mailejim.bsky.social, Mounica Vallurupalli, and Kai Cao, and our other co-authors. And thank you to my PI Fei Chen, the Golub Lab, @broadinstitute.org, and @harvardmed.bsky.social for your support! (10/10)
Posts by Dawn Chen
Our work shows that RNA splicing can be harnessed as a new modality for cell type-specific gene regulation, unlocking new possibilities for gene therapy and precision medicine. It also highlights how high-quality experimental data powers AI models. (9/10)
We further applied SPICE to design sequences that specifically target cancer cells carrying splicing factor mutations. We identified synthetic sequences that are selectively spliced only in cells with RBM5 or RBM10 mutations, which are commonly seen in lung cancers. (8/10)
To go beyond prediction, we built Melange, a generative model that designs new sequences with programmed cell type-specific splicing. We experimentally validated that Melange can design sequences that splice only in neural-lineage cells, and not cells from other lineages! (7/10)
Using this dataset, we built Soma, a deep learning model that predicts how any RNA sequence will splice across cell types.
We drew inspiration from ChromBPNet (@anshulkundaje.bsky.social) to encode cell type-specific information directly from gene expression data. (6/10)
We profiled 46,000+ sequences across 43 cell lines spanning 10 lineages – the most diverse set of cell types ever tested in an MPRA (>10x more than previous studies) – and uncovered widespread cell type–specific splicing events. (5/10)
Good ML models require high-quality experimental data, but large-scale, cell type-resolved data on RNA splicing is limited. Hence, we built SPICE (Splicing Proportions In Cell types), an integrated experimental + AI framework for the generative design of cell type-specific RNA sequences. (4/10)
Beyond existing cell type targeting strategies like promoters, enhancers, or viral vectors, we asked: can we design RNA sequences that use alternative splicing, an underexplored biological process, to control cell type-specific gene expression? (3/10)
Every cell in your body contains the same DNA, but different cell types – like neurons or cancer cells – perform vastly different functions. Being able to turn genes on or off in specific cell types is key for understanding biology and for building precise, safe gene therapies. (2/10)
Announcing our new preprint! We built SPICE, a framework that combines large-scale experiments and generative AI to design RNA sequences that control cell type-specific gene expression using alternative splicing – a powerful, underexplored modality.
Preprint: www.biorxiv.org/content/10.1...
(1/10)
Thank you to the award committee, my PI Fei Chen and my lab, all my mentors, collaborators, family and friends for your support!
1/ In two back-to-back papers, we present our de novo TRACeR platform for targeting MHC-I and MHC-II antigens
TRACeR for MHC-I: go.nature.com/4gcLzn5
TRACeR for MHC-II: go.nature.com/4gj5OQk
So at the beginning of my postdoc I struggled to amplify Chlamy genes (easily 75%+ GC), did pretty much the same comparison, and found the same results:
DMSO= helps but kills pol processivity
Betaine= Lots of spurious products
1,2-propanediol= processivity retained and usually specific product
I took biochem in 2001, and for nearly 20 years read amino acid sequences daily… and I never knew Dayhoff named them or even the logic behind things like Q until last Friday (h/t Mike Janech). Also, this is another big Dayhoff moment for me. She was incredible!
#proteomics #bioinformatics
Did you know there’s a site to search for starter packs of people to follow? Built by @mubashariqbal.com
blueskydirectory.com/starter-pack...
Nah you're still a cow! Mooooo.
Congratulations to the team for this work! An innovative way to use #CRISPR 🧬 ! This study uses CRISPR-gRNA to barcode small extracellular vesicles, enabling subpopulation-specific analysis of their biogenesis and release. A groundbreaking tool for studying sEV heterogeneity!
#MolecularBiology
Here’s my spreadsheet of starter packs (>80!) broadly related to ‘mechanistic biology’ plus some intriguing extras
Complete with collective nouns
I’ve been tracking these but now can’t keep up
Hope it’s helpful
2/2
docs.google.com/spreadsheets...