Thanks for putting into words what I've struggled to articulate cogently and convincingly. The struggle, the time it takes, the critical thinking required to identify gaps: that's the point of the creative process. Too many folks treat the AI takeover as a foregone conclusion, when it's clearly not.
Posts by Ryan Z Friedman, PhD
A stack of old books and the title "Narratives of AI inevitability"
New post on Unprofessoring. Is there really "No Other Choice" or are we giving in to "narratives of inevitability"? Thanks to a post by @docfreeride.bsky.social for the framing.
Are grad students students or trainees or employees or some yet-to-be-discovered word?
I decided to try out blogging! My first post is a reflection on my time on science Twitter and why I decided to start blogging. I'm going to give Bluesky a second chance, but it's nice to have a space for casual long-form writing.
ryanzfriedman.com/2025/09/06/s...
Thanks so much Tim!
Thanks so much Jacob!
Thank you Alex!!
I first had this idea 6.5 years ago, early on in grad school. This journey has been a long one. I couldn't have done it without the support and guidance from @genologos.bsky.social and Barak Cohen, our collaboration with @corbolab.bsky.social, or help with the modeling and analysis from my coauthors
Our solution is to let your model guide you. By focusing on uncertain sequences and testing them with functional genomic assays, you can *iteratively* train a model. We applied this to understand why the same DNA sequence motif has radically different effects in different contexts.
My thesis work on active machine learning to model regulatory DNA is now out in Cell Systems!
We answer the question: When you can synthesize any DNA sequence you want, how do you decide which ones are worth testing?
www.sciencedirect.com/science/arti...
I posted an editorial yesterday on the need for neurodiversity in science, where I also disclosed my own diagnosis as autistic. I also wrote a Substack post on my journey to the disclosure, which is in this thread too. #actuallyautistic www.science.org/doi/10.1126/...
Many thanks to Yawei Wu, Lloyd Tripp, and Daniel Lyon for their help with these analyses!
The manuscript itself is also restructured. Figs 2 and 4 are swapped, there's a 5th fig for the K562 analysis, and we reworked the Discussion.
Apologies if threading isn't the way to go on Bsky. 🧬
8/8
We analyzed a second pair of sequences with similar motif content. The model correctly predicts that the RORB motif must be 3' of the CRX motif.
These results show our model learns the context that distinguishes functionally non-equivalent motifs.
7/
RORB motifs have a wide range of effects when mutated. Our model predicts this correctly & these effects are correlated with motif affinity.
Along with our other results, this shows active learning generates the data needed to learn regulatory grammars.
6/
We have a new result showing that our model accurately predicts when CRX motifs increase vs. decrease expression. This is crucial because nc variants can change activity in unexpected directions, so it's important to have data that can tell when a motif has a positive vs negative effect.
5/
Our experiments suggest that inactive sequences are low-information training examples. This is important because large libraries derived from random DNA are mostly inactive seqs. We think iteratively training models on smaller but more informative training data is more effective
4/
When we did many rounds, active learning was more efficient, approached the upper bound with less data, and enriched for positive examples!
This demonstrates that active learning is broadly effective and illustrates that enriching for active sequences is more informative.
3/
We tested active learning in a second system using Nadav Ahituv and @jshendure.bsky.social's genome-wide MPRA in K562s. We downsampled the data, trained a CNN, then sampled from the remaining data. Active learning consistently outperformed random sampling across many starting conditions.
2/
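For anyone who wants a concrete picture of "sample from the remaining data," here is a minimal sketch of one common way to pick the next batch: rank the unlabeled pool by predictive entropy and take the most uncertain sequences. This is an illustration only. The entropy criterion and the names `select_uncertain` and `select_random` are assumptions for the example, not the acquisition function or code from the manuscript.

```python
import numpy as np

def select_uncertain(probs, k):
    """Return indices of the k pool items with the highest predictive entropy.

    probs: array of shape (n_pool, n_classes); each row is a predicted
    class distribution from the current model (hypothetical input).
    """
    probs = np.asarray(probs, dtype=float)
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    # argsort is ascending, so take the last k and reverse: most uncertain first
    return np.argsort(entropy)[-k:][::-1]

def select_random(n_pool, k, seed=0):
    """Baseline: draw k distinct pool indices uniformly at random."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_pool, size=k, replace=False)

# Toy pool of four sequences with predicted (inactive, active) probabilities.
pool_probs = np.array([
    [0.98, 0.02],
    [0.55, 0.45],
    [0.10, 0.90],
    [0.50, 0.50],
])
picked = select_uncertain(pool_probs, k=2)
baseline = select_random(len(pool_probs), k=2)
```

In an iterative setup, the picked sequences would be measured, added to the training set, and the model retrained before the next round of selection.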
We substantially revised our active learning manuscript. A brief summary of what's new.
TLDR: several new analyses, benchmarking w a 2nd MPRA dataset, and a refocused argument on active learning to leverage the capacity of MPRAs to generate large datasets.
www.biorxiv.org/content/10.1...
1/8
🧬
A project I contributed to during grad school is now up at PLoS Comp Bio! We used the MAVE-NN package by @jbkinney.bsky.social's group to learn about the behavior of synthetic regulatory elements 🧬 Keep an eye on @genologos.bsky.social's Twitter for more details.
journals.plos.org/ploscompbiol...
I should note that I set up my folders somewhere around my third year of grad school and haven't meaningfully reorganized them since then, so I'm open to a complete overhaul.
I want to reorganize my @paperpile.bsky.social. I have folders covering broad topics with very few subfolders. Tags are for different manuscripts.
Is there anyone in genomics 🖥️🧬 who wants to share how they organize their reference managers? I'm looking for more meaningful and detailed categories.
Next time I meet a techie asking how to move into compbio I'll connect them with you!
Yeah, I struggle with finding a polite way to say "go learn a bunch of biology or take a huge pay cut to be an entry level bioinformatician for a few years" but there is definitely a mode of thought among some techies that they can watch 20 hours of YouTube videos and call it good.
I've had several software engineers ask me recently about transitioning into genomics/comp bio from tech. I never know what to say -- everyone I know in the field in industry went to school for biology/comp bio. Anyone have any suggestions, either of what to say or concrete resources to provide? 🧬🖥️
Agreed that there is strong cell to cell variability. But changes in e.g. the ZRS enhancer of Shh can cause loss of limbs or gain of extra digits. Some of that is due to changes in Shh in space/time, but some is also due to how *much* Shh is produced
Hmm... A difference of 2- vs 3-fold change can definitely matter! Over/underactive CREs can cause developmental defects and disease. Agreed that cell culture won't tell you space+time, but it can tell you how sequence features encode activity.
My expertise is also in distal elements, but I think that diffs in DBDs reflect different points in evolution. TFIIB and the sigma factors are homologous. But most DBD families in euks are totally absent from proks and vice versa! Suggests txn initiation is more conserved than specifying space-time.
I'm not entirely sure, but my sense is that GTFs have structural classes of DBDs you don't see binding to distal regions