Look for this manuscript to be online within the next couple of weeks on bioRxiv! Also, if you want to try out the data, let me know, as we are testing a new data packaging format.
Posts by Jeff Vierstra
These data correspond to >1,000 individuals and we performed phased genotyping and allelic imbalance analysis with regulatory genomes (millions of SNVs). We have also uniformly processed and integrated >10,000 publicly available ATAC-seq datasets.
We are finally putting the finishing touches on an operationally complete mapping of regulatory DNA via DNase I in both human (>4,000 samples) and mouse. To interact with the samples we created this neat browser interface complete with a chatbot!
Check out this CBS news segment about me. I may be the first person in the world to be receiving a preventive ASO therapy for inherited familial ALS and it appears to be working! Still things to be hopeful for despite the absolute dumpster fires occurring all over the world.
youtu.be/1BdZb67w43s
We have also created a data portal complete with a coding chatbot to access the raw and processed data (>4,000 reference DNase I and >10,000 ATAC-seq samples). Coming soon to bioRxiv!
Using only training data from >4,000 human reference DNase I datasets we can also predict mouse DNase I. Also, it's readily tunable with the enormous amount of publicly available ATAC-seq data deposited to SRA.
Exciting results! We developed a single, generalizable ML model that can predict chromatin accessibility across any arbitrary cell type using a sample-intrinsic and portable embedding. Notably, it works on samples generated over a 15 yr time interval with different technologies & methodologies.
I am well aware that one can find a sentence in the "Acknowledgments" section, but that seems insufficient given that their whole project is only possible because of these resources.
Surprised (but also not that surprised) that the AlphaGenome paper didn't officially cite any of the primary data used for training their model (see Fig. 1, thousands of datasets made with tremendous time and effort over >15 yrs). What's up with that @nature.com?
www.nature.com/articles/s41...
Getting up-close and personal with the Patagonian fjords near the Beagle Channel.
SPrUCE: Utilizing Ultraconserved Elements of DNA for Population-Level Genetic Diversity Estimation www.biorxiv.org/content/10.1101/2025.11....
Double full rainbow at the end of the world at Cape Horn, Chile 🇨🇱
With all the wild stuff going on in the States (and the world) I am going to escape reality for a while on a sailing trip to the end of the world around Cape Horn and the Beagle Channel (named after the HMS Beagle of Charles Darwin and Robert Fitzroy fame). Thinking this might be type 2 fun...
🥁This Wednesday, in the #FragileNucleosome seminar, we are excited to host @hannahlong.bsky.social and @jeffvierstra.bsky.social to tell us about the amazing work they are doing!
🗓️Register here for upcoming session and the entire series:
us06web.zoom.us/webinar/regi...
Southern fjords of Chile on a boat.
It’s been a pleasure to organize the Rules of Protein-DNA Recognition meeting in Cancun. Spectacular talks and an amazing and supportive scientific community!
Great resource! I should mention (since it's not on the website) that all of the chromatin accessibility data (DNase I) was generated at the UW & Altius Institute over the course of >15 years. The proper references for these data are: www.nature.com/articles/nat... and www.nature.com/articles/s41....
Please apply to our tenure-track faculty position at @stanford-chemh.bsky.social! We are searching for a new colleague working at the interface between computation and molecular sciences. See post below and pls forward widely!
chemh.stanford.edu/opportunitie...
Looks like a great couple of months of seminars! Come check out my talk on November 5th if you want to learn about our progress in mapping the nucleotide-resolved structure and function of cis-regulatory DNA elements across thousands of cell types and states.
Wild to see a thread about me. I think the broader topic (as Jason points out) is what does the future of preventive medicines look like for at-risk gene carriers? I also hope this gives some hope to those dealing with devastating and (previously) unactionable inherited genetic diseases.
Does one sample (or even 10) suffice to define core cell type regulatory elements? NO! Because of both biological and technical variability you need to profile many (typically >15). The additional peaks are enriched for trait-associated variants, so you miss a lot of possibly important signal.
Look at this and tell me I am wrong: DNase I footprinting data is unparalleled in genomics. ~700 high-quality datasets for an upcoming ENCODE data drop.
We also showed in a 2015 manuscript that the RREB1 site CCCCCACCC also has a modest effect on HbF reactivation.
Activity-determining nucleotides on the BCL11A +58 enhancer according to an ML model built purely on DNase I data from thousands of cell types (this is just the prediction for erythroid cells). Not bad w.r.t. functional data. The GATA1 site is the therapeutic target of Casgevy for SCD and B-thal.
For some reason I was re-reading the DESeq2 paper and was reminded of what a statistical masterpiece that method is. Every time I read the paper I seem to learn something new. Not too many papers clear that bar (at least for me).
Dear Dr. Bhattacharya: I have listened to your performance on the “War Room” podcast with Steve Bannon. The segment begins with a discussion of Secretary Kennedy’s recent cancellation of $500M worth of contracts related to mRNA vaccines. You say “You can’t have a platform where such a large percentage of the population distrusts the platform as we use it for vaccines and expect it to work.” Later, you make comments that might explain why a large segment of the population distrusts this platform. For example, you say that the vaccine was not protective against contracting COVID and cite your own case of COVID after being vaccinated. However, you fail to cite the evidence from the clinical trials that led to the Emergency Use Authorization.
I listened to Bhattacharya on Steve Bannon's "War Room" podcast.
If you want to know how it went, see the following email that I just sent.
1/13
Jeff Vierstra was likely doomed by his DNA. A radical experiment gave him a chance to rewrite his fate — before ALS symptoms ever began.
www.statnews.com/2025/07/28/a...
hotspot3: our chromatin accessibility peak caller is now a package – "pip install hotspot3" to try it out.
You might know that my life mostly revolves around skiing. I am organizing a 25-day sail & ski trip to Antarctica in Dec. 2025 and have space for 1-2 more people. We leave from Ushuaia, AR on the Tierra del Fuego (early Dec.). DM me for details and pass this around if you know anyone interested!
Also, all the credit goes to @sboytsov.bsky.social and @sabromav.bsky.social.