New preprint! The same TFs can drive distinct regulatory programs depending on where they bind.
TSS → rapid stress responses
Intronic & upstream → cell-type programs
Enhancer-like CRMs → embryo/meristem programs
Coding-sequence binding → repression
www.biorxiv.org/content/10.6...
Posts by sharon greenblum
our new preprint is out! if a transcription factor binds near a gene, does it matter exactly where?
keep an eye out for a 🧵soon!
Sequencing methods 🧬 have come a long way from census counts 🔢 to assembling genomes and understanding the functions 🧪 of bacteria and viruses.
New review by @jgi.doe.gov's Gitta Szabó, @emileyeloe-fadrosh.bsky.social, Tanja Woyke with @jeffinerca.bsky.social
www.nature.com/articles/s41...
How do populations maintain an evolutionary memory? I am happy to share that our work with Dmitri Petrov @petrovadmitri.bsky.social, Paul Schmidt, and colleagues on dominance reversal and stabilization of insecticide resistance in changing environments over time is now published at Nature EE.
Do YOU use data from across multiple sources? (See list in link below)
We want to know:
✅HOW you gather comprehensive data for the same sample;
✅WHERE you look; and
✅WHAT OBSTACLES you face
Sign up by 8/31 to share: jointgeno.me/UserResearch2025
@berkeleylab.lbl.gov @nigelmouncey.bsky.social
Graphic announcing a Computational Postdoctoral Fellow position. Text: Computational Postdoctoral Fellow – Plant Genomic Language Models (gLMs). JGI is seeking a Computational Postdoctoral Fellow to join a funded project at Berkeley Lab’s Joint Genome Institute applying genomic language models (gLMs) to unlock new biological insights in plants. Join one of the world’s leading genomics facilities to pioneer AI applications in plant biology, with access to cutting-edge computational resources and diverse genomic datasets. This project will focus on developing and exploring applications of gLMs to plant genomes, from identifying novel regulatory elements to revealing evolutionary patterns across species, with particular interest in cross-species and data-limited scenarios. In this role, you will design tasks and datasets to test and extend gLM capabilities, apply models to generate new biological hypotheses, and share findings through high-impact publications and presentations.
Postdoc opportunity with @tomasbruna.bsky.social here at @jgi.doe.gov @berkeleylab.lbl.gov:
Develop, benchmark, and apply Plant Genomic Language Models (gLMs) 🌿🧬💻
jobs.lbl.gov/jobs/computa...
#AI #PlantGenomics #gLMs #Postdoc
these are all real DAP-seq peaks!
specifically, peaks in the promoter of SWEET11 (a sugar transport gene) and its orthologs across 10 plant species
the darker colored peaks are binding sites for transcription factors that have maintained regulation of SWEET11 across 100+M years of plant evolution
If you are interested in using DAP-seq for your plant, algal, fungal or microbial genomes, consider applying to one of @jgi.doe.gov's user programs:
jgi.doe.gov/work-with-us...
Final note - all of this data is publicly available! See our GEO ‘superseries’ page here: www.ncbi.nlm.nih.gov/geo/query/acc.cgi and please reach out if you have questions or trouble finding anything. We’re excited to see what you uncover!
This project was a huge team effort with five(!) co-first authors, including JGI scientists @leobaumgart.bsky.social, @abmora.bsky.social, Peng Wang, and Yu Zhang each playing crucial roles, and @omalley-regulome.bsky.social at the helm.
An amazing group!
The Sudmant lab at UC Berkeley is seeking a postdoc to work on a fully funded NIH project to understand differences in DNA repair and somatic mutation across the primate tree of life. Please share widely with anyone who may be interested: aprecruit.berkeley.edu/JPF05052
And on a practical level, tracking TF activity instead of the expression of individual genes gives us a quantitative, big-picture way to compare cell types within a species, as well as across both closely and distantly related lineages.
Take-home!
DAP-seq + snRNA-seq
🤜 🤛
Integrating these data types with the framework of comparative genomics is an incredibly powerful way to understand gene regulation, giving us a window into why genes are expressed where they are, and how TF regulons get rewired to enable novelty.
But we also found cases where TFs became active in new cell types, and added hundreds of new target genes along the way. We even found a cool example of evolution in action, where the balance of regulatory power seems to be switching between MYB and NAC TFs in xylem.
First - we saw that plenty does stay the same. We could often recognize a sorghum cell type as the best match to a brassica cell type simply by looking at its TF activity profile.
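Matching cell types across species by TF activity profile, as described in this post, could be sketched roughly as follows. This is only an illustration: the use of Pearson correlation, the function names, the TF labels, and the toy numbers are all assumptions, not the paper's method, and it presumes the same (orthologous) TFs have been scored in both species.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def best_matches(query, reference, tfs):
    """For each query cell type, return the reference cell type whose
    TF activity profile correlates best with it.
    query / reference: dict cell_type -> dict tf -> activity score."""
    matches = {}
    for q_type, q_profile in query.items():
        q_vec = [q_profile[tf] for tf in tfs]
        matches[q_type] = max(
            reference,
            key=lambda r: pearson(q_vec, [reference[r][tf] for tf in tfs]),
        )
    return matches

# Hypothetical toy data: activity of three TFs in two cell types per species.
tfs = ["MYB", "NAC", "WRKY"]
brassica = {
    "xylem": {"MYB": 5.0, "NAC": 4.0, "WRKY": 0.0},
    "epidermis": {"MYB": 0.0, "NAC": 1.0, "WRKY": 6.0},
}
sorghum = {
    "sorghum_xylem": {"MYB": 4.0, "NAC": 5.0, "WRKY": 1.0},
    "sorghum_epidermis": {"MYB": 1.0, "NAC": 0.0, "WRKY": 5.0},
}
matches = best_matches(sorghum, brassica, tfs)
print(matches)
```

With real data, the profiles would have one entry per TF, and ties or weak correlations would need more careful handling than a plain `max`.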
But we weren’t done yet! Not all functional binding sites are conserved forever, right? Otherwise how do we get novelty? We next jumped across the tree to the bioenergy grass Sorghum and used the same approach to track TF activity, this time with binding sites conserved in its grass relative, rice.
Many TF-cell type relationships reflect long-standing functional knowledge. For example, MYB107 target genes lit up suberized endodermis, aligning with the TF's known role in suberin synthesis. Others were entirely new. This opens doors for creative ways to describe and even manipulate cell types.
We now had a robust and powerful framework for tracking TF activity. Focusing only on expression of target genes with conserved binding sites, we could infer where each TF was active. Mapping out the active TFs in each cell type gave us the big-picture view of gene expression we’d been waiting for.
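One simple way to picture "TF activity" is as a summary of target gene expression per cell type. The sketch below uses a plain mean over each TF's conserved targets; the averaging choice, the function name, the gene IDs, and the toy values are assumptions for illustration, since the post does not spell out the scoring used in the paper.

```python
from statistics import mean

def tf_activity(expr, targets):
    """Score each TF in each cell type as the mean expression of its targets.
    expr: dict cell_type -> dict gene -> expression value
    targets: dict tf -> set of target genes (e.g. those with conserved binding sites)
    returns: dict tf -> dict cell_type -> activity score"""
    activity = {}
    for tf, genes in targets.items():
        activity[tf] = {}
        for cell_type, profile in expr.items():
            values = [profile[g] for g in genes if g in profile]
            # A TF with no measured targets in this cell type scores 0.0.
            activity[tf][cell_type] = mean(values) if values else 0.0
    return activity

# Hypothetical toy data: pseudobulk expression for two cell types.
expr = {
    "endodermis": {"g1": 4.0, "g2": 6.0, "g3": 0.0},
    "xylem": {"g1": 1.0, "g2": 1.0, "g3": 8.0},
}
targets = {"TF_A": {"g1", "g2"}, "TF_B": {"g3"}}
activity = tf_activity(expr, targets)
print(activity)
```

Here `TF_A` scores highest in endodermis and `TF_B` in xylem, which is the kind of TF-to-cell-type map the post describes.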
Sure enough, the most conserved binding sites showed the highest correlation between TF and target gene expression, and marked genes with the most cell type-specific expression patterns in all 4 species. Makes for a strong case that conservation is a good marker of binding site importance.
So next we generated snRNA-seq atlases for 3 different tissues (seedling, leaf, and flower) of each of the 4 Brassica species. Again, lots of data. But again, multiplexing helps! We found that extracting and profiling nuclei from all species together made for an easier protocol and cleaner data.
Binding sites shared across all 4 species had all the hallmarks of being ‘functional’ - they had lower within-species nucleotide diversity, higher chromatin accessibility, and were near functionally-related genes. But to be sure they impact expression - maybe we should look at expression data?
We tested this theory with 4 related species from the Brassica family. For every TF binding site in A. thaliana, we asked how many of the other species had a binding site for the same TF near an orthologous gene.
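The conservation question in this post can be sketched as a simple count. This is a minimal illustration, assuming binding sites have already been assigned to their nearest genes and that an ortholog table is available; the function name, species labels, and gene IDs are hypothetical.

```python
def conservation_counts(ref_sites, other_sites, orthologs):
    """Count, for each (TF, gene) pair in the reference species, how many
    other species have a site for the same TF near an orthologous gene.
    ref_sites: set of (tf, gene) pairs in the reference species
    other_sites: dict species -> set of (tf, gene) pairs
    orthologs: dict species -> dict ref_gene -> set of ortholog gene IDs"""
    counts = {}
    for tf, gene in ref_sites:
        n_species = 0
        for species, sites in other_sites.items():
            candidates = orthologs.get(species, {}).get(gene, set())
            if any((tf, ortholog) in sites for ortholog in candidates):
                n_species += 1
        counts[(tf, gene)] = n_species
    return counts

# Hypothetical toy data: one reference species and two relatives.
ref_sites = {("MYB107", "AT1"), ("NAC1", "AT2")}
other_sites = {
    "sp2": {("MYB107", "B1")},
    "sp3": {("MYB107", "C1"), ("NAC1", "C9")},
}
orthologs = {
    "sp2": {"AT1": {"B1"}, "AT2": {"B2"}},
    "sp3": {"AT1": {"C1"}, "AT2": {"C2"}},
}
counts = conservation_counts(ref_sites, other_sites, orthologs)
print(counts)
```

In this toy case the `MYB107` site is shared with both relatives, while the `NAC1` site is not counted in `sp3` because the bound gene there is not an ortholog.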
Second challenge: while DAP-seq finds all possible binding sites, not all actually impact expression of a nearby gene. Which binding sites matter? For that we turned to comparative genomics. We reasoned that binding sites that stuck around during evolution are probably there for a reason.
If that sounds like a lot of data - it is. The first challenge was profiling all those TF binding sites - our team previously developed multiDAP-seq (www.nature.com/articles/s41...), which applies the in vitro DAP-seq method to a pool of genomes. Here we optimized it for bigger eukaryotic genomes.
Single-nucleus RNA-seq reveals which genes are ‘on’ where, but behind the scenes it’s transcription factors directing the show. Here, we profiled binding of 100s of TFs in 10 plant genomes, plus a suite of new snRNA-seq atlases - to ask: does tracking TF activity help simplify and compare snRNA-seq data?
August’s Nature Plants cover story shows how we integrated large-scale multiDAP and snRNA data to reveal drivers of cell type identity and evolution in flowering plants. www.nature.com/nplants/volu... We packed a lot into this paper! Here’s a single-cell spin on what we found:
Excited to share a new preprint w/ the Sonnenberg lab, led by Matt Carter, @zzzhiru.bsky.social & @mattolm.bsky.social. We analyzed the microbiomes of two non-industrialized populations from opposite sides of the globe to try to reconstruct the recent evolutionary history of our gut microbiota.
I am seeking a postdoc for my group at UCLA. We work at the intersection of population genetics x microbiome (garud.eeb.ucla.edu). If interested, please message me!
How is functional variation at large-effect loci maintained in natural populations?
Thrilled to share our work showing how beneficial dominance reversal helps fruit flies maintain a resistance polymorphism as selection varies in their environment! A thread 🧵 1/n
www.biorxiv.org/content/10.1...
The Dark Energy Spectroscopic Instrument (DESI) scientists mapped how nearly 6 million galaxies cluster across 11 billion years of cosmic history. Their observations line up with what Einstein's theory of general relativity predicts. 🧪
newscenter.lbl.gov/2024/11/19/n...