
Posts by Stein Aerts

Happy to share our new preprint on non-coding genetic variation in the human brain and Parkinson's disease. Great team effort with @alexanrna.bsky.social, @juliedeman.bsky.social, Koen Theunis, and all co-authors, supervised by @steinaerts.bsky.social and @jdemeul.bsky.social.
Thread below:

5 days ago 24 7 1 0

Very proud of this and so cool that enhancer-level models can predict the effect of genetic variation. There is so much personal variation in terms of gene regulation in the human brain, it is fantastic to uncover this thanks to technology (whole-genome sequencing and single-cell multiomics) and AI

5 days ago 15 4 0 0
Evolutionary transfer learning enables organism-wide inference of mammalian enhancer landscapes
Understanding and modeling how the human genome encodes gene regulatory programs for thousands of cell types remains a central challenge in genomics and machine learning. However, most human cell types emerge during embryonic, fetal, and pediatric development, which are inaccessible to comprehensive molecular profiling. To overcome this, we hypothesized that the mismatch in evolutionary rates between cis-acting enhancers (fast) and the trans-acting regulatory programs that interpret them (slow) creates an opportunity for ‘evolutionary transfer learning’. Specifically, models trained to predict cell type-specific enhancers in one species should generalize to the orthologous cell types and enhancers of related species. To test this, we generated a single-cell atlas of chromatin accessibility spanning mouse embryonic day 10 (E10) to birth (P0). Using combinatorial indexing [1], we profiled 3.9 million nuclei from 36 staged embryos, resolving genome-wide accessibility in 36 cell classes and 140 cell types. With the goal of identifying distal enhancers for all cell classes, we trained a series of multi-output deep learning models (CREsted [2]), each addressing limitations of the preceding approach. An ‘evolution-naive’ model achieves strong performance on held-out peaks, but exhibits two failure modes during genome-wide inference: overprediction at tandem repeats and conflation of promoter and distal enhancer grammars. An ‘evolution-aware’ model resolves these by regrouping accessible regions based on functional coherence across syntenic orthologs, but fails to generalize across species, suggesting insufficient sequence diversity during training.
Finally, STEAM (Synteny-aware Transfer learning for Enhancer Activity Modeling), our ‘evolution-augmented’ model, expands the training corpus to include enhancer orthologs from up to 241 mammalian genomes (Zoonomia [3]) in a synteny-supervised manner. This increases the effective data scale by up to 195-fold, markedly improving generalization across mammals despite greater label noise. We apply STEAM to predict enhancers for all major developmental lineages throughout the human, mouse (HumMus) and 239 additional mammalian genomes [3] (BabaGanoush), i.e. 32 × 241 = 7,712 genome-wide enhancer tracks. Together, our results unify advances in single-cell profiling, deep learning, and comparative genomics into a framework for the evolutionary transfer learning of noncoding regulatory grammars. More broadly, our work supports the view that model organisms and evolutionarily diverse genomes are indispensable resources for accelerating the AI-enabled exploration of human biology.

In addition to the bioRxiv, this is also a pilot for a new interactive preprint developed by @curvenote.com w/ support from @hhmi-science.bsky.social, including directly embedded Jupyter notebooks for fig reproduction, data, models, prediction tracks, code, etc.
shendure.curve.space/articles/evo...

1 week ago 20 2 0 0

New preprint @cxqiu.bsky.social @jshendure.bsky.social ! Can we learn regulatory grammars of human cell types — by training on mouse development and transferring across 241 mammalian genomes? Introducing STEAM & a whole-organism scATAC-seq atlas from E10 to birth.
www.biorxiv.org/content/10.6...

1 week ago 48 26 1 2

Latest from Shendure & Qiu labs (@cxqiu.bsky.social)! We combined a new 4M cell mouse whole embryo scATAC-seq atlas (E10-P0), millions of 'evolutionarily coherent' orthologs from 241 mammalian genomes (Zoonomia), and the CREsted CNN framework (@steinaerts.bsky.social).

1 week ago 39 16 1 0
Group leader vacancy We are looking for a Group Leader in applied artificial intelligence in (bio)medical research at VIB.AI & UGent, Belgium

We launched a new Group Leader vacancy in our Center for AI & Computational Biology - VIB.AI @vibai.bsky.social - with a Professorship at Ghent University. Join us with your most creative AI+Biology research plan! Apply before 31st May vib.ai/en/group-lea...

1 week ago 22 15 0 0

CREsted is finally out! You can find the article, together with a summarizing Research Briefing, in thread. 🦎

1 week ago 27 14 1 1
A toolkit for modeling cis-regulatory logic of enhancers at large scale - Nature Methods Deciphering the genomic regulatory code driving cell type-specific gene regulation has been a research quest for decades. We present CREsted, a software package that provides data-driven insights into...

Read the associated Research Briefing here:

www.nature.com/articles/s41...

2 weeks ago 8 1 0 0
CREsted: modeling genomic and synthetic cell-type-specific enhancers across tissues and species - Nature Methods CREsted is an efficient and user-friendly toolbox for analysis, modeling and design of cell-type-specific enhancers across diverse species.

CREsted: an efficient and user-friendly toolbox for analysis, modeling and design of cell-type-specific enhancers.

www.nature.com/articles/s41...

2 weeks ago 15 3 1 2

The @steinaerts.bsky.social lab published CREsted, an end-to-end modeling framework to
🧬 Train sequence-based enhancer models on large sc datasets
🔍 Decode enhancer logic with nucleotide-level interpretability
⚙️ Design synthetic enhancers with cell-type specificity
https://tinyurl.com/ypurmrw5

2 weeks ago 12 5 0 0

Full house today for the Methusalem BioMedAI kickoff!

The labs of @steinaerts.bsky.social, @joanampereira.bsky.social, @ppjgoncalves.bsky.social & Maarten De Vos came together to launch this long-term research program on explainable and generative AI for biomedical discovery.

Let's go!

1 month ago 7 2 0 0

The @steinaerts.bsky.social lab is looking for a postdoctoral researcher to develop next-generation sequence-to-function models for glioblastoma, one of the most aggressive brain cancers.

More info & how to apply 👉 https://vib.ai/en/opportunities#/job-description/130090

2 months ago 6 11 0 0

Last summer I spent 4 months working at the @alleninstitute.org as a Visiting Scientist. Recently we released some preprints about the work we collaborated on, in which we used new multiome atlases of CNS regions to decipher the underlying enhancer logic with CREsted (among many other things). (1/n)

2 months ago 17 3 1 0
Spark

Introducing IZIKAI. ✨

My son, Juul Aerts, is on vocals and piano, and the band just dropped their debut single, "Spark."

🎧 Listen to "Spark" here: open.spotify.com/track/7D8KxZ...
📸 Follow their journey: www.instagram.com/izikai__/

#IZIKAI #ProudDad

2 months ago 4 0 0 0
The evolution of gene regulation in mammalian cerebellum development Gene regulatory changes are considered major drivers of evolutionary innovations, including the cerebellum’s expansion during human evolution, yet they remain largely unexplored. In this study, we com...

Outstanding @science.org study on the evolution of gene regulation shaping
#cerebellum development 🧪🧠🧬
@ioansarr.bsky.social @marisepp.bsky.social @tyamadat.bsky.social @steinaerts.bsky.social @kaessmannlab.bsky.social
www.science.org/doi/10.1126/...

2 months ago 30 18 1 2

Big congrats to the entire Kaessmann lab for this spectacular achievement and beautiful insights. It was a great honour to contribute to this study and to host Ioannis in our lab, an absolutely brilliant scientist. Evolution of genomic enhancers controlling neuronal cell types is just too cool..

2 months ago 18 1 2 0
Evaluating single-cell ATAC-seq atlasing technologies using sequence-to-function modeling - Nature Communications Generating high-quality training data for machine learning is costly. Here, authors include sequence-to-function modeling in benchmarking of custom and commercial droplet-based scATAC platforms, and r...

Paper alert! 💻 How many cells do you need to train reliable deep learning models in regulatory genomics? We asked how data quality, sequencing depth, and dataset size affect training of sequence-to-function models from scATAC-seq. Out now www.nature.com/articles/s41...
(details below)

2 months ago 31 15 2 1

Hydrop-v2 is now published! It allows generating cheap scATAC-seq training data for enhancer modeling with CREsted. Make sure to check out the 600K cell atlas of the last 4 hours of Drosophila embryo development. Fun to use bioML for technology benchmarking :)

2 months ago 9 0 1 0
VIB-KU Leuven Center for Neuroscience (YouTube video by VIB-KU Leuven Center for Neuroscience)

🚀 Proudly introducing the VIB-KU Leuven Center For Neuroscience, a merger of the two former VIB research centers VIB-KU Leuven Center for Brain & Disease Research and Neuro-Electronics Research Flanders (NERF)! Our new motto: Bold Science, Real Impact.

www.youtube.com/watch?v=uhaq...

2 months ago 15 7 0 1

New preprint from the lab and wonderful work by Seppe de Winter:
System-wide extraction of cis-regulatory rules from sequence-to-function models in human neural development
www.biorxiv.org/content/10.6...

3 months ago 13 4 0 0
tSNE dimensionality reduction of facial mesenchyme TF-MINDI seqlets colored based on TF-family. The coordinator instances are circled and an arrow drawn to a PCA of those coordinator instances colored based on coordinator motif score. This shows that TF-MINDI captures multiple coordinator affinities. For each affinity bin a TF binding motif logo is shown.


To test the sufficiency of the TF-MINDI extracted enhancer code rules we turn to synthetic enhancer design in facial mesenchyme cells. A homeobox-ebox dimer motif (Coordinator) has been shown to be instrumental for this cell type. TF-MINDI identified Coordinator instances at varying affinities.

3 months ago 1 1 1 0
A large tSNE dimensionality reduction showing PBMC TF-MINDI seqlets colored based on TF-family. This is surrounded by four smaller tSNE dimensionality reductions colored based on TF-ChIP-seq Z-score, showing specific enrichment of TFs in TF binding sites annotated to the family of that TF. Bottom right shows a ROC curve comparing TF-MINDI based prediction of ChIP-seq signal with motif enrichment based prediction (cisTarget). This shows that TF-MINDI is more accurate.


We validate the TF-MINDI instances using ChIP-seq data in PBMC, showing that TF-MINDI is more accurate than traditional motif enrichment analysis tools.

3 months ago 1 1 2 0

TF-MINDI is out! A new method to learn cis-regulatory codes through rich embeddings of TF binding sites. TF-MINDI decomposes motif neighbourhoods, and works downstream of any sequence-to-function deep learning model. We deeply study the enhancer code in human neural development, check out the thread

3 months ago 60 38 1 0
System-wide extraction of cis-regulatory rules from sequence-to-function models in human neural development The genomic cis-regulatory code (CRC) underlies spatiotemporal specificity of gene expression. While sequence-to-function (S2F) models can accurately encode the CRC of transcriptional enhancers, decod...

Check out the preprint: doi.org/10.64898/202... and the TF-MINDI package: github.com/aertslab/TF-MINDI. With @lukasmahieu.bsky.social ’s help this has become an amazing and user-friendly package, please give it a try and provide feedback.

3 months ago 10 2 0 0
Figure showing four panels. Top left: TF-MINDI logo (pink background and yellow letters), showing the text: "Transcription Factor Motif Instance Neighborhood Decomposition and Interpretation". Top right: TF-MINDI workflow. 1. Seqlets are called (showing nucleotide-level contribution scores and seqlets as blocks of nucleotides with high contribution). 2. Seqlets are embedded (showing, for each seqlet, a representation of a vector as a heatmap). 3. Seqlets are clustered and annotated (showing a schematic representation of a dimensionality reduction with seqlets colored based on TF-families as well as TF binding motif logos). Bottom left: tSNE dimensionality reduction of organoid seqlets colored based on TF family. Bottom right: similar tSNE dimensionality reduction for embryo seqlets.


To obtain high-dimensional embeddings of S2F-identified motifs, annotate TFBS across cell-type specific peaks, and model TFBS co-occurrences, we developed a new Python package named TF-MINDI. This resulted in > 400k annotated TFBS instances across the genome (each dot in the tSNE below is one instance).

3 months ago 6 2 1 0
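
The first workflow step mentioned in the figure description (calling seqlets as blocks of nucleotides with high contribution) can be illustrated with a toy sketch. This is not the TF-MINDI API: the function name `call_seqlets`, the thresholding rule, and the `threshold` and `min_len` parameters are assumptions made up for illustration, and the scores are invented toy values.

```python
from typing import List, Tuple


def call_seqlets(
    scores: List[float], threshold: float, min_len: int = 3
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) spans where |score| >= threshold
    for at least `min_len` consecutive positions.

    Hypothetical helper for illustration only; real seqlet callers
    typically use more elaborate statistics than a fixed cutoff.
    """
    seqlets = []
    start = None  # start index of the run currently being extended, if any
    for i, score in enumerate(scores):
        if abs(score) >= threshold:
            if start is None:
                start = i
        else:
            # The run (if any) ended at position i; keep it if long enough.
            if start is not None and i - start >= min_len:
                seqlets.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_len:
        seqlets.append((start, len(scores)))
    return seqlets


# Toy per-nucleotide contribution scores (made up, not real data).
toy_scores = [0.0, 0.1, 0.9, 0.8, 0.7, 0.0, 0.6, 0.1, 0.9, 0.9, 0.9, 0.9]
toy_seqlets = call_seqlets(toy_scores, threshold=0.5, min_len=3)
print(toy_seqlets)
```

In a pipeline like the one described, each returned span would then be embedded and clustered; those later steps are not sketched here.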

We are thrilled to share our new pre-print: “System-wide extraction of cis-regulatory rules from sequence-to-function models in human neural development”. S2F-deeplearning models can accurately encode enhancers, yet decoding these models into human-interpretable rules remains a major challenge.

3 months ago 44 21 1 1


This is the happy face of four researchers embarking on a cool scientific collaboration backed by 7 years of structural financing!

Congrats @steinaerts.bsky.social, @joanampereira.bsky.social, @ppjgoncalves.bsky.social, and Maarten De Vos on your Methusalem grant.

https://tinyurl.com/nvcardzy

3 months ago 16 3 0 0
Senior Bioinformatician - Biodiversity Cell Atlas Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity’s greatest challenge...

Open Senior Bioinformatician position at @sangerinstitute.bsky.social Tree of Life, to work on the Biodiversity Cell Atlas initiative with @marakat.bsky.social and me.

📅 Apply by January 18
🔗 sanger.wd103.myworkdayjobs.com/en-US/Wellco...

Please share with anyone who might be interested!

3 months ago 20 30 0 1
Group Leader - Generative Biology and AI Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity’s greatest challenge...

Looking to start your lab in generative biology / AI?
Come join us at the @sangerinstitute.bsky.social
Sanger is core-funded so you can generate data at scale to train the next generation of models and understanding. Design/Engineering/Chemistry/Proteins/Pathways!
pls RT
tinyurl.com/GenGenFaculty

3 months ago 33 33 0 0