Posts by Niklas Kempynck

Really amazing work and a big effort by Olga, Alex, Julie, Koen and many others. Check it out for sure!

6 days ago 2 0 0 0

New preprint @cxqiu.bsky.social @jshendure.bsky.social ! Can we learn regulatory grammars of human cell types — by training on mouse development and transferring across 241 mammalian genomes? Introducing STEAM & a whole-organism scATAC-seq atlas from E10 to birth.
www.biorxiv.org/content/10.6...

1 week ago 49 26 1 2

Fun fact: CREsted is named after the great crested newt, which has a crested back resembling scATAC peaks. This was inspired by the (alpine) newts I occasionally encounter in my parents' garden 🤗

2 weeks ago 4 0 0 0

... and to @steinaerts.bsky.social for his guidance throughout the project.

2 weeks ago 0 0 1 0

This work was done together with @seppedewinter.bsky.social, and we’d like to thank @casblaauw.bsky.social, @lukasmahieu.bsky.social, Vasilis, @erencaneksi.bsky.social, @samdieltiens.bsky.social, @darinaabaffy.bsky.social and all the amazing co-authors for their help...

2 weeks ago 1 0 1 0

Compared to the preprint, we added robustness analyses and more benchmarking, both of options within CREsted and of CREsted features (like motif identification) against traditional methods. We also aimed to position it well in the landscape of sequence-based modeling methods.

2 weeks ago 0 0 1 0

CREsted is finally out! You can find the article, together with a summarizing Research Briefing, in the thread. 🦎

2 weeks ago 27 14 1 1

Big thanks to Nelson & Trygve for guiding me, and to all the other people in the group for the nice collab. Thanks to @steinaerts.bsky.social for supporting me on this endeavor and to @fwovlaanderen.bsky.social for funding it. Also, Pacific Northwest nature is quite insane 😁

2 months ago 3 0 0 0

These studies have many more interesting analyses, so I would highly recommend checking out these big efforts from all the people involved! It was great to work together with all the people in Trygve’s group on our shared interest in understanding gene regulation in the brain.

2 months ago 1 0 1 0

Finally, in a study led by Yuanyuan & Nelson we dove deep into astrocyte subgroups in the BG, and pushed CREsted models to their resolution limit to learn how these subgroups differ in enhancer logic. A very fun adventure with great data and many modalities, and a nice set of enhancer tools.

2 months ago 1 0 1 0

Next, another big atlas release led by Nelson and Yuanyuan on the primate basal ganglia (BG), where again we described enhancer codes of the strongly conserved groups across species and checked how well the models could predict enhancer tool function.

2 months ago 1 0 1 0

First, a study led by @mtvector.bsky.social and Nelson generated a cross-species multiome atlas of the spinal cord, where we described enhancer codes of identified groups with strong conservation across species. We used our models to study enhancer tools for targeting specific cell types.

2 months ago 2 0 1 0

Last summer I spent 4 months working at the @alleninstitute.org as a Visiting Scientist. Recently we released some preprints about the work we collaborated on, in which we used new multiome atlases of CNS regions to decipher the underlying enhancer logic with CREsted (among many other things). (1/n)

2 months ago 17 3 1 0
Evaluating single-cell ATAC-seq atlasing technologies using sequence-to-function modeling - Nature Communications
Generating high-quality training data for machine learning is costly. Here, authors include sequence-to-function modeling in benchmarking of custom and commercial droplet-based scATAC platforms, and r...

Paper alert! 💻 How many cells do you need to train reliable deep learning models in regulatory genomics? We asked how data quality, sequencing depth, and dataset size affect training of sequence-to-function models from scATAC-seq. Out now www.nature.com/articles/s41...
(details below)

2 months ago 31 15 2 1

We are thrilled to share our new preprint: “System-wide extraction of cis-regulatory rules from sequence-to-function models in human neural development”. S2F deep learning models can accurately encode enhancers, yet decoding these models into human-interpretable rules remains a major challenge.

3 months ago 44 21 1 1
ANTIPODE Provides a Global View of Cell Type Homology and Transcriptomic Divergence in the Developing Mammalian Brain
Diverse neurons and glia are generated in conserved spatial and temporal sequences during mammalian brain development. Divergence in gene regulatory networks can alter brain composition, scaling, timi...

Relieved to finally post my whole developing brain evolutionary "theory of everything" preprint!

www.biorxiv.org/content/10.1...

6 months ago 5 2 1 0

We have two open positions, for an ML engineer and an LLM engineer, to launch a machine learning expertise unit in our center @vibai.bsky.social, see vib.ai/en/opportuni...

6 months ago 5 7 0 0

We will have our next community meeting on Tuesday, 2025-09-16 at 18:00 CEST! Niklas Kempynck will be presenting on CREsted, a package for training enhancer models on scATAC-seq data.
(Zoom registration link and more information in thread!)
🧵

7 months ago 7 2 1 1
Tomtom-lite: Accelerating Tomtom enables large-scale and real-time motif similarity scoring
Summary: Pairwise sequence similarity is a core operation in genomic analysis, yet most attention has been given to sequences made up of discrete characters. With the growing prevalence of machine lear...

I wrote a quick application note on Tomtom-lite, a Python implementation of the Tomtom algorithm for comparing PWMs against each other. This implementation can be 10-1000x faster and, as a Python function, can be integrated into your workflows more easily.

www.biorxiv.org/content/10.1...

10 months ago 58 18 2 2

One thousand candidate enhancers tested in vivo in the mouse brain! A massive resource and oh so useful as a validation set for genome-wide enhancer prediction methods. Super fun to be involved in one of the papers: ‘the prediction challenge paper’ by Nelson & Niklas et al. www.cell.com/cell-genomic...

11 months ago 43 13 0 0

Make sure to also check out the other studies that are part of the larger effort on identifying and validating enhancer tools.

11 months ago 0 0 0 0

This study was done together with Nelson Johansen and supervised by Trygve Bakken at the @alleninstitute.org. Thanks to all co-authors for the great inter-lab collaboration! Also a personal shoutout to the members of the @steinaerts.bsky.social lab for a nice team effort and to Stein for guidance.

11 months ago 2 0 1 0
Evaluating methods for the prediction of cell-type-specific enhancers in the mammalian cortex
Johansen et al. report the results of a community challenge to predict functional enhancers targeting specific brain cell types. By comparing multi-omics machine learning approaches using in vivo data...

Check out our work on evaluating methods for predicting in vivo cell enhancer activity in the mouse cortex! Combining scATAC peak specificity with sequence-based CREsted predictions gave the best predictive performance. With this work we aim to advance genetic tool design for cell targeting in the brain.

11 months ago 20 10 1 0
Intelligence Evolved at Least Twice in Vertebrate Animals | Quanta Magazine
Complex neural circuits likely arose independently in birds and mammals, suggesting that vertebrates evolved intelligence multiple times.

Calling someone bird-brained is, in fact, a way of calling someone highly intelligent. @yaseminsaplakoglu.bsky.social reports: www.quantamagazine.org/intelligence...

1 year ago 91 26 2 5
CREsted: modeling genomic and synthetic cell type-specific enhancers across tissues and species
Sequence-based deep learning models have become the state of the art for the analysis of the genomic regulatory code. Particularly for transcriptional enhancers, deep learning models excel at decipher...

Very proud of two new preprints from the lab:
1) CREsted: to train sequence-to-function deep learning models on scATAC-seq atlases, and use them to decipher enhancer logic and design synthetic enhancers. This has been a wonderful lab-wide collaborative effort. www.biorxiv.org/content/10.1...

1 year ago 109 39 5 1

Also check out Hannah’s thread on our latest preprint on HyDrop v2, an open-source platform for scATAC-sequencing, and a great, cost-efficient way of generating data for S2F models. 🙌

1 year ago 7 1 0 0

CREsted is available at github.com/aertslab/CRE.... Analysis notebooks can be found at github.com/aertslab/CRE.... All models developed for this preprint and in previous work are available in CREsted through crested.get_model(). We look forward to your feedback!

1 year ago 2 1 0 0

This was a big collaborative effort, together with @seppedewinter.bsky.social, and with great contributions from @casblaauw.bsky.social, Vasilis and many others. A special shoutout to @lukasmahieu.bsky.social who professionalized the package, and to @steinaerts.bsky.social for supervising.

1 year ago 1 0 1 0

Finally, we train a model on a full-development zebrafish scATAC-seq atlas, and use it to design and in vivo validate cell type- and timepoint-specific enhancers with a high success rate. We also attempt to modulate reporter strength over two cell types.

1 year ago 3 0 1 0

As a new functionality in CREsted, we explore fine-tuning Borzoi on mouse motor cortex scATAC-seq data. We show that fine-tuned models and smaller models trained from scratch have near-identical performance.

1 year ago 1 0 1 0