Pia Rautenstrauch (@prauten) Bsky

🌟 Applications for the 2026 Leena Peltonen School of Human Genetics are open!

Back after a great 2025 edition: ~20 global leaders and ~20 PhD students shaping the future of genomics.

📅 July 26–30, 2026
📍 Wellcome Genome Campus, UK
📝 Apply by March 6 → www.lpshg.com

4 months ago 9 8 0 0

Thrilled to announce that I’ve just opened my research lab at @helmholtz-hiri.bsky.social !
A huge thank you to my mentor Roi Avraham (and lab members!) for the incredible training and support that made this possible.

2 months ago 25 13 2 1

There is the obvious fraud potential, but I think something much more fundamental is going to happen, too. It will change the meaning of what a "research publication" is. The reason why people read papers is to learn something that they couldn't reproduce in seconds by pushing a few buttons.

2 months ago 15 4 1 0

How many high-impact developmental variants are we missing by relying only on adult splicing annotations?
We address this in our preprint “Aberrant splicing prediction during human organ development”: www.biorxiv.org/content/10.1...

2 months ago 8 6 1 0

I'm excited to announce we're recruiting a PhD student in Machine Learning for Immunology within the Einstein Center for Early Disease Interception, together with Simon Haas!

3 months ago 4 3 1 0

🧠 The Lipid #Brain Atlas is out now! If you think #lipids are boring and membranes are all the same, prepare to be surprised. Led by @lucafusarbassini.bsky.social with Giovanni D'Angelo's lab, we mapped membrane lipids in the mouse brain at high resolution.
www.biorxiv.org/cgi/content/...

6 months ago 283 110 7 11

We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.
www.biorxiv.org/content/10.1...
(1/n)

7 months ago 174 91 4 5

I did not know Taylor Swift was moonlighting in soliciting contributions for fake journals!

7 months ago 9 1 0 0

Check out my talented colleagues' study, profiling hundreds of CRISPRa-responsive regulatory elements surrounding PHOX2B, a key player in neuroblastoma, using a targeted scRNA-seq screen in a neuroblastoma cell line.

7 months ago 1 0 0 0

Meeting agenda, Sep 10, 2025
Attendees:
Links:
Agenda (feel free to add your items):
• Blog almost ready for R blogger linkage thanks to @Izabela Mamede, @Mengyuan Shen and @Maria Doyle
• New posts from many including @Juan Henao and myself
• Ideas for other posts?
• There is tidybulk v2 ready to be submitted. Some feedback would be nice there.
• Stefano's new speedy code in tidySE
• https://github.com/tidyomics/genomics-todos/issues/19#issuecomment-3239791713
• https://github.com/tidyomics/tidySummarizedExperiment/issues/106
• Report back from tidyomics workshop at useR! (Justin and Mike)
• Other projects in the works?
• Ideas for engaging new users? New developers?

These are the corresponding times for your meeting:
Location | Local Time
Durham (USA - North Carolina) | Wednesday, September 10, 2025 at 6:00:00 am
Adelaide (Australia - South Australia) | Wednesday, September 10, 2025 at 7:30:00 pm
Paris (France - Paris) | Wednesday, September 10, 2025 at 12:00:00 noon
Corresponding UTC (GMT) | Wednesday, September 10, 2025 at 10:00:00

Our first Fall #tidyomics meeting will be this Wed 10 September, early in US / noon in Europe / late in Australia. Feel free to join if you're interested in what we are doing to make omics data more amenable to tidy data analysis.

Organized with Stefano @stemang.bsky.social

7 months ago 14 4 1 1

The Matilda effect is not fiction.
It is written into the history of science.
It eclipsed women like Marthe Gautier, born a hundred years ago, a forgotten pioneer of trisomy 21 research.
➡️ https://l.franceculture.fr/1LI

7 months ago 489 298 8 13

Cross-biobank generalizability and accuracy of electronic health record-based predictors compared to polygenic scores - Nature Genetics
Comparison of electronic health record-based phenotype risk scores (PheRS) and polygenic scores (PGS) across 13 common diseases and three biobank-based studies indicates that PheRS and PGS may provide...

Are electronic health records (EHR) more predictive of disease onset than polygenic scores? Can we transfer EHR-based prediction models between countries? Our study on these questions, using 3 biobank-based studies with N>845K, is out in @natgenet.nature.com today:

www.nature.com/articles/s41...

7 months ago 30 12 3 2

The participants of Dagstuhl Seminar 24122 standing on steps outside (from https://www.dagstuhl.de/24122)

Multiple types of embeddings (UMAP, t-SNE, Laplacian Eigenmaps, PHATE, PCA, MDS) of Wikipedia text data labelled by text summaries generated by an LLM. Methods like UMAP and t-SNE show cluster structure that reflects shared subject matter in text, while other methods show more continuous structure.

Multiple embedding methods (PCA, Laplacian Eigenmaps, t-SNE, MDS, PHATE, UMAP) of primate brain organoids at different time periods. Different methods highlight different aspects of development, such as clusters of similar cell types or time courses of cell development.

Multiple embedding methods (PCA, Laplacian Eigenmaps, t-SNE, MDS, PHATE, UMAP) of 1000 Genomes Project genotypes. Different methods reflect different aspects of demographic history of populations.

Last year I met a bunch of great researchers who work with high-dimensional data at a Dagstuhl seminar. This week we put out a preprint about the history and philosophy of low-dimensional embedding methods, their applications, their challenges, and their possible future arxiv.org/abs/2508.15929

7 months ago 15 7 1 1

We spent a year writing this review of low-dim embeddings and arguing about things like epistemic roles and best practices :-) 20+ authors are all participants of the Dagstuhl seminar we held last year: www.dagstuhl.de/24122. Led by @alexandr.bsky.social and Cyril de Bodt.

arxiv.org/abs/2508.15929

7 months ago 27 9 1 0

We're committed to supporting as many attendees as possible in joining us at #scverse2025 - feel free to reach out if you have questions!

7 months ago 4 3 0 0

https://authors.elsevier.com/a/1lbX08YyDfuZWX

Antibodies are highly diverse, but most possible sequences are unstable or polyreactive. In this work, just published in Cell Syst., we propose a new source of data for modeling constraints from these properties. Our models show clear improvements in predicting Ab dysfunction. (1/n)
t.co/qCZERPUMPF

8 months ago 16 6 1 0

Thanks, @paubadiam.bsky.social! That makes sense. Excited for the results 🔎.

8 months ago 0 0 0 0

Very well set up benchmark and informative comparisons! I might have missed it, but did you also compare the performance of the same methods using either truly paired vs synthetically paired multimodal data as input in terms of your performance evaluation metrics, in addition to network consistency?

8 months ago 0 0 1 0

By now, I’ve heard from many people who’ve noticed inconsistencies when using silhouette-based metrics for horizontal data integration evaluation. I hope we’ve helped shed light on why these metrics fall short and that our recommendations prove useful to you!

8 months ago 4 0 0 0

Excited to share our latest paper @natmethods.nature.com
We present a high-throughput framework to map cellular interactions at ultra-high scale – broadly applicable from whole-organism immune response mapping to personalized therapy response prediction (1/4).
www.nature.com/articles/s41...

8 months ago 34 14 3 0

Protein language models reveal evolutionary constraints on synonymous codon choice
Evolution has shaped the genetic code, with subtle pressures leading to preferences for some synonymous codons over others. Codons are translated at different speeds by the ribosome, imposing constrai...

This preprint from Helen Sakharova is one of the coolest things to come out of my lab: “Protein language models reveal evolutionary constraints on synonymous codon choice.” Codon choice is a big puzzle in how information is encoded in genomes, and we have a new angle. www.biorxiv.org/content/10.1...

8 months ago 216 84 6 4

Lucky to have inspiring and supportive mentors by my side! @mikelove.bsky.social

8 months ago 1 0 0 0

Evaluating something like batch correction requires looking at the data, and picking metrics that capture what you care about. Great work @prauten.bsky.social and @uweohler.bsky.social

8 months ago 25 8 1 0

Shortcomings of silhouette in single-cell integration benchmarking - Nature Biotechnology
Silhouette score is unsuitable as a metric for single-cell data integration.

Shortcomings of silhouette in single-cell integration benchmarking - @uweohler.bsky.social @prauten.bsky.social @mdc-bimsb.bsky.social @mdc-berlin.bsky.social @humboldtuni.bsky.social go.nature.com/4fcQzZr

8 months ago 32 13 1 2

Truly grateful for the exceptional opportunity to participate in #LPSHG2025 last week, featuring a stellar ✨ lineup of leading researchers who doubled as tutors, alongside inspiring fellow PhD students. Excited to apply my learnings and see where this collaborative spirit takes genomics next!

8 months ago 10 1 0 0

*Easter egg alert* NOT in the published paper. We also benchmarked Evo 2, and while it did better than other gLMs (consistent with the idea that scale can improve gLMs), it still falls short of a basic CNN trained using one-hot sequences and far short of supervised SOTA

9 months ago 26 5 1 0

The duplication crisis: the other replication crisis
How bad publishing incentives hinder long-term thinking in computational biology research

The duplication crisis: the other replication crisis - www.worksinprogress.news/p/the-duplic...

10 months ago 3 2 0 0

The deadline for the VIB.AI group leader positions is approaching - send in your CV and short research plan before 14th June to start your BioML research lab in Leuven or Ghent

10 months ago 10 10 0 0

Excited to share my first contribution here at Illumina! We developed PromoterAI, a deep neural network that accurately identifies non-coding promoter variants that disrupt gene expression.🧵 (1/)

10 months ago 60 21 1 1

We finally concluded the meeting. Thanks to all attendees for their scientific contributions and for traveling (near or far) to the meeting! Thanks to the local organizers for the infrastructure and catering, and thanks to the co-organizers @yaronorenstein.bsky.social @camillemrcht.bsky.social!

11 months ago 26 11 1 0
