Wouter van Amsterdam (@vanamsterdam) Bsky

work with

Diantha Schipaanboord, Floor B.H. van der Zalm, René van Es, Melle Vessies, Rutger R. van de Leur, Klaske R. Siegersma, Pim van der Harst, Hester M. den Ruijter, N. Charlotte Onland-Moret, on behalf of the IMPRESS consortium

7 months ago 0 0 0 0

ECG classification with convolutional neural networks demonstrates resilience to sex-imbalances in data Background: Many ECG-AI models have been developed to predict a wide range of cardiovascular outcomes. The underrepresentation of women in cardiovascular disease studies has raised concerns if these m...

Conclusion: The convolutional neural networks in this study demonstrated resilience to simulated sex-imbalance in training ECG data.

pre-print: doi.org/10.1101/2025...

7 months ago 2 0 1 0

Discrimination remained stable across sexes; only calibration shifted in extreme scenarios when prevalence differed by sex, with similar patterns for women and men.

7 months ago 0 0 1 0

Using ~165k ECGs, we simulated sex-imbalances in representation (women-to-men ratio), outcome prevalence, and misclassification in the training data for LBBB, long QT syndrome, LVH, and physician-labeled “abnormal” ECGs.

7 months ago 0 0 1 0

Pre-print alert:
Many ECG-AI models have been developed to predict a wide range of cardiovascular outcomes. But, underrepresentation of women in cardiovascular studies raises the question: Are ECG-AI models equally predictive for women and men with sex-imbalanced training data?

7 months ago 2 0 1 0

The Risks of Risk Assessment: Causal Blind Spots When Using Prediction Models for Treatment Decisions | Annals of Internal Medicine Clinicians increasingly rely on prediction models to guide treatment choices. Most prediction models, however, are developed using observational data that include some patients who have already receiv...

New paper in @annalsofim.bsky.social

"50 ways to misinterpret clinical prediction models for treatment decisions"

--> Published version: www.acpjournals.org/doi/10.7326/...

--> Open access version: arxiv.org/pdf/2402.17366

8 months ago 17 10 2 0

Hans van Houwelingen award ceremony and symposium June 19th 2025 - VVSOR This spring, the BMS-ANed organises an in-person meeting:

BMS-ANed Spring Meeting on Thursday, June 19
Time: 13:00–18:00 (CEST)
Location: Vredenburg 19, 3511 BB, Utrecht
Details and registration: vvsor.nl/biometrics/e...

10 months ago 1 1 0 0

Still some spots available in our summer school on all things causal inference, 7-11 July in Utrecht! Discounts for those working in universities and non-profits, and affordable accommodation offered by @utrechtuniversity.bsky.social summer school!

11 months ago 7 6 1 0

Even if you model a physical system, e.g. avg yearly temperature depending on height, and assume that temp given height is the same everywhere: if you invert it into predicting the presence of a mountain given temp, you’ll find varying discrimination in different countries. Example from Schölkopf’s talks

11 months ago 0 0 0 0

You’ve modeled a system with no meaningful variation across environments. The model may be reliable in the tested environments but you haven’t shown robustness against variation in distributions as you haven’t observed any

11 months ago 0 0 2 0

A causal viewpoint on prediction model performance under changes in case-mix: discrimination and calibration respond differently for prognosis and diagnosis predictions Prediction models need reliable predictive performance as they inform clinical decisions, aiding in diagnosis, prognosis, and treatment planning. The predictive performance of these models is typicall...

A question that remains is how these differences between environments come about, and what to do about them in practice. On this, I wrote a paper, available here: arxiv.org/abs/2409.01444

fin!

11 months ago 2 1 0 0

if the distribution of outcome given features remains the same (Y|X), calibration is preserved. If both are the same, the environments were not meaningfully different to begin with!

a more lengthy explanation is in this blog post: wvanamsterdam.com/posts/250425...
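
A minimal simulation sketch of the two claims in this thread: when the distribution of features given outcome (X|Y) is shared across environments, discrimination should stay put while calibration can drift; when outcome given features (Y|X) is shared, calibration should stay put while discrimination can drift. Everything here is an illustrative assumption and not taken from the paper or the blog post: the Gaussian and logistic data-generating models, the parameter values, and the helper names (`auc`, `calibration_gap`, `evaluate`, `draw_fixed_x_given_y`, `draw_fixed_y_given_x`). Calibration is summarised only by calibration-in-the-large (mean predicted risk minus observed event rate).

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def auc(scores, labels):
    # Mann-Whitney estimate of the AUC (scores are continuous, so ties are ignored).
    order = sorted(range(len(scores)), key=scores.__getitem__)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, i in enumerate(order, start=1) if labels[i] == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def calibration_gap(preds, labels):
    # Calibration-in-the-large: mean predicted risk minus observed event rate.
    return sum(preds) / len(preds) - sum(labels) / len(labels)

def evaluate(model, xs, ys):
    preds = [model(x) for x in xs]
    return auc(preds, ys), calibration_gap(preds, ys)

DELTA = 1.5  # assumed mean shift of X between outcome classes

def draw_fixed_x_given_y(prevalence, n, rng):
    # X|Y is the same in every environment; only P(Y) differs.
    ys = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    xs = [rng.gauss(DELTA * y, 1.0) for y in ys]
    return xs, ys

def model_for_fixed_x_given_y(x):
    # Exact P(Y=1 | x) in an environment with 50% prevalence.
    return sigmoid(DELTA * x - DELTA ** 2 / 2)

def draw_fixed_y_given_x(mean, sd, n, rng):
    # Y|X is the same in every environment; only P(X) differs.
    xs = [rng.gauss(mean, sd) for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(x) else 0 for x in xs]
    return xs, ys

def model_for_fixed_y_given_x(x):
    # Exact P(Y=1 | x) in every environment of this setting.
    return sigmoid(x)

rng = random.Random(0)
N = 50_000

# Setting 1: X|Y shared, prevalence 50% vs 20%.
res_xy = {
    "prevalence 0.5": evaluate(model_for_fixed_x_given_y, *draw_fixed_x_given_y(0.5, N, rng)),
    "prevalence 0.2": evaluate(model_for_fixed_x_given_y, *draw_fixed_x_given_y(0.2, N, rng)),
}

# Setting 2: Y|X shared, narrow vs wide (and shifted) feature distribution.
res_yx = {
    "X ~ N(0, 0.5)": evaluate(model_for_fixed_y_given_x, *draw_fixed_y_given_x(0.0, 0.5, N, rng)),
    "X ~ N(-1, 2)": evaluate(model_for_fixed_y_given_x, *draw_fixed_y_given_x(-1.0, 2.0, N, rng)),
}

for title, res in [("X|Y shared", res_xy), ("Y|X shared", res_yx)]:
    print(title)
    for env, (a, gap) in res.items():
        print(f"  {env}: AUC={a:.3f}, calibration gap={gap:+.3f}")
```

Under these assumptions, the first setting should show near-identical AUCs with a clear calibration gap at the shifted prevalence, and the second should show calibration gaps near zero with clearly different AUCs.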

11 months ago 1 1 2 0

as promised (so all of you can breathe normally again), here's my TLDR answer:

Environments must differ with respect to something. If the distribution of features given outcome remains the same (X|Y), discrimination is preserved;

11 months ago 0 0 1 0

tagging some prediction modelers / statisticians, @maartenvsmeden.bsky.social @benvancalster.bsky.social @gelovennan.bsky.social @f2harrell.bsky.social @lucystats.bsky.social @miguelhernan.org @gscollins.bsky.social

(I will answer tomorrow)

11 months ago 0 0 0 0

Which is stronger evidence for robustness?

When evaluating predictive performance of one model in several different environments (e.g. regions / hospitals):

A. stable discrimination (AUC) and calibration in all environments
B. stable discrimination, varying calibration

vote with 👍=A; ❤️=B

11 months ago 1 1 3 0

ask ChatGPT o3 this before submitting your next paper; I got ~10 usable comments out of it:

you're a reviewer for <journal>; review the attached paper when you're either:

11 months ago 2 0 0 0

what are the exceptions?

1 year ago 0 0 1 0

Individual treatment effect estimation in the presence of unobserved confounding using proxies: a cohort study in stage III non-small cell lung cancer - Scientific Reports Scientific Reports - Individual treatment effect estimation in the presence of unobserved confounding using proxies: a cohort study in stage III non-small cell lung cancer

2. an external reproduction of the PROTECT method from Manchester University with Charlie Cuniffe, Matt Sperrin and Gareth Price (www.nature.com/articles/s41...)

3. a 'causal' meta-analysis method using only aggregate data, exciting work with Qingyang Shi from Groningen University

1 year ago 1 0 0 0

A causal viewpoint on prediction model performance under changes in case-mix: discrimination and calibration respond differently for prognosis and diagnosis predictions Prediction models inform important clinical decisions, aiding in diagnosis, prognosis, and treatment planning. The predictive performance of these models is typically assessed through discrimination a...

Very excited for my first (belated) visit to #EuroCIM2025!

I'm here with 3 bits of work:

1. a poster on a causal understanding of prediction model performance under shifts in 'case-mix' (or covariate / outcome drift); I show how discrimination and calibration respond differently
bit.ly/ccm-arxiv

1 year ago 10 2 2 0

An Overview of Large Language Models for Statisticians Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and decis...

this seems pretty cool: an overview of LLMs for statisticians

arxiv.org/abs/2502.17814

1 year ago 4 0 0 0

Postdoc Biomedical Data Scientist / Biostatistician | LUMC In this postdoc position at LUMC, you will work on groundbreaking research that enhances the transparency and trustworthiness of decision support algorithms in healthcare. This position allows you to ...

Vacancy for a postdoc position.

Improve the transparency of decision support algorithms by figuring out how we can quantify and communicate uncertainty in individual causal predictions.

With Marleen Kunneman, Daniala Weir and me.
Three more days to apply 👇

www.lumc.nl/en/about-lum...

1 year ago 6 5 0 0

Building in the physics is one way to potentially get the right causal mechanisms

Insofar as the model is trained on real-world patient data, you'll still have to ensure that no biases, e.g. related to confounding, creep in

1 year ago 2 0 1 0

Digital twins are useful insofar as they reflect causal mechanisms

Don't think a generative model ('digital twin') can inform treatment decisions just because it produces different outputs when you give it different inputs. Doesn't matter if it's 'AI' or not.

1 year ago 6 0 1 0

saliency maps are the new table 2 fallacy

1 year ago 0 0 0 0

Not sure about overfitting, results seemed robust to 5-site cross validation.

It just learns correlations, what's wrong with that? The words 'confounders' and 'bias' make it sound like they expected the model to yield some causal understanding. Maybe these heatmaps are the new table 2 fallacy

1 year ago 2 0 1 0

Awesome, congrats!

1 year ago 1 0 1 0

Liking this interaction with @mmbronstein.bsky.social and Denis Danilov so much I'm reposting it here

1 year ago 37 4 3 0

Introduction to Causal Inference and Causal Data Science | Utrecht Summer School The course takes an interdisciplinary approach and is suitable for applied researchers across health, social and behavioural sciences.

Interested in how to use non-experimental data to answer causal research questions? Mystified by DAGs and counterfactuals? Want to learn what Target Trial Emulation is all about?

Sign up now for the 2nd edition of our summer school, 7-11 July in Utrecht, with @vanamsterdam.bsky.social & BPdeVries

1 year ago 54 14 1 3

Probably more like "the average of an infinite sequence of throws hits the bulls eye"

1 year ago 1 0 1 0

@oisinryan.bsky.social and I are developing a Julia package for target trial emulation with a student, happy to be added to the list

1 year ago 3 0 0 0

Posts by Wouter van Amsterdam