Posts by Felix Thoemmes

A postcard flyer. On the top right, there is a whimsical beach scene featuring fuzzy, colourful creatures wearing hex sticker badges with the letter 'R' in the center. The creatures have their luggage and are enjoying themselves with a cocktail and inspecting a map.

The second page of the postcard flyer. On the top right, there is a series of wavy blue lines resembling a postage processing stamp. At the top, there is an "our panel" header. In the center, there are four circular photographs of four individuals, with their names below the photographs.

📮You've got mail! 💌

We are thrilled to invite you to "Where do R packages live?", a discussion with folks from the #R foundation, #CRAN, #Bioconductor, #R-universe, and @posit.co. Let's learn from our panel about the ins and outs of where you might want to create a home for your R software 📦

1/n

12 hours ago 13 5 2 0

1/
"Silicon samples" are becoming more and more common in research and polling.

One problem: depending on the analytic decisions made, you can basically get these samples to show any effect you want.

The updated version of this preprint is now online!

THREAD🧵

arxiv.org/abs/2509.13397

1 day ago 84 42 5 4
Welcome to Manuscript.
An imagined product. Inspired by Claude Design, mocked up for academic research.
If Manuscript existed, I could open it and find every part of a study in one place. The instruments I built — surveys, protocols, consent forms. The papers I read in the run-up. The data those instruments collected. The figures I drew from it. And at the center of it all, the paper itself.

What follows is a walk through what that might look like. None of it works.

The release of #ClaudeDesign left me wondering what a tool like this could look like for academics (rather than designers)

I then used Claude Design to mock it up for me

Let me show you what came of it... 🧪🤖🧵
rafaelmbatista.com/manuscript/i...

2 days ago 13 3 1 0
PhD Student in Meta-Science and Clinical Psychology - Universität Bern Universität Bern is looking for PhD Student in Meta-Science and Clinical Psychology

I’m hiring a PhD student!

The candidate will work alongside @zefreeman.bsky.social, who is joining our research group as postdoc.

jobs.unibe.ch/job-vacancie...

3 days ago 67 55 3 8
A Unified Dashboard and Orchestrator for Quality Checks

Unit tests · Data validation · Linting · Spelling

Run every quality check on your project using a single command: unit tests, data validation, linters, spell checkers. Scrutin watches for edits, figures out which checks are affected, and re-runs them in parallel. Drill into a failure to see the expected and actual values, as well as the relevant source code. Use quick keystrokes to fix linting and spelling issues, or to open files in your editor of choice.

🚨 #RStats and #PyData devs!

I'm looking for β testers for this thing I just built: A unified dashboard + orchestrator for code and data quality checks.

It has lots of neat features and I'm super eager for feedback and bug reports.

Check out the video demo:

vincentarelbundock.github.io/scrutin/

4 days ago 23 7 0 0

#Preprint fact-checking
🥇 Scoop protection: Allows you to establish priority
📰 Journal compatible: Many journals operate policies compatible with preprints
💪 Preprints are good quality
🛣️ Smoother path to publication: Many journals allow preprint transfers from servers

4 days ago 3 3 1 0

Had a wonderful time tuning in to the speakers 👏 Super awesome to see such a diverse line up! I particularly loved Harriet's doodle of @hadley.nz. The earrings are on point 🤌

#dataviz #ggplot2 #opensource

5 days ago 20 5 0 0

Slides for these talks are at: harriet-mason.github.io/talk-dicook2... , cynthiahqy.github.io/talk_SSA-202... , danyangdai.github.io/SSA-Di-Cook-... . The video will be posted soon.

1 week ago 10 1 0 1
Arguing with economists: the case for preregistration At a recent economics conference there was a long discussion on preregistration where we heard questions and comments from ~20 different people at varying levels of seniority.

Recently, I got quite frustrated listening to a large group of economists discussing preregistration and pre analysis plans. So, I've channelled that into an argument for why they should be preregistering their work whenever performing confirmatory research

kdoroc.substack.com/p/arguing-wi...

6 days ago 6 7 2 1

I'm planning on a GIANT roxygen2 release in the near future (tons of bug fixes, improved R6 support, new S7 support, ...) so if you use it for your packages, I'd really appreciate you trying it out and letting me know if you see any problems! github.com/r-lib/roxyge... #rstats

5 days ago 57 17 1 0
Postdoc In Meta-science Personal type: Scientific staff

We are inviting applications for a two-year postdoctoral position in a collaborative meta-science project on the effectiveness of data and code sharing policies in research-performing organizations. www.tue.nl/en/working-a...

6 days ago 52 66 1 0

If anyone is looking for a lab manager or RA (full or part-time), let me know. One of the most meticulous and detail-oriented people that I know is looking for a position. They did an M.A. in psych with me here at Cornell.

6 days ago 13 8 3 0

tinyrox is on CRAN!

Minimal roxygen2 alternative. Zero non-base R deps, generates your .Rd files and NAMESPACE

`install.packages("tinyrox")`

Part of the tinyverse toolchain:
cornball.ai/posts/tinyve...

6 days ago 9 3 0 2

In France, if you want to build a home above a certain size, you’re legally required to use a licensed architect.

Can you guess what that size is?

6 days ago 578 97 13 17
Title page and abstract. Abstract text: "Making decisions regarding data processing and analysis are crucial steps toward extracting insights from data in clinical trials. Trial registries like clinicaltrials.gov promote transparency about these decisions and encourage making them in advance. However, clinical studies often face decisions with multiple reasonable options outside the bounds of preregistration, such as when studies conduct post hoc analyses, deviate from preregistered plans, or simply were not preregistered. Additionally, even a priori decisions often have multiple reasonable options from which to choose. Methods that maximize transparency and minimize bias in such situations are needed. This paper advocates for applying a “multiverse” approach to analyzing such data from clinical trials. The multiverse approach simultaneously selects and analyzes the various reasonable options for each decision and presents results across all analysis “universes.” We highlight common challenges and decisions when analyzing clinical trial data, review and expand upon the multiverse approach and show how it can address these challenges, and demonstrate the approach using data from a small randomized psychotherapy trial for posttraumatic stress disorder. In the example presented, results were fully consistent across the multiverse for one outcome (posttraumatic stress symptoms), partially consistent for another (relationship satisfaction), and mostly inconsistent for a third outcome (fear of intimacy). The multiverse approach is a flexible and transparent analysis option for clinical trials in the presence of uncertainty regarding data processing and analytic choices."

1/7 Clinical trials should preregister analyses. But even careful plans leave room for defensible alternatives, and problems often arise that create decisions no one anticipated. Instead of picking just one defensible option, what if you ran them all? New paper in @collabrapsychology.bsky.social

1 week ago 5 4 1 1
What's next: Quarto 2 We've started working on [`quarto-dev/q2`](https://github.com/quarto-dev/q2/), a full rewrite of Quarto in Rust.

Quarto 2 is coming, and it’s a total rewrite in Rust. 🦀

The headline feature? Native collaborative editing. Don't choose between Google Docs ease and Git rigor! You get real-time, conflict-free collab directly in your .qmd files. #RStats #Python

Coming soon! ✨ opensource.posit.co/blog/2026-04...

6 days ago 86 19 2 3

making all the p-hackers look like rookies

6 days ago 85 11 2 0
Side-by-side image. Left: the first page of an article in T.E.R.M. (Teaching Educational Research Methods), Volume 1 Issue 1 (Spring 2026), titled ‘The Five Gs for Teaching Statistics: Greek, Graphs, Grammar, Gadgets, and Games’ by Andrew Dean Ho (Harvard Graduate School of Education), with the abstract visible. Right: a photo of a pegboard-style learning "gadget" with colored pins and elastic bands forming a scatterplot, with a best-fit regression line as a dowel and rubber bands representing ordinary least squares residuals.

I wrote about how I teach statistics. As I redesign for the AI era, I won't forget the benefits of multimodal, tangible representations.
The Five Gs: Greek, Graphs, Grammar, Gadgets, and Games.
In the new journal, Teaching Educational Research Methods: doi.org/10.5149/term...

1 week ago 22 4 2 1

Survey research is often interpreted as showing that belief in conspiracy theories can be surprisingly widespread, including belief in conspiracy theories that would be astonishing if true. For example, in The Atlantic we learn that “12 million Americans believe lizard people run our country”

1 week ago 80 36 3 11

Important reminder why survey results examining the % of people who believe in conspiracies should not directly be interpreted as the % of people who believe in conspiracies. Respondents don't all answer seriously, so the news that 12 million people believe in lizard people is not accurate.

1 week ago 18 3 1 0
Postdoctoral Research Fellow (100%, E 13 TV-L)

Post Doc at Uni Tübingen! 100% position for 3 (+3) years; they're looking for somebody to analyze large-scale longitudinal datasets in education research.

Expertise in machine learning is an advantage, commitment to research transparency desirable 😌 proficiency in German beneficial but not required

1 week ago 56 51 2 0

PDF-Direct Firefox extension ('Bypass "fancy online reader" and just get the darn pdf') updated: addons.mozilla.org/en-US/firefo... & now supports

ACM
ACS
Cambridge University Press
Elsevier
IEEE
JSTOR
Nature
OUP
PLOS
PNAS
Royal Society Publishing
Sage
Science
Springer Nature
Taylor & Francis
Wiley

1 week ago 33 11 2 1
When Being Right Less Than Half the Time Is … Fine Let’s say you do a job that involves making predictions about human behavior — you manage money, you sell things, you write opinion columns. Just less than half of your predictions turn out to be more...

"the replication findings reinforce lessons that I have slowly been learning over the years ... 'the issue needing to be solved is overconfidence... We tend to act as if published findings are replicable without actually assessing whether they are.'"

www.bloomberg.com/opinion/arti...

1 week ago 14 5 0 0

First pass at a website that keeps track of the data availability status of various published papers:

paulgp.com/replication-...

Very much in Alpha mode, so data will be updated soon.

1 week ago 92 22 10 1
An interactive OJS playground demonstrating a linear congruential generator (LCG) using the formula X_n = (aX_{n-1} + c) mod m. Controls on the left set modulus (m=8), multiplier (a=5), increment (c=3), seed (X_0=1), and numbers to generate (12). A table on the right shows the resulting sequence of X values, intermediate calculations, mod m results, and normalized values X_n/m, with the final "random" numbers highlighted in yellow.
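
The recurrence in this playground is small enough to check by hand. A minimal Python sketch of it, assuming the table starts at X_1 rather than at the seed (the function name `lcg` is illustrative and not taken from the post):

```python
def lcg(m, a, c, seed, n):
    """Return X_1..X_n for the recurrence X_k = (a * X_{k-1} + c) mod m."""
    values = []
    x = seed
    for _ in range(n):
        x = (a * x + c) % m   # one step of the linear congruential generator
        values.append(x)
    return values

# Settings described in the playground: m=8, a=5, c=3, X_0=1, 12 numbers.
sequence = lcg(m=8, a=5, c=3, seed=1, n=12)
normalized = [x / 8 for x in sequence]   # X_n / m, scaled into [0, 1)

print(sequence)     # [0, 3, 2, 5, 4, 7, 6, 1, 0, 3, 2, 5]
print(normalized)
```

With these settings the generator visits all eight residues before repeating, so the 12 requested values wrap around after the eighth.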

Excerpt from the blog post with R code that tests all seeds from 1 to 10,000 to find which ones produce 10 heads in a row when simulating coin flips. The possible_seeds data frame is filtered to show 10 seeds (614, 1667, 3212, 4166, 4580, 5527, 5824, 7365, 7468, 8975) that meet this criterion. The post notes that seed 614 actually produces 13 heads in a row, confirmed with a withr::with_seed(614, ...) call below.
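
The seed search described here can be sketched in a few lines. This version uses Python's `random` module rather than R, so the seeds it finds will not match the ten listed above. The helper name `ten_heads` and the choice to count a draw below 0.5 as heads are assumptions for illustration, not the post's code.

```python
import random

def ten_heads(seed, flips=10):
    """Return True if the first `flips` coin flips under this seed are all heads."""
    rng = random.Random(seed)   # private generator, leaves the global state alone
    return all(rng.random() < 0.5 for _ in range(flips))

# Try every seed from 1 to 10,000 and keep those that give 10 heads in a row.
possible_seeds = [seed for seed in range(1, 10_001) if ten_heads(seed)]

print(len(possible_seeds), possible_seeds[:10])
```

Each seed has about a 1-in-1,024 chance of passing, so a search over 10,000 seeds should turn up roughly ten candidates.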

R console output demonstrating that set.seed(1234) produces reproducible results. The first block calls runif(5) and returns five values: 0.1137, 0.6223, 0.6093, 0.6234, 0.8609. The second block uses the same seed but splits the draw into runif(2) then runif(3), returning the same five values in the same order, showing that the sequence is preserved regardless of how many numbers are drawn at a time.
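
The same property can be checked outside R. A small sketch using Python's `random` module (an assumption made for illustration; the post uses R's `set.seed()` and `runif()`, and the values drawn here will differ from those shown in the console output):

```python
import random

random.seed(1234)
one_call = [random.random() for _ in range(5)]    # five draws in one go

random.seed(1234)                                  # reset to the same seed
first_two = [random.random() for _ in range(2)]    # draw two...
next_three = [random.random() for _ in range(3)]   # ...then three more

# The generator only tracks its position in the stream, not how draws are grouped.
print(one_call == first_two + next_three)
```

The comparison prints `True`: reseeding rewinds the stream, and splitting the draws across calls does not change which values come out.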

Table of contents for the post:

Introduction
Seeds and reproducible randomness
My (somewhat incorrect) mental model of how seeds work
Making “random” numbers with an equation
    Live interactive playground
    Cycles and fancier algorithms
Why does it matter if “random” numbers aren’t actually random?
    You’re limiting yourself to narrow, known universes
    You can seed hack and get any values you want
    Real world bad things can happen because of pseudorandom numbers
Can computers even create true randomness?
    Moving a mouse around
    Lava lamps
    Atmospheric noise
How I use true randomness in my own work
“…as an ook cometh of a litel spyr…”

I've been using random seeds for years but I have no idea how they work. Seeds somehow(?) make the same random numbers?

So I figured it out! New post includes an interactive PRNG generator, lava lamps, lottery fraud, @random.org, Chaucer, and Minecraft #rstats

www.andrewheiss.com/blog/2026/04...

1 week ago 100 23 6 3

Utrecht University is hiring an Assistant Professor in Statistics and Social Data Science - great opportunity for scholars working at the intersection of advanced statistical methods and computational social science research. #AcademicJobs #ComputationalSocialScience

uu.nl/en/organisation/worki...

1 week ago 16 29 0 0
Introduction to gglite A visualization in gglite is built by composing independent layers:

Guy I’m sponsoring very quietly releases #rstats package I’ve been wanting for the last couple of years.

pkg.yihui.org/gglite/doc/g...

Guess he shut up and took my money. 🤣🤣🤣

1 week ago 99 21 3 5

Celebrating the start of the new teaching semester at the Universität Rostock by providing Open Access to materials of two of my previous courses

1) Introduction to Computational Social Science:
github.com/akbaritabar/...
2) Computational Approaches to Migration Research:
github.com/akbaritabar/...

1 week ago 18 4 2 1

Hi Bluesky!

I'm excited to share my job market paper (for the 2025-26 market)!

It introduces a new extension of RDD where outcomes are entire distributions: Regression Discontinuity Design with Distributions (R3D).

Thread below 👇 (1/)

1 year ago 42 16 2 4

Our new article on socioeconomic disadvantage, mental distress & functional impairment is out #OpenAccess in American Psychologist.

Led by the brilliant @emkbridger.bsky.social, with J. Maltby, @eikofried.bsky.social, co-supervised by me & @ludvigdb.bsky.social.

🧵

psycnet.apa.org/fulltext/202...

1 week ago 32 12 2 1