We're slowly adding features so that you might consider stepping away from PowerPoint.
Remember: this can connect to your Postgres/BigQuery/Databricks datasets *and* you have sliders/widgets.
Posts by Raúl Peralta Lozada
A Unified Dashboard and Orchestrator for Quality Checks
Unit tests · Data validation · Linting · Spelling
Run every quality check on your project using a single command: unit tests, data validation, linters, spell checkers. Scrutin watches for edits, figures out which checks are affected, and re-runs them in parallel. Drill into a failure to see the expected and actual values, as well as the relevant source code. Use quick keystrokes to fix linting and spelling issues, or to open files in your editor of choice.
🚨 #RStats and #PyData devs!
I'm looking for β testers for this thing I just built: A unified dashboard + orchestrator for code and data quality checks.
It has lots of neat features and I'm super eager for feedback and bug reports.
Check out the video demo:
vincentarelbundock.github.io/scrutin/
The great @eddelbuettel.com invited me to his STAT447 class at the University of Illinois.
If you'd like to hear me speak about the interpretation of statistical models in #RStats, using the {marginaleffects} 📦, check out the video!
www.youtube.com/watch?v=v3TX...
sbi v.0.26.1 is out 🎉. We initially planned this release for January, but then the Grenoble Hackathon and GSoC applications happened. Now we have three new methods, better neural nets, cleaner internals, better docs, and 9 new contributors 🤗.
Highlights below 🧵
Demo of narwhals.corr
✨ New Narwhals expression: `nw.corr`
👥 Supported in both group-by and window contexts
🌐 Polars, DuckDB, PyArrow, PySpark, pandas...all supported
🗒️ PS. please don't draw conclusions about ice cream sales and drownings from this post
cover of the book "Bayesian Workflow" by Gelman, Vehtari, et al. Coming out later this year, in the summer probably.
I would have preferred to have the "draw the rest of the owl" meme on the cover, but this will do. Seems like it is on schedule, and we'll leave some typos so you know we didn't write it with AI.
ArviZ now has built-in tools for prior & likelihood sensitivity analysis via power-scaling!
Instead of fitting multiple models with different priors, you fit once and use importance sampling to approximate the effect of perturbing the prior or likelihood.
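To make the "fit once, reweight" idea concrete, here is a minimal sketch in plain NumPy. It does not use the ArviZ API. It assumes a toy conjugate normal model (prior N(0, 1), known unit noise) so the reweighted answer can be checked against a closed form, and it uses plain self-normalized importance sampling where a production tool would typically add smoothing of the weights. The helper name `power_scale_weights` is hypothetical.

```python
import numpy as np


def power_scale_weights(log_prior, alpha):
    """Self-normalized importance weights for raising the prior to the power alpha.

    The posterior is proportional to prior * likelihood, so moving to
    prior**alpha * likelihood multiplies the density by prior**(alpha - 1).
    """
    log_w = (alpha - 1.0) * log_prior
    log_w = log_w - log_w.max()  # stabilize before exponentiating
    w = np.exp(log_w)
    return w / w.sum()


rng = np.random.default_rng(0)

# Toy data: y_i ~ Normal(theta, 1), with prior theta ~ Normal(0, 1).
y = rng.normal(1.0, 1.0, size=20)
n = y.size

# The base posterior is Normal(sum(y) / (1 + n), 1 / (1 + n)); sample it directly.
# This stands in for the single model fit.
post_mean = y.sum() / (1 + n)
post_sd = np.sqrt(1.0 / (1 + n))
draws = rng.normal(post_mean, post_sd, size=200_000)

# Log prior density at each draw, up to an additive constant.
log_prior = -0.5 * draws**2

# Perturb the prior (alpha > 1 strengthens it) without refitting.
alpha = 2.0
w = power_scale_weights(log_prior, alpha)
is_mean = np.sum(w * draws)

# Closed form for comparison: prior**alpha is Normal(0, 1 / alpha).
exact_mean = y.sum() / (alpha + n)

print(f"reweighted posterior mean: {is_mean:.4f}")
print(f"exact perturbed mean:      {exact_mean:.4f}")
```

If the reweighted mean moves a lot as `alpha` changes, the posterior is sensitive to the prior. Applying the same reweighting to the log likelihood instead of the log prior would probe likelihood sensitivity.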
EVoC is a library designed specifically for fast clustering of high dimensional embedding vectors. It can produce high quality clusters extremely efficiently, and requires little to no hyperparameter tuning.
Better clustering than UMAP + HDBSCAN; faster clustering than KMeans.
Run marimo inside JupyterHub using the marimo Jupyter extension:
www.youtube.com/watch?v=fnB...
New skrub release ✨️
I'm really excited about the more general ApplyToCols.
I've found that it lets me write complex data transformations on dataframes very naturally, as I combine it with skrub's selectors to choose which columns I apply transformations to.
skrub-data.org/stable/refer...
Statistical Rethinking 2026 is done: 20 new lectures emphasizing logical and critical statistical workflow, from basics of probability theory to causal inference to reliable computation to sensitivity. It's all free, made just for you. Lecture list and links: github.com/rmcelreath/s...
ArviZ 1.0 is out! We have refactored it to be more modular, flexible & lightweight. For an overview of the changes, check the migration guide. python.arviz.org/en/stable/us...
In October, I gave a talk at ML in PL in Warsaw: a whirlwind tour of what goes into training image and video generation models at scale.
📺 video: www.youtube.com/watch?v=qFIT...
🖼️ slides: docs.google.com/presentation...
Technical difficulties: new stream here youtube.com/live/2lcCqjM...
I have added a new tutorial on discrete diffusion models:
github.com/gpeyre/ot4ml
My talk on "Inference for group interaction experiments" from the Foundations of Causal Inference workshop at the Isaac Newton Institute is available via their Youtube channel: youtu.be/3hh-bM8YNSc?...
It's been hard (it's being hard?) to learn how to make human connections. It's still a learnable skill.
We just released Polars 1.37. Here are the highlights:
Improved Streaming Sinks: 1.14x-1.88x speedup, ~10% of the original memory.
Streaming Compressed CSVs
Faster SQL Ordering
pl.PartitionBy
min_by / max_by (see below)
Series.sql()
Free-Threading Support
Python 3.9 Support Dropped
musl Builds
@anthropic.com is investing $1.5 million in the PSF, focused on security. These funds will make an enormous impact on the PSF and the security of millions of #Python and @pypi.org users. Please join us in thanking Anthropic for this landmark gift!
Read more on our blog:
This is a good thread. My not particularly hot take is that there is no causal inference, there is only predictive inference and CI is mostly a correction away from the bad NP-style “a p-value tells me if moving x by one unit has a significant effect” thinking. But like you don’t need DAGs for that.
A satisfying read! I must concede that I am fairly ignorant of the broader causal literature, and am a priori particularly suggestible to the claims herein.
arxiv.org/abs/2512.23408
'Probabilistic Modelling is Sufficient for Causal Inference'
- Bruno Mlodozeniec, David Krueger, Richard E. Turner
I’m excited that MIT News covered our new paper on confidence intervals for associations in spatial settings!
news.mit.edu/2025/new-met...
A new version of scikit-learn has been released 🥳 check out the highlights: scikit-learn.org/stable/auto_...
Thanks everyone who contributed to this release!
Let me know what you think of the experimental GPU support
course schedule as a table. Available at the link in the post.
I'm teaching Statistical Rethinking again starting Jan 2026. This time with live lectures, divided into Beginner and Experienced sections. Will be a lot more work for me, but I hope much better for students.
I will record lectures & all will be found at this link: github.com/rmcelreath/s...
Only a few more days to register for my charity regression course on Wednesday. All material, including slides and recordings, will be made available for those who cannot attend live. A few sponsored registrations still available. Registration details at betanalpha.github.io/courses/.
What do you consider lacking in JAX compared to PyTorch?
Brmspy: Python-first access to brms (cmdstanr backend, ArviZ output) by Braffolk discourse.mc-stan.org/t/brmspy-pyt...
pandas 3.0 rc demo
😱🙀 The pandas 3.0 release-candidate is here!
💥 Will it break your code?
💡 Test it with `uv pip install -U --pre pandas` to find out!
🌊🦄 Narwhals users can relax, everything's taken care of for you, no need to do anything ☺️
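One common source of breakage is worth a quick illustration: pandas 3.0 makes Copy-on-Write the default, so chained assignment no longer updates the original DataFrame. The sketch below is an assumption about what a typical fix looks like, not something taken from the post. The column names and values are made up, and the `.loc` pattern shown behaves the same on older pandas versions too.

```python
import pandas as pd

# Made-up example data.
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune"], "temp_c": [4.0, 19.0, 31.0]}
)

# A chained assignment such as
#     df[df["temp_c"] > 30]["temp_c"] = 30.0
# writes to a temporary object and leaves df unchanged under Copy-on-Write.
# A single .loc call selects rows and column in one step and updates df itself.
df.loc[df["temp_c"] > 30, "temp_c"] = 30.0

print(df)
```

Running a project's test suite against the pre-release, as the post suggests, is the more reliable way to find out whether anything else is affected.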
Oh boy, you can bet we are cooking the coolest profiler for Python 3.15 👨‍🍳🔥