I have a new blog post out today that I'm really excited about. I walk through how you can use Gradient Boosting to fit entire vectors of parameters for each observation, not just a single prediction. statmills.com/2026-04-06-g... #pydata #rstats
Posts by Alexander Severinsen
#rstats #dataviz
A new kid on the dataviz block is the predictions plot, showing how predictors in lm()/glm() contribute to
the predicted response. A novel and useful idea!
TAS: doi.org/10.1080/0003...
Implemented in the {classmap} package.
My #rstats #shiny app is finally online. Take a look at fjell.dk (domain name to prevent traffic 😅). My private ski trip planning tool with 2D and 3D maps #mapgl with weather history and forecasts, route planning and map downloading for offline use #mbtiles
Yes, quite surprised and something I will likely remember for a while.
Future self - how to corrupt #sqlite
Mount .db, -wal and -shm as separate files. SQLite recreates the WAL on checkpoint - new inode. #docker still points to the old deleted one. Container & host silently drift apart.
Result: corrupted database.
Fix: - /host/db:/container/db
No fix: stress level
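A minimal sketch of why mounting the whole directory helps, assuming only Python's standard `sqlite3` module and a throwaway temp directory (the file name `app.db` and the table are made up for illustration). It shows that in WAL mode SQLite itself creates the `-wal` and `-shm` files next to the database, so they are not stable files you can safely bind-mount one by one. It does not reproduce the Docker inode drift itself.

```python
import os
import sqlite3
import tempfile


def wal_sidecars(db_path):
    """Report whether SQLite's -wal and -shm sidecar files currently exist."""
    return {suffix: os.path.exists(db_path + suffix) for suffix in ("-wal", "-shm")}


# Hypothetical scratch location; in the post this would be the mounted directory.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "app.db")

conn = sqlite3.connect(db_path)
# Switch to write-ahead logging; the pragma returns the journal mode now in effect.
journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.execute("CREATE TABLE trips (name TEXT)")
conn.execute("INSERT INTO trips VALUES ('example')")
conn.commit()

# While a connection is open, SQLite manages the sidecar files itself.
while_open = wal_sidecars(db_path)

conn.close()
# After the last connection closes, SQLite normally checkpoints and removes them,
# so the next connection gets freshly created files.
after_close = wal_sidecars(db_path)

print("journal mode:", journal_mode)
print("sidecars while open:", while_open)
print("sidecars after close:", after_close)
```

Because SQLite creates and removes these files on its own schedule, a mount of the parent directory (as in the fix above) lets host and container always see the same set of files.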
I rounded up a few Claude Skills for #RStats users.
Huge thanks to the creators who developed them. They share Skills for everything from tidyverse code to brand.yml files to learning while using AI.
Hope the list is useful, and please let me know what I missed! 🧡
rworks.dev/posts/claude...
#rstats #mapgl #shiny
Just finished integrating mapbox into my R Shiny app using the mapgl R package, including a steepness layer. That library is a true find! Picture from mountains in Lyngen where I should be more often rather than in front of a computer 😅
#rstats
A single line of code that made my day! Just added FROM rocker/r2u:jammy to my Dockerfile and my image that used to take 90 minutes to build took 1 minute 🤯❤️
Simulated null distribution for data with a sample size of 100, difference in group means of 5, and a p-value of 0.142
Simulated null distribution of a slope of 0.8 and p-value of 0.002
Finally, we have to decide if the p-value meets an evidentiary standard or threshold that would provide us with enough evidence that we aren’t in the null world (or, in more statsy terms, enough evidence to reject the null hypothesis). There are lots of possible thresholds. By convention, most people use a threshold (often shortened to α) of 0.05, or 5%. But that’s not required! You could have a lower standard with an α of 0.1 (10%), or a higher standard with an α of 0.01 (1%).
Statistically significant
The p-value is < 0.001 and our threshold for α is 0.05
In a world where there is no relationship between x and y, the probability of seeing a slope of at least 0.901 is < 0.1%
Since < 0.001 is less than 0.05, we have enough evidence to say that the slope is statistically significant.
Evidentiary standards
When thinking about p-values and thresholds, I like to imagine myself as a judge or a member of a jury. Many legal systems around the world have formal evidentiary thresholds or standards of proof. If prosecutors provide evidence that meets a threshold (i.e. goes beyond a reasonable doubt, or shows evidence on a balance of probabilities), the judge or jury can rule guilty. If there’s not enough evidence to clear the standard or threshold, the judge or jury has to rule not guilty.
With p-values:
If the probability of seeing an effect or difference (or δ) in a null world is less than 5% (or whatever the threshold is), we rule it statistically significant and say that the difference does not fit in that world. We’re pretty confident that it’s not zero.
If the p-value is larger than the threshold, we do not have enough evidence to claim that δ doesn’t come from a world where there’s no difference. We don’t know if it’s not zero.
Importantly, if the difference is not significant, that does not mean that there is no difference. It just means that we can’t detect one if there is. If a prosecutor doesn’t provide sufficient evidence to clear a standard or threshold, it does not mean that the defendant didn’t do whatever they’re charged with; it means that the judge or jury can’t detect guilt.
I just whipped up this little #QuartoPub site last week that demonstrates how I teach p-values/hyp-testing through simulation both with live OJS and with #rstats, and I think it's super neat! It has examples for diff-in-means, diff-in-props, and regression slopes nullworlds.andrewheiss.com #statsky
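The null-world idea for a difference in means can be sketched roughly like this. This is not the code from the linked site (which uses OJS and R); it is a standard-library Python illustration with made-up data, and the function name `permutation_p_value`, the seeds, and the number of simulations are all assumptions.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_sim=5000, seed=1234):
    """Two-sided p-value for a difference in means, using shuffled 'null worlds'."""
    rng = random.Random(seed)
    observed = statistics.fmean(group_a) - statistics.fmean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_sim):
        # Shuffling the pooled values breaks any link between group and outcome,
        # which is what a world with no real difference looks like.
        rng.shuffle(pooled)
        delta = statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        if abs(delta) >= abs(observed):
            extreme += 1
    # Share of null-world differences at least as extreme as the observed one.
    return observed, extreme / n_sim


# Made-up example data: two groups of 50 (sample size of 100 overall).
data_rng = random.Random(42)
group_a = [data_rng.gauss(55, 20) for _ in range(50)]
group_b = [data_rng.gauss(50, 20) for _ in range(50)]

observed, p_value = permutation_p_value(group_a, group_b)
alpha = 0.05  # conventional threshold; other values are equally legitimate
if p_value < alpha:
    verdict = "statistically significant"
else:
    verdict = "not enough evidence to reject the null"
print(f"difference = {observed:.2f}, p = {p_value:.3f}: {verdict}")
```

The verdict step mirrors the jury analogy: a p-value above α means the difference could not be distinguished from the null world, not that no difference exists.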
nanonext 1.8.0 is out - R now has a streaming HTTP/WebSocket server with bundled TLS.
Runs alongside Shiny in the same process. We're already using it at Posit to explore new real-time capabilities.
#Rstats #tidyverse
tidyverse.org/blog/2026/02/nanonext-1-8-0/
My first go at Claude Code. Quite scary and quite useful! Any good #rstats advice for getting along with RStudio?
#statstab #466 {grateful} Facilitate citation of R packages
Thoughts: Great little package to easily cite all the packages you use in a script. (doesn't cite itself unless you ask it)
#rstats #r #packages #acknowledgement #credit #quarto
pakillo.github.io/grateful/ind...
Finished teaching my new Advanced Stats for Psych graduate course today with a heavy emphasis on both DAGs and shifting away from coefficient interpretation and towards models as prediction machines.
Both went great!
The latter was extremely helpful for logistic regression (for obvious reasons 😵‍💫)!
rspatialdata: a repository of spatial datasets & tutorials for spatial analysis & visualization in #rstats, supporting real-world applications such as estimating air pollution, quantifying disease burden, and monitoring progress toward the SDGs🌍💻📊
👉 rspatialdata.github.io
New Year, New Colour Tool
for you data visualizers and maybe the odd designer
obumbratta.com/colour
We just read a ~180 million row dataset (from disk!) and did a group-by aggregation on it. In < 1 second. On a laptop.
Benchmark plot showing minimal data I/O cost of duckdb and polars relative to other options (alongside very fast compute time)
A few months ago, I gave a workshop on “(Pretty) big data wrangling with DuckDB and Polars”.
Slides, notebooks etc. are all available here: grantmcdermott.com/duckdb-polars/
#EconSky
A new resource presents an open per-building dataset of rooftop solar PV potential in the EU, finding that potential capacity could cover around 40% of electricity demand in a 100% renewable 2050 scenario.
Soooo if you use #RStats and Claude Code:
R console: install.packages("btw")
Terminal: claude mcp add -s "user" r-btw -- Rscript -e "btw::btw_mcp_server()"
And now Claude Code can answer questions about ANY R package installed on your system.
Introducing gdalcli by Andrew Brown -- an R frontend to GDAL’s unified CLI (≥3.11) 🌐
Compose and execute GDAL workflows with pipe-friendly functions.
Learn more: github.com/brownag/gdal...
#RStats #GDAL #Geospatial #OpenSource #RSpatial
🚀 Watch the Earth Change: New QGIS Plugin Creates Satellite Timelapse Animations in Seconds 🌍
A QGIS plugin for creating timelapse animations from satellite and aerial imagery using Google Earth Engine. Supports NAIP, Landsat, Sentinel-2, Sentinel-1, MODIS NDVI, and GOES weather satellite imagery.
My favorite #Python package to use is spopt, a library for spatial optimization.
It helps you with:
📊 Facility location planning;
📊 Sales territory design;
📊 Maximizing market share;
And much more! Check it out here:
pysal.org/spopt/
Working with big spatial data sets in #rstats? You should try {duckspatial}. The dev version of #duckspatial (soon on CRAN) uses #duckdb to perform super fast and memory efficient spatial operations cidree.github.io/duckspatial/...
In a benchmark against, {sf}....
New Post on @ropensci.org: Better #RStats Code, Without Any Effort, Without Even AI
Edited by @etiennebacher.bsky.social & Steffi LaZerte
Read about:
✨ {lintr} for detecting lints
✨ Air for formatting code
✨ jarl for detecting+fixing lints
✨ {flir} for refactoring
ropensci.org/blog/2025/12...
Creating polished PDFs from Quarto can be challenging.
At @claritydatastudio.com, we now use Typst for high-quality, branded reports.
@joseph-barbier.bsky.social and I have created a detailed walkthrough to show you how we do it.
Learn more: buff.ly/hRuUEqR
#rstats
library(dplyr)
library(gm)

golden <- tribble(
  ~pitches, ~duration, ~lyric,
  "E4", "q", "I’m",
  "G4", "q", "done",
  "C5", "q", "hid-",
  "B4", "q", "-ing",
  "D4", "q", "now",
  "G4", "q", "I’m",
  "E5", "q", "shin-",
  "D5", "q", "-ing",
  "G4", "q", "like",
  "B4", "q", "I’m",
  "A5", "q", "born",
  "F#5", "q/3*(q/8)", "to",
  "G5", "q/3", "be._",
  "G5", "h", ""
)

music <- Music() +
  Key(1) +
  Tempo(125) +
  Meter(2, 4) +
  Line(pitches = golden$pitches, durations = golden$duration) +
  Tie(13) +
  Lyric(golden$lyric[1], 1) +
  Lyric(golden$lyric[2], 2) +
  Lyric(golden$lyric[3], 3) +
  Lyric(golden$lyric[4], 4) +
  Lyric(golden$lyric[5], 5) +
  Lyric(golden$lyric[6], 6) +
  Lyric(golden$lyric[7], 7) +
  Lyric(golden$lyric[8], 8) +
  Lyric(golden$lyric[9], 9) +
  Lyric(golden$lyric[10], 10) +
  Lyric(golden$lyric[11], 11) +
  Lyric(golden$lyric[12], 12) +
  Lyric(golden$lyric[13], 13)

show(music)
The first part of the bridge from "Golden" from KPop Demon Hunters
Just discovered the {gm} package, which lets you programmatically create sheet music (and audio!) with #rstats (with MuseScore as the backend) flujoo.github.io/gm/index.html
The @maprva.org surveillance map, ported to #Rstats using mapgl, osmdata, sf, and dplyr.
gist.github.com/mhpob/17782b...
Not 1:1 in terms of Ultra/Mapbox GL JS -> R, but pretty close!
Original query: overpass-ultra.us#query=url%3A...
cc @mackaszechno.bsky.social @kylewalker.bsky.social
Are you an RStudio user thinking about trying Positron? We put together some resources to help with the transition.
Learn how to import your keybindings and handle projects while gaining a more flexible environment for both #RStats and Python.
Check out the guide: positron.posit.co/migrate-rstu...
Positron Assistant: GitHub Copilot and Claude-Powered Agentic Coding in R blog.stephenturner.us/p/positron-a...
#rstats
Positron connected to a not-really-remote remote session inside a Docker container
*Another* blog post about @posit.co's Positron! Its Remote Explorer feature lets you connect to other computers via SSH, including locally-running Docker containers, which means you can write and run code in version-locked #rstats environments! www.andrewheiss.com/blog/2025/07...