Flying halfway round the world to spend 15 minutes failing to show 47 slides to four people, one of whom will respond by talking off topic at length while everybody sits there hoping someone else will tell him to stop, is a rational use of time and resources
Posts by Matt Cooper
Sample size calculations - that rush you get when someone comes back 18 months after you did a calculation for them for some "minor tweaks" and you find your documented notes and calculations code and screenshots
#RStats #Statistics #Statssky #DataScience
🚨 New blog post - API access to UN Population Projections (and other interesting data!)
In this example we just extract and plot a couple of interesting variables (Youth and Older-age dependency ratios) as an introduction to using the (well, their) API.
#RStats #Statistics #Statssky #DataScience
Chart showing: WA drivers and gamblers pay more tax than gas companies pay royalties: 2025-26 Rego: $1,520m Vehicle licence duty: $678m Gambling taxes: $403m Oil and gas royalties: $385m
You often hear from gas companies that "yeah but we pay state royalties!!!"
Lol nope.
WA drivers pay more in rego and licence fees than gas companies pay WA in royalties.
Heck WA gamblers pay more tax. And that's incredible given WA does NOT have pokies in pubs and clubs
Chart showing Profit and wages of oil and gas, compared to administration and support Admin and support: Wages $64.3bn; Profits $13.1bn Oil and gas: Wages $4bn; Profits $71.6bn
You'll often hear how the gas industry is vital for Australia because it's so big. But really the only thing big about it is its profits.
Like Wordle? How about some math based alternatives!
Created by a friend of mine, four different games, my favourite is FloodFill. Have a go!
www.mathislit.com/FloodFill/
We're hiring
Open to considering all levels of experience for this Biostatistician position in Perth WA - one of the most [emoji] cities on [emoji]
www.seek.com.au/job/90579895
#RStats #RStudio #StatsEpi #Stats
🚨 New blog post - Simplifying the path to tidy variable names with nice labels
the-kids-biostats.github.io/posts/2026-0...
Little by little we're trying to remove friction from OUR regular tasks/processes. We hope others might benefit from this along the way.
I've wished for this package so many times!
Congrats Matt, nice!
See also github.com/The-Kids-Bio...
@mattansb.msbstats.info you can play with (but not judge 🤣) our function here github.com/The-Kids-Bio... if you like
We've leaned heavily into gt and flextable. We found gt gives us the 'within R' flexibility we needed %>% and then a final processing through flextable gives us the formatting (html and word) polish and flexibility we wanted.
THE best #1.
Get it on repeat
Line chart showing percent correct on the y-axis and three conditions on the x-axis: Baseline, Intuitive, and Mocked. Three lines represent GPT-5.2, Claude Opus 4.5, and Gemini 2.5 Pro. All three models score between 93-98% on baseline, then drop on intuitive and mocked conditions. All three perform the worst on the mocked condition.
More on LLMs and plot interpretation: they do fine in normal conditions, but struggle when the plot conflicts strongly with their priors.
@simonpcouch.com and I investigated why and what might help: posit.co/blog/llm-plo...
No one says to legit business owners, "hey I see there's an underground market selling toxic knock-offs of your product: how will you incorporate the toxic product AND protect your customers against its harms?" But apparently it's fine to make this suggestion to educators with respect to AI.
Wowzer!
Not gonna lie, I've been straight up trying this every 6 months for about 15 years just waiting for this to drop!
🚨 New blog post
the-kids-biostats.github.io/posts/2025-1...
This one looking at the use of the {readaihw} package and the value of the AIHW data it enables you to access, plus a peek behind the scenes of something we do to spur on a little water cooler talk in the office.
Here's to World Statistics Day
Yes that's right, we've already got our eye on wrapping up 2025.
#RStats #Statistics #Statssky #DataScience
Happy birthday to me!
Let's go @chargers.bsky.social
⚡⚡⚡
Listen. Greg. We need evidence-based data-driven decision making, just not when it comes to the economy okay mate.
Both are 'actually' ordered, just the latter has the attribute of being 'ordered'.
Thanks. Yeah I think the problem with the wording is the "preserves the existing status" part! Because the words 'preserves' and 'existing' both heavily imply 'nothing will be done'.
So = NA will give B A C (if that is the frequency order), = T will give B < A < C (ordered).
Someone help my brain without needing more coffees.
ordered - A logical which determines the "ordered" status of the output factor. NA preserves the existing status of the factor.
I don't understand the role of this argument, or (if that argument does what it says) why this solution works?
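A minimal sketch of the behaviour discussed in this thread, assuming the function in question is forcats::fct_infreq() (the thread does not name it, but the quoted argument description matches its documentation). The data are made up so that B is the most frequent level, then A, then C.

```r
library(forcats)

# Hypothetical data: B appears 3 times, A twice, C once
x <- factor(c("A", "B", "B", "C", "A", "B"))

# ordered = NA (the default): levels are reordered by frequency,
# but the factor keeps whatever "ordered" status it already had
f_na <- fct_infreq(x)

# ordered = TRUE: same level order, and the result is an ordered factor
f_t <- fct_infreq(x, ordered = TRUE)

levels(f_na)      # level order after sorting by frequency
is.ordered(f_na)  # unordered input stays unordered
is.ordered(f_t)   # carries the "ordered" attribute
```

In both cases the levels come back sorted by frequency; the argument only controls whether the result is also flagged as an ordered factor.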
The phones we have, that have been in development for decades, are buggy enough.
Can you imagine doing Microsoft 2FA six times for something 20mm from your eyeball?
Nothing a bayesian analysis can't fix.
advertised CV, which may make some employers think that you'll always be (actively?) looking for other opportunities. Especially in scenarios where the work shown doesn't completely align with the job being applied for.
"This person looks good, but it seems they're more interested in X"
As it relates to "Was it useful in securing your first data job?"
For some job applications it could help, for others, it may give off a conflicting message. Like "does this person want the job I'm offering or some other more independent consulting job". It can be seen as a permanently ...
Nice.
Myself, I'm particularly fond of the:
1. M365 login page
2. Enter work email
3. Taking you to your work place sign in page
4. Enter password
5. Type the number XX into your authenticator app
6. Refreshes
7. "You've successfully signed out"
here's the first hint
(am now reflecting on having shared this internationally, intuition on this might be different depending on... hemisphere)