Our paper is now out in Hormones and Behavior.
We found little evidence for group differences in 2D:4D ratios by sexual orientation when jointly modeling publication bias and heterogeneity. w/ @fbartos.bsky.social, Ben Jones & @tvpollet.bsky.social
www.sciencedirect.com/science/arti...
🧵1/9
Posts by František Bartoš
Interested in finding out more about @jaspstats.bsky.social? The Psych Methods Group at the University of Amsterdam is running four hands-on workshops (in person or online) this summer (@ejwagenmakers.bsky.social, @fbartos.bsky.social). More information at jasp-stats.org/workshops/ but to sum up:
Everything is ready for the Perspectives on Scientific Error conference that starts tomorrow in Leiden! I look forward to hanging out with the mix of metascientists, philosophers of science, and statisticians! So many old friends will be there (and hopefully some new ones)! #PSE8
Come to Amsterdam or join online for the full week of JASP workshops (24th-28th of August)! If you can't do the full week or you are only interested in meta-analysis, I will be giving the Meta-Analysis workshop on 25th of August.
jasp-stats.org/2026/02/05/h...
Diagram showing four phases of methodological research (Theory, Exploration, Systematic Comparison, Evidence Synthesis) with an arrow indicating that preregistration usefulness increases from early to late phases. Each phase lists its aim, elements, outcome, and an example from factor retention research.
Does it make sense to preregister simulation studies?
This question has sparked a lot of debate.
▶️We* work through the why, when, and how
▶️We discuss different phases of methodological research to clarify where preregistration might (or might not) add value
📝 Preprint: doi.org/10.31234/osf...
Does this mean that AI/LLMs do not help in education? I personally don't think so. I use AI every day and find it incredibly useful. It would be odd if it didn't help with learning at all. However, the current empirical base does not substantiate strong claims.
Re-analysis at the meta-analysis level further highlights the issue of publication bias: extremely overstated evidence (left) and inflated mean effect size estimates (middle), driven by a large degree of publication bias (right).
We explored several moderators and compared results of studies published before and after 2023 (to assess older AI systems and modern LLMs) but we did not find any meaningful difference.
Publication bias-adjusted estimates decrease the average effect from d = 0.63 to d = 0.20. More importantly, the between-study heterogeneity is so large that the distribution of true effects ranges from -1.52 to 1.91! With variance that extreme, the mean alone is close to meaningless.
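For the curious, the quoted range is roughly what an approximate 95% prediction interval gives. A minimal sketch, with the between-study SD (tau) back-derived from the numbers in the post rather than taken from the paper (so it is an assumption, and the uncertainty in the mean itself is ignored):

```python
# Approximate 95% prediction interval for a random-effects meta-analysis:
# mu +/- z * tau (ignoring the standard error of mu for simplicity).
mu = 0.20                      # publication bias-adjusted mean effect (d)
lo, hi = -1.52, 1.91           # range of true effects quoted in the post
tau = (hi - lo) / (2 * 1.96)   # implied between-study SD, roughly 0.87

def prediction_interval(mu, tau, z=1.96):
    """Interval expected to contain the true effect of a new study."""
    return (mu - z * tau, mu + z * tau)

print(prediction_interval(mu, tau))
```

With tau near 0.9 on Cohen's d scale, the interval swallows everything from strong harm to strong benefit, which is exactly why the mean alone says little here.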
We managed to collect 1,840 effect size estimates from 67 meta-analyses. The distribution of study-level effect sizes shows both a notable skew (funnel plot on the left) and clear selection for positive effects (z-curve plots on the right).
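As a toy illustration of how selection for significant positive results distorts a literature (simulated numbers; nothing here uses the actual dataset):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy simulation: many small studies of a modest true effect, but a study
# is "published" only if its result is positive and significant (z > 1.96).
true_d, n_per_arm, n_studies = 0.20, 30, 5000
se = np.sqrt(2 / n_per_arm)                    # approx. SE of Cohen's d
d_hat = rng.normal(true_d, se, n_studies)      # observed effect sizes
published = d_hat / se > 1.96                  # selection for significance

print(f"true effect:       {true_d:.2f}")
print(f"naive pooled mean: {d_hat[published].mean():.2f}")  # inflated
```

Even with these made-up settings, the naive mean of the "published" studies lands several times above the true effect, the same qualitative pattern the funnel and z-curve plots reveal.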
We recently criticized one meta-analysis on the effect of ChatGPT on learning for failing to adjust for publication bias (bsky.app/profile/fbar...). In a response, the original authors argued that many other meta-analyses find the same effects. So we examined them all.
We just posted a preprint with a comprehensive meta-meta-analysis of the effects of AI/LLMs on learning.
TLDR:
- 1,840 effect sizes
- extreme between-study heterogeneity
- extreme publication bias
- small average effects (three times lower than usually reported)
(osf.io/preprints/ps...)
Edgeworth proposed the alpha = .005 criterion 133 years prior to Benjamin et al. (2018). :-)
www.bayesianspectacles.org/redefine-sta...
The world's earliest recorded outlier? Let me know if you have an even older example!
www.jasp-services.com/jasp-for-qua...
Surprisingly never in the case of publication bias tests :D
"we did not find any evidence for publication bias (p=0.077)"
This is also likely to be the last update of this version of the package. Next year, I will introduce breaking changes to the interface with the 4.0 major release, which will make it much more similar to metafor's.
As such, it provides easy-to-apply, state-of-the-art Bayesian methodology for most meta-analytic settings!
See an overview of the current functionality, with a brief description of all vignettes, at fbartos.github.io/RoBMA/articl...
The Robust Bayesian Meta-Analysis package got updated with additional vignettes explaining how to perform Bayesian model-averaged publication bias-adjusted
- multilevel meta-analysis (cran.r-project.org/web/packages...)
- multilevel meta-regression (cran.r-project.org/web/packages...)
I recently bought 20 bags of potatoes that, according to the Albert Heijn supermarket, should each contain 1 kg. This turns out to be *false*.
www.jasp-services.com/at-the-alber...
Yep, it's ridiculous. Those studies should not be published...
Extracting the study-level data from existing meta-analyses is quite feasible, so there is almost no excuse not to do it.
Also, you cannot really evaluate between-study heterogeneity otherwise; see, e.g., our latest study-level meta-meta-analysis, which shows the limitations of the previous meta-analysis-level one: doi.org/10.31234/osf...
My main worry is that they might have synthesized the meta-analytic estimates rather than the study-level estimates? The manuscript wasn't super clear on that and the OSF had only meta-analysis level data?
If so, that makes the publication bias adjustment ineffective...
The suspense is building: do the measurements of 20 units indicate that the Albert Heijn underfills its 1 kg bags of potatoes? An interim post on the importance of articulating your predictions *before* seeing the results. :-)
www.jasp-services.com/do-the-1kg-a...
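For readers wondering what "articulating your predictions first" buys you statistically: committing in advance to the direction (underfilling) is what licenses a one-sided test. A minimal sketch with hypothetical weights, not the actual measurements from the blog post:

```python
import numpy as np
from scipy import stats

# Hypothetical weights in grams for 20 bags -- NOT the actual measurements.
weights = np.array([992, 1001, 987, 995, 1003, 990, 998, 985, 1002, 994,
                    996, 989, 1000, 991, 997, 988, 999, 993, 986, 1004])

# The prediction "the supermarket *underfills* the 1 kg bags" was stated in
# advance, so a directional (one-sided) test is justified; picking the
# direction after seeing the data would inflate the false-positive rate.
t, p_one_sided = stats.ttest_1samp(weights, popmean=1000, alternative="less")
print(f"t = {t:.2f}, one-sided p = {p_one_sided:.4f}")
```
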
This week's blog post features "raincloud plots", a relatively recent development in data visualization.
Will the raincloud plot gradually replace the box plot? It just might!
Check out the raincloud plot for the planets in our solar system at
www.jasp-services.com/jasp-for-qua...
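For anyone who wants to build one outside JASP: a raincloud combines three views of the same sample, a half-violin (the "cloud"), a box plot, and the jittered raw points (the "rain"). A rough matplotlib sketch on toy data (not the planets):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")                  # render off-screen
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.lognormal(0, 0.5, 100)      # toy skewed sample

fig, ax = plt.subplots()

# cloud: a violin clipped to its left half
parts = ax.violinplot(data, positions=[0], showextrema=False)
for body in parts["bodies"]:
    verts = body.get_paths()[0].vertices
    verts[:, 0] = np.clip(verts[:, 0], -np.inf, 0)   # keep left half only

# box: a slim box plot next to the cloud
ax.boxplot(data, positions=[0.1], widths=0.08)

# rain: the raw observations, jittered horizontally
ax.scatter(0.25 + rng.uniform(-0.04, 0.04, data.size), data, s=8, alpha=0.5)

fig.savefig("raincloud.png")
```

Unlike a bare box plot, the cloud shows the full shape of the distribution and the rain shows every observation, so skew, gaps, and outliers stay visible.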
Also, this should not be a reason to stop exercising.
1) There are other benefits of exercise
2) Some populations/exercises show benefit
3) There might be wider effects on cognition; however, the literature is too heterogeneous and contaminated with publication bias to be certain
I think that the field needs to clean up the published literature a bit. Additional small studies are not going to move the needle at this point; maybe a couple of large-scale, pre-registered studies might provide more insight?
We also re-analyzed all of the original meta-analyses individually. Many of them are consistent with publication bias: both the evidence for the pooled effects and their magnitude decrease once publication bias is adjusted for.
We ran subgroup analyses for each outcome/population/intervention. We found that most results are too heterogeneous to tell (see the wide prediction intervals), but some interventions seem promising and some have substantial evidence against them. See the figures for each outcome.
First, we found notable publication bias, especially in studies on general cognition and executive function. Importantly, there was extreme between-study heterogeneity (tau ~ 0.3-0.6!). This means that the results were consistent with both large benefits and large harms.