Aurélien Allard (@aurelienallard) Bsky

Quote from Terence Tao: "In any case, I would indeed say that this is a situation in which the AI-generated paper inadvertently highlighted a tighter connection between two areas of mathematics (in this case, the anatomy of integers and the theory of Markov processes) than had previously been made explicit in the literature (though there were hints and precursors scattered therein which one can see in retrospect). That would be a meaningful contribution to the anatomy of integers that goes well beyond the solution of this particular Erdos problem."

The interesting thing is that this wasn't "low-hanging fruit", it was something that experts had worked on and gotten partial results in. The solution was achieved by taking a different approach than previously, something that will be useful beyond the particular problem.

4 days ago 6 1 1 0

@ruben.the100.ci actually wrote this blogpost, not Julia Rohrer ;) But both are indeed great!

1 week ago 2 0 1 0

Popper's manifest truth doctrine is still the best on this, imo

1 week ago 2 1 0 0

Topline results of daraxonrasib's ph3 trial for pancreatic cancer are so incredible I can hardly believe them.

Median overall survival of 13.2 vs 6.7 months with chemo, i.e. ~doubling survival.

I'd like to see the data but this may be the biggest breakthrough in pancreatic cancer we've ever seen.

1 week ago 88 16 6 1

Neighbourhood Cross Validation (or NCV) is a model-fitting criterion designed to reduce overfitting in dependent data sets by evaluating how well a model predicts sets of observations ("blocks") when some relevant neighbourhood ("NB") of observations around each block is excluded 2/n

1 week ago 3 1 1 0
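
The NCV idea described in the post above can be illustrated with a minimal sketch. It assumes ordered one-dimensional data (so a "neighbourhood" is just a run of adjacent indices) and uses a least-squares straight line as a stand-in model. The function name `neighbourhood_cv` and the parameters `block_size` and `buffer` are hypothetical, chosen for illustration only; this is not the implementation from the thread or from any particular package.

```python
import numpy as np

def neighbourhood_cv(x, y, block_size=5, buffer=3):
    """Mean squared prediction error over blocks, with each block's
    neighbourhood left out of the fit.

    Assumes observations are ordered (e.g. in time), so that neighbours
    are adjacent indices. Illustrative sketch only.
    """
    n = len(x)
    sq_errors = []
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        block = np.arange(start, stop)
        # Neighbourhood = the block plus `buffer` observations on each side.
        lo, hi = max(start - buffer, 0), min(stop + buffer, n)
        train = np.ones(n, dtype=bool)
        train[lo:hi] = False
        # Stand-in model: a straight line fitted by least squares.
        slope, intercept = np.polyfit(x[train], y[train], deg=1)
        pred = slope * x[block] + intercept
        sq_errors.extend((y[block] - pred) ** 2)
    return float(np.mean(sq_errors))

# Usage: compare candidate models (or smoothing settings) by this score.
rng = np.random.default_rng(0)
x_demo = np.arange(60, dtype=float)
y_demo = 0.5 * x_demo + rng.normal(scale=1.0, size=60)
score = neighbourhood_cv(x_demo, y_demo)
```

With `buffer=0` this reduces to ordinary blocked cross-validation; a larger buffer keeps observations that are correlated with the held-out block from leaking into the fit.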

This was a fun exercise. For the non-philosophers, here's a thread briefly describing the ridiculous views that we philosophers are pretty sure we could find seven experts to agree to.

If you read one of them and think "What?!? I must be misunderstanding" you probably aren't.

1 week ago 85 28 8 12

1/ New paper out: Nicolas Say, Lucie Vrbová, and I tested whether a very simple nudge can improve the forecasting performance of business students. The prompt asked students to consider multiple future scenarios and credible evidence sources before making predictions.

1 week ago 5 3 1 1

Predicting relationship quality with itself? A single general factor captures most of the variance across 34 common relationship measures
In relationship science, researchers have generated a wide array of constructs and corresponding self-report measures to characterize, explain, and predict relationship quality – the foremost studied ...

New paper, out this week in PLOS One, suggests that most close relationship self-report measures are primarily capturing relationship quality 🧵
journals.plos.org/plosone/arti...

2 weeks ago 75 35 3 4

Big Team Science version of no real transfer of effects

2 weeks ago 6 2 0 0

One of my favorite papers got published 🤓 It covers a lot of ground and it’s the best summary of my views on misinformation and what to do about it. Give it a read :)

🔓 osf.io/preprints/ps...
👉 doi.org/10.1177/1461...

2 weeks ago 126 45 1 6

Gah, apparently for the past two weeks all my past blog posts got overwritten by AI-generated summaries of them. Thanks for nothing, Claude.

2 weeks ago 9 1 1 0

Very useful, and should stem the tide of uncritical citations of studies that have failed to replicate, while boosting those that have replicated.

2 weeks ago 3 1 0 0

Can confirm. We failed to replicate that one.

The pancakes, however, were delicious.

2 weeks ago 37 6 1 0

Line chart of the estimated share of newborns who die before reaching the age of five from 1950 to 2023 where child mortality in China spikes to about 1 in 3 children during the Great Leap Forward (1958 to 1962), producing a noticeable uptick in global rates. After the 1960s both China and world rates decline steadily to low single digits by 2023.

China’s Great Leap Forward caused a dramatic spike in child deaths—

Child mortality rates in China have fallen from more than 20% in 1950 to less than 1% today.

But this steady progress was interrupted in the late 1950s during the “Great Leap Forward”.

2 weeks ago 67 19 1 2

Had an agent just start writing a CSV by hand today (unclear what relation the data had to reality) when the function it wrote didn’t work correctly in its environment. Stay safe out there!

2 weeks ago 5 1 1 0

Offering scientists cash to spot errors in published papers doesn’t work
The ERROR project tried enticing reviewers with payments. Now it’s launching a journal, and promising papers as rewards

A project offering researchers up to 3,500 Swiss francs for finding errors in published academic papers is having trouble finding people to do the work.

My latest for @science.org:

www.science.org/content/arti...

@erichehman.bsky.social, @malte.the100.ci, @ruben.the100.ci, @conjugateprior.org

2 weeks ago 40 10 1 1

In 2026 there's little reason for philosophy degrees to require classical logic.

Sure, it shows up in the lit, but so do Bayes and stats in nearly equal numbers.

It's also not clear it improves students' domain general reasoning (both anecdotally and in experimental studies)

2 weeks ago 25 5 19 6

I echo Melissa that working in COS’s metascience team was transformational for me, in all the ways she describes so eloquently, but also personally. The team was so kind, helpful, and fun to work with, it set the bar very high. And starting my nonbinary journey was easy in such a queer team 🤍

2 weeks ago 26 4 1 1

Reproducibility and robustness of economics and political science research - Nature
Robustness checks and reproduction of analyses with existing and updated data based on 110 articles in economics and political science journals with data and code-sharing requirements found high level...

Cool results from Brodeur et al

doi.org/10.1038/s415...

"Our results are in stark contrast with several studies documenting low computational reproducibility rates. This is perhaps unsurprising given that most of the articles in our sample were already computationally reproduced by data editors".

2 weeks ago 5 2 1 0

Beyond the obvious China story, the EU's catch-up over the past few decades is actually pretty remarkable. Especially considering most top journals are still edited in the U.S. (at least as far as I can tell for the social sciences). www.nber.org/digest/20260...

2 weeks ago 19 7 0 0

In case you've seen results going around today about a paper on reproducibility in the social sciences in Nature and it seems like reactions differ, you could be forgiven for being confused because there are actually *FOUR* such studies all published the same day, with somewhat differing results.

2 weeks ago 91 20 5 3

Reliable research in the social and behavioural sciences
Sweeping new investigations probe the replication, robustness and reproducibility of results across the behavioural and social sciences.

A new set of papers, sharing the long-awaited result of several reproducibility and replicability projects, including commentaries, is published today. I look forward to reading the studies, and re-using the data generated! www.nature.com/collections/...

2 weeks ago 29 11 0 0

Half of social-science studies fail replication test in years-long project
Results from massive, ‘eagerly awaited’ initiative reinforce concerns about the credibility of science, but raise hope for solutions.

A massive seven-year project exploring 3,900 social-science papers has ended with a disturbing finding

go.nature.com/4bZ9k0W

2 weeks ago 88 40 0 25

6/ We also find:
Coding errors in ~25% of papers. Major coding errors in about 10% of studies, ranging from duplicates to conducting a simple difference instead of a difference-in-differences.

2 weeks ago 18 2 1 0
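
To make the last error type in the post above concrete, here is a small worked example of a simple difference versus a difference-in-differences. All numbers are hypothetical group means invented for illustration and are not taken from the paper; the function names are likewise illustrative.

```python
def simple_difference(treated_post, control_post):
    """Compare treated and control groups after treatment only."""
    return treated_post - control_post

def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical group means (illustrative numbers, not from the paper).
treated_pre, treated_post = 10.0, 15.0
control_pre, control_post = 8.0, 11.0

naive = simple_difference(treated_post, control_post)
did = difference_in_differences(treated_pre, treated_post, control_pre, control_post)

print(naive)  # 4.0
print(did)    # 2.0
```

The simple difference (4.0) mixes the treatment effect with the gap that already existed between the groups before treatment (10.0 versus 8.0). The difference-in-differences (2.0) removes that baseline gap, under the usual assumption that both groups would otherwise have followed parallel trends.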

🧵1/ Our first meta-science paper (with 350+ coauthors) is published today in Nature. It presents one of the largest-ever reproducibility projects in economics & political science.

Here’s what we found 👇

2 weeks ago 166 89 2 21

SCORE | Center for Open Science
SCORE shows that there is no shortcut to producing credible research findings, and there is no single indicator of trustworthiness. Research progress depends on transparency, rigor, and establishing r...

SCORE, a collaboration of 865 researchers, is now released as three papers in Nature, six preprints, and a lot of data (cos.io/score/). SCORE examined repeatability of findings from the social-behavioral sciences and tested whether human and automated methods could predict replicability.

2 weeks ago 190 106 1 32

Reliable research in the social and behavioural sciences
Sweeping new investigations probe the replication, robustness and reproducibility of results across the behavioural and social sciences.

Today, the SCORE program releases its primary results! 865 researchers examined research credibility across the social and behavioral sciences, publishing three papers in Nature + five preprints.

📑 Explore the papers: www.nature.com/colle...
ℹ️ Read more about SCORE: www.cos.io/score

2 weeks ago 62 43 3 8

📄Published Today in Nature:

500 researchers reproduced 100 studies across the social & behavioral sciences to assess their analytical robustness (led by @balazsaczel.bsky.social & @szaszibarnabas.bsky.social).

Article: www.nature.com/articles/s41...

Preprint: osf.io/preprints/me...

TLDR: 1/11

2 weeks ago 91 48 2 4

Hallucinated citations are polluting the scientific literature. What can be done?
Tens of thousands of publications from 2025 might include invalid references generated by AI, a Nature analysis suggests.

The question about fake references should NOT be “What can be done?”

We KNOW what can be done: human peer reviewers and editors check the damn references individually, by hand.

The question should be, “Why are so many publishers NOT doing those basic checks?”

www.nature.com/articles/d41...

2 weeks ago 82 26 3 4

ALT: a man holding a microphone with the words "i'm actively crying" below him

Reviewer feedback re: listing the implications of our study in (a), (b), (c), format

"The manuscript would benefit from a more cohesive writing style and less of the "AI like" list structure"

I was very much hoping my mid-tier writing would help protect me against AI slander, sad day in Danville

2 weeks ago 27 4 6 0
