
Posts by Paul Clist

OSF

New preprint out today (osf.io/preprints/ps...). We tested whether AI agents are actually infiltrating online surveys.

Spoiler alert: they aren't

Thread 🧵

[1/9]

3 weeks ago 134 63 2 10

🧵1/ Our first meta-science paper (with 350+ coauthors) is published today in Nature. It presents one of the largest-ever reproducibility projects in economics & political science.

Here’s what we found 👇

3 weeks ago 166 89 2 21

This was fun - thanks to all the participants for great comments :)

4 weeks ago 2 1 0 0

Can AI detect the ten errors in Moretti 2021? I did a test of GPT5.2 vs GPT5.4 vs refine.

Takeaway: current reasoning LLMs are useful, with room for improvement.

1/

1 month ago 22 7 1 0

🚨Replication alert🚨
I'm pleased to announce that my replication of Moretti (2021) is now accepted as a comment at AER.

I find ten issues in the paper. My comment focuses on two major problems; in the appendix, I document eight (relatively) minor problems.

1/

1 month ago 148 50 9 11

Super excited about this...

I'm hiring a video editor to help bring economics to a mass audience (yes, really). If you're thoughtful, creative, and want to make complex ideas accessible, I'd love to see your work.

Apply here: docs.google.com/forms/d/e/1F...

Or share with your most amazing mates!

2 months ago 526 155 18 8

It's ironic to see a discipline care **so much** about unbiasedness (causal inference!) at the level of a single test but then have a research production system and culture that is basically a ferocious bias generation machine. This is not good.

2 months ago 157 25 4 10
Abstract
AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development of skills required to effectively supervise AI remains unclear. Novice workers who rely heavily on AI to complete unfamiliar tasks may compromise their own skill acquisition in the process. We conduct randomized experiments to study how developers gained mastery of a new asynchronous programming library with and without the assistance of AI.
We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average. Participants who fully delegated coding tasks showed some productivity improvements, but at the cost of learning the library. We identify six distinct AI interaction patterns, three of which involve cognitive engagement and preserve learning outcomes even when participants receive AI assistance. Our findings suggest that AI-enhanced productivity is not a shortcut to competence and AI assistance should be carefully adopted into workflows to preserve skill formation - particularly in safety-critical domains.


‘Novice workers who rely heavily on AI to complete unfamiliar tasks may compromise their own skill acquisition… We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average.’
arxiv.org/pdf/2601.20245

2 months ago 335 144 6 24

Who leaked this Number 10 discussion to Jeffrey Epstein? And are there consequences for the leaker?

It’s an internal discussion re. getting markets moving in the aftermath of the financial crisis. No doubt of great interest to Epstein and his financial market clients.

2 months ago 424 188 24 42
Common misperceptions: What people get wrong about the world and why it matters What do people get wrong, i.e. misperceive, about the world? Why do misperceptions matter for economic development? How can fixing misperceptions benefit society?

🆕 There's a growing body of evidence on the common misperceptions people have about the world.

And it turns out that, across a bunch of different settings, correcting those misperceptions seems to be a very cheap way of improving society.

Here are some examples: voxdev.org/topic/common...

3 months ago 4 2 0 0

New evidence from Africa shows that aid reduces conflict when projects are well managed, but increases violence when management and monitoring are weak.

Read today's article to learn more:

3 months ago 11 10 0 0
Weiss Fellowship for Junior Researchers in Low- and Middle-Income Countries - Weiss Fund The Weiss Fund Fellowship provides supplementary financial support for exceptional job market PhD candidates accepting positions in Weiss Fund-eligible countries and doing work aligned with the Weiss ...

The Weiss Fund has a great new initiative for development economists on the PhD job market to support those taking up research positions in LMICs, offering supplementary income + research funds. Please share!

3 months ago 32 28 0 0

Check out our new VoxDevLit on International Migration! Thanks to co-editors @catiabatista.bsky.social, @econgaurav.bsky.social, @dmckenzie.bsky.social, @mushfiq-econ.bsky.social, & Caroline Theoharides!
We look forward to working together on this "living literature review" in the years to come...

3 months ago 19 10 0 1

These economists are unsurpassed in research on migration & development. Global authorities.

Their new resource at @voxdev.bsky.social is a gift that will keep on giving —>

3 months ago 13 5 0 0

lolsob as I try for the 100th time to convince a biologist that differences in statistical significance are not significant

3 months ago 50 4 3 0
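
The point in the post above can be illustrated with a small worked example. This is only a sketch with made-up numbers (the estimates, standard errors, and the helper `z_and_p` are hypothetical, not from any study mentioned here), and it assumes two independent, approximately normal estimates. One estimate clears the conventional 5% threshold and the other does not, yet the difference between them is not itself statistically significant.

```python
import math

def z_and_p(estimate, se):
    """Return the z-statistic and two-sided normal p-value for an estimate."""
    z = estimate / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical effect estimates and standard errors (illustrative only).
a_est, a_se = 25.0, 10.0   # "significant" at the 5% level
b_est, b_se = 10.0, 10.0   # "not significant" at the 5% level

z_a, p_a = z_and_p(a_est, a_se)
z_b, p_b = z_and_p(b_est, b_se)

# Test the difference directly. Assuming independence, the variances add.
diff_est = a_est - b_est
diff_se = math.sqrt(a_se**2 + b_se**2)
z_diff, p_diff = z_and_p(diff_est, diff_se)

print(f"A:     z = {z_a:.2f}, p = {p_a:.3f}")
print(f"B:     z = {z_b:.2f}, p = {p_b:.3f}")
print(f"A - B: z = {z_diff:.2f}, p = {p_diff:.3f}")
```

Comparing the two p-values ("A works, B doesn't") answers a different question from testing whether A and B differ; only the last line addresses the latter.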

For folks at the AEA meetings...

Come hear us debate what we do and don't know about the impact of foreign aid.

3 months ago 6 5 1 1

Kicking off 2026 w/ a list of my favorite published dev papers from 2025 #econtwitter #econsky! (Favorite, not best, because best is hard to define - but I loved these papers + learned a lot from them. by "2025", I mean in a journal volume last year)

3 months ago 6 2 1 0

#econsky

4 months ago 0 0 0 0

#econsky

4 months ago 0 0 0 0
The Common Problem of Bad Controls in Tests of the Linguistic Savings Hypothesis. A Comment on Ayres et al. (PNAS, 2023) and related literature – Journal of Comments and Replications in Economics

New in JCRE: The Common Problem of Bad Controls in Tests of the Linguistic Savings Hypothesis. A Comment on Ayres et al. (PNAS, 2023) and related literature by Paul Clist @paulclist.bsky.social jcr-econ.org/the-common-p...

4 months ago 2 1 1 1

Many thanks to my excellent coauthor www.yingyihong.org

As an aside, one of the three papers that spread the idea is Ariely & Gino (12), which I don't think has been retracted, but we discuss some 'interesting' data patterns in our appendix, most notably identical distributions in two treatments

4 months ago 0 0 0 0

At the suggestion of a referee we test mixture models to see if anyone is following JD. We don't find significant evidence they are. Models without JD offer better fit.

So whilst JD is a neat theory, there isn't anything special about counterfactuals. Standard lying models work quite well.

4 months ago 0 0 1 0

we find that whilst it is a neat theory, it doesn't seem to be a good explanation of what's going on. We test this by 1) running a placebo test, where JD's predictions fit behaviour *really well* when they shouldn't, and 2) asking for the second roll and testing a corollary of JD. It doesn't pass.

4 months ago 0 0 1 0
Dishonesty and justifications: Evidence from the second roll of a dice game The widely-adopted die rolling experiment measures average lying behaviour. Its original design uses so-called control rolls; subjects should roll twi…

Dice games are a popular way of measuring lying and cheating. There's a neat theory, called Justified Dishonesty, where people that observe counterfactuals 'swap' rolls, as they can cheat but feel honest.
We explore that idea here:
www.sciencedirect.com/science/arti...

4 months ago 1 0 1 1
Text reads: About synthetic panels
Recruiting the right participants for a study can be difficult. You may not get the exact demographics you need, and the shorter the deadline, the less sure you can be that everyone will answer on time. One possible solution can be to use synthetic panels.

Synthetic panels are powered by a first party proprietary AI model developed here at Qualtrics. Our synthetic panel is trained on thousands of responses from a variety of demographic backgrounds in order to more accurately predict how certain populations would respond to a survey.

Our synthetic panel is based on the United States General Population, and is only available in English. This panel comes with ready-made quotas and target breakouts in order to represent your chosen population and make it easy to launch your survey right away.


Text reads:
Question-writing best practices
To get the most reliable and actionable results from synthetic audiences, consider these question-writing best practices:

Ask forward-looking and attitudinal questions.
Synthetic panels perform best with perceptions, preferences, and intent-based questions. For example, “How likely are you to try…?”
Synthetic panels are less applicable for studies on past behaviors, detailed recall, brand recall, or awareness questions. For example, “When did you last visit…?”


Text reads:
Discussion
The current study aimed to conduct a meta-analysis of the TPB when applied to health behaviours which addressed the limitations of previous reviews by including only prospective tests of behaviour, applying RE meta-analytic procedures, correcting correlations for sampling and measurement error, and hierarchically analysing the effect of behaviour type and sample and methodological moderators. Some 237 tests were identified which examined relations amongst model components. Overall the analysis indicated that the TPB could explain 19.3% of the variance in behaviour and 44.3% of the variance in intention across studies. This level of prediction of behaviour is slightly lower than that of previous meta-analytic reviews which have found between 27% (Armitage & Conner, 2001; Hagger et al., 2002) and 36% (Trafimow et al., 2002) of the variance in behaviour to be explained by intention and PBC.


Did you know that from tomorrow, Qualtrics is offering synthetic panels (AI-generated participants)?

Follow me down a rabbit hole I'm calling "doing science is tough and I'm so busy, can't we just make up participants?"

4 months ago 656 287 38 225

This is really good news for thousands of students - ERASMUS is a fantastic programme that UK students should have kept all along. ERASMUS provides opportunities that without funding many students could never afford. It can only be a net positive.

4 months ago 50 15 0 0

Free the Best Buys!

www.cgdev.org/blog/fcdos-b...

4 months ago 3 2 0 0
Simulating from and checking a model in Stan: It’s so easy in Stan Playground–it just runs on your browser! | Statistical Modeling, Causal Inference, and Social Science

Simulating from and checking a model in Stan: It’s so easy in Stan Playground–it just runs on your browser!
statmodeling.stat.columbia.edu/2025/12/15/s...

4 months ago 8 2 0 0
Nottingham's Centre for Decision Research and Experimental Economics celebrates its 25th anniversary

Experimental economics now has a substantial track record
#econsky #academicsky
marketdesigner.blogspot.com/2025/12/nott...

4 months ago 10 3 1 0

Wired: two article proofs to check, received on the same afternoon
Tired: In the week before Christmas, whilst packing up my office with a cold

4 months ago 0 0 0 0