My agent is working on a replication with more recent open-source and frontier models; I'll post an addendum soon. So far, the same pattern holds: pairwise comparisons are preferable to naive scaling even with GPT 5.4.
Posts by Matt DiGiuseppe
Using pairwise comparisons and then fitting a Bayesian Bradley-Terry model to retrieve a coefficient for each piece of text reduces measurement error, produces more consistent results across LLMs of various sizes, and allows uncertainty to be carried into downstream analyses.
We show that pairwise comparisons are a superior way to scale variables from text (open-ended questions here) with LLMs. Asking LLMs to judge a concept on a pre-determined scale produces unevenly distributed results.
I am happy that this paper with @flynnpolsci.bsky.social is finally in print.
If you need to turn text into a number for rigorous analysis, this is likely the solution you are looking for.
Short story: pairwise comparisons (A vs. B) are better than naive LLM scaling (0-10 placement).
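As a rough illustration of the mechanics: once an LLM has judged many A-vs-B pairs, a Bradley-Terry model turns those wins and losses into a latent score per text. The sketch below uses the classic maximum-likelihood MM algorithm rather than the paper's full Bayesian fit, and the item labels and data are invented for illustration.

```python
from collections import defaultdict

def fit_bradley_terry(comparisons, n_iters=200):
    """Fit Bradley-Terry strengths via the standard MM algorithm.

    comparisons: list of (winner, loser) pairs, e.g. from LLM judgments.
    Returns a dict mapping each item to a strength (normalized to sum to 1).
    Under the model, P(i beats j) = pi_i / (pi_i + pi_j).
    """
    items = set()
    for w, l in comparisons:
        items.update((w, l))
    wins = defaultdict(int)   # total wins per item
    n = defaultdict(int)      # comparison counts per unordered pair
    for w, l in comparisons:
        wins[w] += 1
        n[frozenset((w, l))] += 1

    pi = {i: 1.0 for i in items}  # flat starting values
    for _ in range(n_iters):
        new = {}
        for i in items:
            # MM update: pi_i <- wins_i / sum_j n_ij / (pi_i + pi_j)
            denom = sum(n[frozenset((i, j))] / (pi[i] + pi[j])
                        for j in items if j != i and n[frozenset((i, j))])
            new[i] = wins[i] / denom if denom > 0 else pi[i]
        total = sum(new.values())
        pi = {i: v / total for i, v in new.items()}
    return pi

# Toy example: A usually beats B, B usually beats C, A always beats C.
pairs = ([("A", "B")] * 3 + [("B", "A")] +
         [("B", "C")] * 3 + [("C", "B")] +
         [("A", "C")] * 4)
scores = fit_bradley_terry(pairs)  # ordering recovered: A > B > C
```

A Bayesian version of the same model (as in the paper) would put priors on the strengths and return posteriors, which is what lets downstream analyses carry the measurement uncertainty forward.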
Paper w/ @carogarriga.bsky.social: Academia’s Class Problem. PoliSci is dominated by the upper middle class / people with parents who went to university – unlike society as a whole.
dx.doi.org/10.2139/ssrn...
@polscileiden.bsky.social
The broader point: LLMs can persuade, but once politics enters the picture, their power runs into real limits.
Paper: arxiv.org/abs/2602.18092
But there’s a catch: if people think ChatGPT is politically biased, persuasion drops. Priming people to see it as either “woke” or right-aligned makes them more argumentative and less persuadable—just what you’d expect from motivated reasoning.
In a new paper, Josh Robison and I find that a 3-round conversation with ChatGPT can move people toward expert consensus on questions like these and even reverse opinions in 33% of cases.
Do you think rent control improves housing quality and supply? Or that government debt works just like household debt?
Economists generally think those views are wrong. And they're probably right.
📢 New lecture: 'AI Literacy for Studying, Thesis Research, and Life' by @mdigiuseppe.bsky.social 🔎
📆 Wednesday 11 March 2026, 17:15 - 18:30
📍Room 3B.38, Spui Campus, The Hague
➡️More info and registration here: www.universiteitleiden.nl/en/events/20...
In-class assessment is the only way forward.
Universities can survive if they can assure students that a degree can't be attained with prompting alone.
Formal models will make a comeback in social science
The only people who should be snoozing on this are those with tenure. If you thought the productivity race was bad before…
“Three years ago, we were impressed that a machine could write a poem about otters. Less than 1,000 days later, I am debating statistical methodology with an agent that built its own research environment.”
open.substack.com/pub/oneusefu...
In the past few months I've spoken to a lot of people facing objections to using chatbots, including a surprising number of people who want to buy chatbot access for their large organizations and have been shot down because of worry over the impact of individual prompts. I think it's crazy that this is still happening and want a much shorter post readers can send to people who are still misinformed about this. Here it is!
A short summary of my argument that using ChatGPT isn't bad for the environment - Andy Masley andymasley.substack.com/p/a-short-summ… #AI #environment #energy
Immigration is popular considering the alternatives - doi.org/10.1080/1350...
Your 'moment of doom' for Oct. 11, 2025 ~ Everything is fine.
"The invisible gas can be seen in streams of bubbles originating on the seafloor of Antarctica's Ross Sea ... describing the mechanism as 'seemingly widespread' throughout the region... "
abcnews.go.com/Internationa...
Criticizing Charlie Kirk is a fireable offense that incites domestic terrorism. But calling political opponents "the party of hate, evil, and Satan" is proper and good. Got it.
Very grateful for the invite to this conference. I learned a lot.
reCAPTCHA and Fraud ID scores don't appear to differ substantially.
Over at LinkedIn, someone suggested asking the respondent to write some JavaScript: completing the task should be a good sign it is not a human. Until someone prompts around that.
It doesn't appear to be humans misclicking: there were four options, and only two were clicked (human, AI). Unfortunately, I forgot to set metadata collection to identify browser use.
Yesterday I posted about how I used the Comet browser to take a Qualtrics survey almost undetected. Last night, I ran a pilot (N=400) on @joinprolific.bsky.social. I found that almost 10% of "respondents" identified as AI when directly asked.
Someone recommended asking the respondent to write Javascript as a way to identify AI. I'll try this out in my next pilot.
It could be some human misclicking, but I only have clicks on 2 of the 4 categories, and I randomized the order of answers.
[Unfortunately, I forgot to collect the metadata on browser use]