
Posts by Andrew Gordon

We have a new preprint that underscores some key claims here: even if one *can* design an agent that gets through a survey fine, it doesn't follow that such agents are undetectable or common. We find that they are far from common! Preprint link in thread👇

6 days ago 14 7 1 0
Research Scientist (Remote, US East Coast)

If this sounds like the research career opportunity you've been waiting for, I'd love to hear from you. Apply via the link or DM me 👇 [4/4]

US applicants: job-boards.eu.greenhouse.io/prolific/job...
UK applicants: job-boards.eu.greenhouse.io/prolific/job...

1 week ago 0 0 0 0

We're looking for someone with:
• A PhD in a relevant quant field
• Experience in online research methods — sampling, data quality, synthetic data
• A track record of publications or public outputs
• US east coast preferred, but open to UK also
• Postdoc/industry experience preferred
[3/4]

1 week ago 0 0 1 0

The hire will own a dedicated research portfolio, design an agenda, publish in high-impact journals, collaborate with leading researchers, and present at conferences.

This is an IC role and comes with real autonomy to research the most interesting questions concerning online research! [2/4]

1 week ago 1 0 1 0

Job ad: I'm hiring for a Research scientist to join my team at @joinprolific.bsky.social

If you've ever wondered who's working on the hard questions in online research — data quality, sampling methodology, the effect of AI on how research gets done — this is that job. [1/4]

1 week ago 15 10 1 0

@kwcollins.bsky.social yes, the costs are aggregated together at the platform-type level

@mrandall.bsky.social correct!

2 weeks ago 0 0 0 0

honestly, I have no idea

still see it mentioned at conferences though

2 weeks ago 0 0 0 0

The takeaway: LLM agents are not infiltrating the platforms researchers actually use. Outside of one platform already notorious for pre-LLM bot problems, they just aren't there.

The human data quality problem on the other hand is large, systematic, and fixable through platform choice.

[9/9]

2 weeks ago 11 1 0 1

And the cost finding:

At a 90% quality threshold, Direct panels cost $8.26 per quality respondent. Marketplace: $74.43.

That is not a typo. "Cheap" platforms are often the most expensive data you can buy.
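The arithmetic behind a "cost per quality respondent" figure is simple: divide the per-respondent price by the share of respondents who clear the quality bar. A minimal sketch, with illustrative prices and pass rates (assumptions for illustration, not the preprint's underlying figures):

```python
def cost_per_quality_respondent(price_per_respondent: float, pass_rate: float) -> float:
    """Effective cost of one respondent who clears the quality threshold."""
    if not 0 < pass_rate <= 1:
        raise ValueError("pass_rate must be in (0, 1]")
    return price_per_respondent / pass_rate

# A nominally cheap panel with a low pass rate can cost more per usable
# response than a pricier panel with a high pass rate (illustrative numbers):
cheap = cost_per_quality_respondent(price_per_respondent=1.50, pass_rate=0.10)   # $15.00
direct = cost_per_quality_respondent(price_per_respondent=7.00, pass_rate=0.90)  # ~$7.78
```

This is why the per-response sticker price can be misleading: the denominator that matters is usable responses, not collected ones.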

[8/9]

2 weeks ago 7 1 2 1

The bigger story is human data quality.

Platform type was a far more consequential predictor than anything agent-related. Direct panels outperformed hybrid, which outperformed marketplace, consistently across nearly all measures.

[7/9]

2 weeks ago 5 1 1 1

And the MTurk detections didn't look like LLM agents. Poor writing quality, fast completions, clustered arrivals. Classic bot behavior.

When we ran real LLM agents through the same survey, they outperformed average humans and vastly outperformed the flagged responses.

[6/9]

2 weeks ago 5 0 1 0

So are AI agents infiltrating surveys? Not really.

Meaningful detections were almost exclusively on MTurk (11-16%).

Every other platform: at or below 1% for our primary detection flag

[5/9]

2 weeks ago 8 6 1 0

Then we recruited 5,200 respondents across 10 platforms: direct (Prolific, CloudResearch Connect, Verasight), hybrid (Dynata, Prodege), and marketplace (Cint, Qualtrics, Purespectrum, Prime Panels), plus MTurk.

Same survey for everyone. 7 behavioural quality measures. Full metadata.

[4/9]

2 weeks ago 5 1 1 0

First we validated our detection methods against real agents (Claude, ChatGPT, Gemini, Perplexity, plus a custom white-hat agent) vs real humans.
Primary method: 100% sensitivity, 100% specificity.
Secondary behavioural battery: 92% sensitivity, 99.2% specificity.
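For readers unfamiliar with the metrics: sensitivity is the share of known agents a detector flags, and specificity is the share of known humans it clears. A minimal sketch of that calculation (labels and counts here are illustrative, not the study's data):

```python
def sensitivity_specificity(y_true, y_pred):
    """y_true/y_pred are parallel lists; 1 = agent, 0 = human."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # agents flagged
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # agents missed
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # humans cleared
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # humans flagged
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity
```

Validating against sessions with known ground truth (real agents vs. real humans) is what lets the later field estimates be read as detection rates rather than guesses.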

[3/9]

2 weeks ago 5 1 1 0

There's been a lot of alarm about LLM agents polluting survey samples. Capability demos are impressive. But capability is not the same as deployment within an ecosystem.

We wanted to know what's actually happening in the platforms researchers use.

[2/9]

2 weeks ago 5 0 1 0

New preprint out today (osf.io/preprints/ps...). We tested whether AI agents are actually infiltrating online surveys.

Spoiler alert: they aren't

Thread 🧵

[1/9]

2 weeks ago 134 63 2 10

Starting today, if an AI agent is detected in your Prolific study, we’ll give you twice the cost of that non-human participant back.

You pay for human data. You expect human data. We’re backing that with our 100% Human Guarantee.

Learn more: www.prolific.com/100-human-gu... #AcademicSky #Research

3 weeks ago 10 3 0 0

Fantastic example of researchers working together (and the utility of rebuttals to published work). I think we all agree that this is an area we need to invest time in, but we also need to be very careful that conclusions/interpretations are warranted from the data we collect.

1 month ago 1 0 1 0

@drbarner.bsky.social you might want to read this bsky.app/profile/ache...

1 month ago 1 0 0 0

Your letter very clearly reads as saying that there are bots, and you use that to question the integrity of online sampling... but you agree that your measures are not sufficient to establish such a claim. Do you not think an amendment to your published letter is required?

1 month ago 1 0 1 0

Recently, van der Stigchel and colleagues posted a provocative commentary suggesting that we should be wary of bots in online behavioral data collection (🧵by @cstrauch.bsky.social here: bsky.app/profile/cstr...). But should we? Here is my response letter osf.io/preprints/ps.... 1/5

1 month ago 55 33 6 5
How to Bot-Proof Your Online Research: this series cuts through the noise to give researchers practical, actionable strategies for protecting their work from AI agents and bots.

On Thursday I'll be taking part in a roundtable “How real is the LLM threat to online research in academia?” for @joinprolific.bsky.social alongside @davmicrot.bsky.social, Michael Nicholas Stagnaro, and Raluca Rilla.

Sign up here: lnkd.in/em8-MpjN

1 month ago 3 1 0 0

Interval strongly predicted retention: 1-week → 80% completed all sessions; 4-week → 50%.

Payment had no significant effect.

Participants higher in routine showed better retention; those higher in automaticity showed worse retention.

Read the paper here: osf.io/preprints/ps...

2 months ago 0 0 0 0

What drives retention in online longitudinal research? We conducted an experiment (N=1,798) on @joinprolific.bsky.social, orthogonally manipulating payment rate (£6–£9/hr + bonus) and session interval (1, 2, or 4 weeks) across five sessions. The findings challenge some common assumptions 👇

2 months ago 7 2 1 0

Hey @rory-stewart.bsky.social @alastaircampbell2.bsky.social @therestpolitics.bsky.social I asked your favourite question to 1,936 US Adults "Who do you believe is the biggest threat to global order and security?"

33.8% of Americans rate the US as the biggest threat.... 🤯

3 months ago 0 0 0 0

@michaeljkane.bsky.social pop me over a message and I'll do what I can to help!

3 months ago 0 0 1 0

Second, and most importantly, the economics of it don't make sense, even at $0.05/response. For someone to scale this approach would require multiple user accounts (which we have robust guards against), meaning the break-even point for a bad actor is likely impossible to reach

5 months ago 2 0 0 0

First, the barrier to entry here is really high. Creating a bot like this is not trivial, and it required an academic team to design and implement. Current 'naive' agents, such as the one offered by ChatGPT, are simple to catch: track mouse movements, typing speed, or even simple reverse shibboleths.
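As a rough illustration of the kinds of behavioural checks mentioned above (field names and thresholds here are my assumptions for the sketch, not Prolific's actual detection logic):

```python
def naive_agent_flags(session: dict) -> list:
    """Return a list of heuristic red flags for one survey session."""
    flags = []
    if session.get("mouse_events", 0) == 0:
        flags.append("no mouse movement")           # naive agents often produce no pointer input
    if session.get("chars_per_second", 0) > 15:
        flags.append("superhuman typing speed")
    if session.get("answered_hidden_field"):
        flags.append("reverse shibboleth tripped")  # a field only a bot would see and answer
    return flags
```

A reverse shibboleth here means an item invisible or nonsensical to a human respondent that an automated agent will nonetheless complete; any single flag is weak evidence, but several together are hard for a naive agent to avoid.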

5 months ago 1 0 1 0

Lots of chatter about this paper currently. It's a stark warning, but at present I see it as a warning of what might come, not of what is happening now. As a research community we need to treat it as a call-to-arms to develop new detection strategies, NOT a call to abandon online sampling. Reasoning below

5 months ago 5 3 1 0

Agreed, it's a stark warning, and should be a call-to-arms in terms of the community finding ways to detect such bots. Keen to work with anyone who's interested in figuring that out

5 months ago 3 0 0 0