Hard to say which is the absolute best, but Claude definitely has a more distinctive writing style than the rest. Some of this could be the result of higher API pricing, so open models are less likely to have trained on large amounts of Claude-generated text? Also, partly, Claude is just better lol (imo)
Posts by Jenna Russell
Joint work with Rishanth Rajendhran, @miyyer.bsky.social, and John Wieting. Thanks also to UMD CLIP for all the support!
We're releasing StoryScope code, all 10,272 prompts, 51,336 AI-generated narratives (~5k words each), and per-story features to support future work on narrative analysis & AI authorship.
📄 arxiv.org/abs/2604.03136
💻 github.com/jenna-russe...
Style-based detectors' performance will vary as models evolve (GPT has already cut the em-dash). Narrative structure is harder to humanize: changing it requires significant rewriting rather than post-hoc edits. StoryScope can serve as a more durable basis for authorship analysis.
Narrative features are robust to stylistic editing. When we ran a subset of stories through the LAMP protocol (e.g., rewriting clichés and purple prose), detection only dropped from 95.5% to 93.9% macro F1. Surface-level 'humanization' doesn't fix structural tells.
🟢 Gemini has the tidiest endings and the bleakest settings (88% bleak).
🟠 DeepSeek likes to front-load crucial context (humans and the other models hold it until the end).
🟣 Kimi is the most generic, with few choices that distinguish it from other models.
Each model has its own fingerprint:
🔴 Claude keeps it cool: flat event escalation, traditional literary tropes, and a fondness for epilogues, producing consistent, careful stories.
🔵 GPT is a gossip: it uses rumors as plot devices and frames stories as decades-old retrospectives.
The five AI models cluster together in narrative space, distinctly separated from human writing. Human stories are also rarer: by nearest-neighbor distance in narrative space, 24.7% of human stories fall in the rarest 10% of the corpus, vs. 7.1% of AI stories.
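For the curious, here's a minimal sketch of how that rarity statistic could be computed, assuming each story is already a vector of narrative features; the helper names and k value are illustrative, not the paper's code:

```python
# Illustrative only: rarity = mean distance to k nearest neighbors in
# narrative-feature space; "rarest 10%" = top decile of that score.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def rarity_scores(X: np.ndarray, k: int = 10) -> np.ndarray:
    """Mean distance from each story to its k nearest neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dists, _ = nn.kneighbors(X)          # first column is the point itself
    return dists[:, 1:].mean(axis=1)

def frac_in_rarest_decile(scores: np.ndarray, mask: np.ndarray) -> float:
    """Share of a subgroup (e.g., human-written stories) in the rarest 10%."""
    cutoff = np.quantile(scores, 0.9)    # larger distance = rarer story
    return float((scores[mask] >= cutoff).mean())

# With X = (n_stories, n_features) and is_human a boolean mask, the thread's
# numbers correspond to frac_in_rarest_decile(...) of ~0.247 vs. ~0.071.
```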
Humans ...
- Love nonlinear structures (flashbacks and time jumps)
- Reference real texts and authors at ~2x the AI rate
- Frame protagonists as more morally ambivalent (59% vs. 38%)
- Write with more variety: more characters, more dialogue, and more subplots (42%)
What separates AI from humans? AI ...
- over-explains its themes
- has narrators spell out the moral (77% vs. 52% for humans)
- favors clean, single-track plots (79% have no subplots)
- over-writes the body: emotion arrives as 'tight chests & cold sweats' (81% vs. 38%)
With a simple XGBoost classifier, our narrative features hit 93.2% macro-F1 (0.96 AUPRC) on the human-vs-AI detection task, retaining 97% of the performance of a model that also uses stylistic cues. Just 30 'core' narrative features capture the majority of the signal.
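For intuition, a minimal sketch of that setup (synthetic stand-in data; the real features, splits, and hyperparameters are the paper's, not shown here):

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, average_precision_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))    # 30 "core" narrative features per story
y = rng.integers(0, 2, size=1000)  # 1 = AI-written, 0 = human (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
clf.fit(X_tr, y_tr)

print("macro F1:", f1_score(y_te, clf.predict(X_te), average="macro"))
print("AUPRC:", average_precision_score(y_te, clf.predict_proba(X_te)[:, 1]))
```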
We introduce StoryScope, a pipeline that extracts interpretable narrative features (e.g., plot, character, revelation) across 60k+ stories written by humans and 5 LLMs (Claude, GPT, Gemini, Kimi, DeepSeek) from the same ~10k prompts.
Would you realize if the book you were reading was AI? What if it was humanized to remove AI-speak?
We find that even without stylistic cues (e.g., word choice or sentence structure), narrative choices alone give AI fiction away!
Thanks to my amazing coauthors
@markar.bsky.social, Destiny Akinode, @kthai1618.bsky.social, Bradley Emi, Max Spero, and @miyyer.bsky.social, and to the UMD CLIP lab and Pangram Labs for their support!
We will be continuously monitoring American news to keep up with how AI use changes over time. Follow along at 🔗 ainewsaudit.github.io
We're releasing:
🌐 Browse articles: ainewsaudit.github.io
📊 Datasets (recent_news, opinions, ai_reporters): github.com/jenna-russe...
📄 Paper: arxiv.org/abs/2510.18774
AI has been creeping into the news all of us read, often without any disclosure. We call for clearly defined standards for U.S. newsrooms:
1️⃣ Clearly define what counts as acceptable use of AI and publish these standards openly
2️⃣ Require AI-use attestations for all writers
Many AI-written stories still contain authentic quotes. We hypothesize that people often use AI for editing or expanding on their human-written work. But with no disclosure, there's no way to tell for sure.
We also track how AI adoption has evolved over time:
Among 10 veteran reporters we followed longitudinally, AI use rose from 0% pre-ChatGPT (2022) to >40% in 2025.
AI is disproportionately affecting news written in languages other than English: roughly 8% of English-language news is AI-generated, compared to 33% of non-English news (primarily Spanish). Without disclosure, we cannot be sure whether AI is translating stories or writing them outright.
In NYT, WaPo & WSJ, opinion sections show 6.4× higher AI use than other sections, rising ~25× since 2022 (from ~0% → ~4%).
AI use is concentrated among prominent guest authors: politicians, CEOs, and scientists.
Despite widespread use, transparency is basically nonexistent.
Of the 100 AI-flagged articles we manually annotated, only 5 disclosed that AI was used, and over 90% of outlets have no public AI policy.
AI use isn't evenly distributed:
🗞️ Far higher in small local papers than in national outlets
📍 Especially common in Mid-Atlantic & Southern states
🏢 Largely driven by ownership groups (e.g., Boone Newsmedia & Advance Publications)
🧪 Most concentrated in weather, tech, and health coverage
We detect AI using Pangram, a model with a reported false positive rate of 0.001% on news text. We find that 5.2% of recent news is completely AI-generated, with another 3.9% partially AI-generated. www.pangram.com/
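For anyone replicating this kind of audit, the article-level bookkeeping is simple; a hedged sketch (the `detect_ai` callable is a stand-in for a per-segment detector, not Pangram's actual API):

```python
from typing import Callable, List

def label_article(paragraphs: List[str],
                  detect_ai: Callable[[str], bool]) -> str:
    """Collapse per-paragraph AI flags into one article-level label."""
    flags = [detect_ai(p) for p in paragraphs]
    if flags and all(flags):
        return "fully_ai"      # 5.2% of recent news in our data
    if any(flags):
        return "partially_ai"  # another 3.9%
    return "human"
```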
AI is already at work in American newsrooms.
We examine 186k articles published this summer and find that ~9% are either fully or partially AI-generated, usually without readers having any idea.
Here's what we learned about how AI is influencing local and national journalism:
🤔 What if you gave an LLM thousands of random human-written paragraphs and told it to write something new, while copying 90% of its output from those texts?
🧟 You get what we call a Frankentext!
💡 Frankentexts are surprisingly coherent and tough for AI detectors to flag.
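One illustrative way to check the 90%-copied constraint is verbatim n-gram overlap between the output and the source paragraphs (my sketch, not the paper's exact metric):

```python
def copied_ngram_fraction(output: str, sources: list[str], n: int = 5) -> float:
    """Fraction of the output's word n-grams that appear verbatim in sources."""
    def ngrams(text: str) -> set:
        toks = text.lower().split()
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    out = ngrams(output)
    src = set().union(*(ngrams(s) for s in sources)) if sources else set()
    return len(out & src) / len(out) if out else 0.0
```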
International students will stop coming to American universities if their visas are at risk. That will make our intellectual community poorer and tuition more expensive for domestic students.
There is a quasi-religion in Silicon Valley that views AI as godlike. This faith has always paralleled Evangelical Christianity, with its own salvation (transhumanism), rapture (the technological singularity), and demons (Roko's Basilisk).
Lately the AI faith has fully fused with Christian Nationalism.
Introducing 🐻 BEARCUBS 🐻, a "small but mighty" dataset of 111 QA pairs designed to assess computer-using web agents in multimodal interactions on the live web!
✅ Humans achieve 85% accuracy
❌ OpenAI Operator: 24%
❌ Anthropic Computer Use: 14%
❌ Convergence AI Proxy: 13%
Is the needle-in-a-haystack test still meaningful given the giant green heatmaps in modern LLM papers?
We create ONERULER 📏, a multilingual long-context benchmark that allows for nonexistent needles. Turns out NIAH isn't so easy after all!
Our analysis across 26 languages 🧵👇
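To make "nonexistent needles" concrete, here's an illustrative sketch of building such a test item (formatting and wording are mine, not ONERULER's exact setup):

```python
import random

def make_niah_item(haystack: str, magic_number: str, present: bool) -> dict:
    """Build a needle-in-a-haystack item whose correct answer may be 'none'."""
    sentences = haystack.split(". ")
    if present:
        i = random.randrange(len(sentences) + 1)
        sentences.insert(i, f"The magic number is {magic_number}")
    return {
        "context": ". ".join(sentences),
        "question": "What is the magic number? Answer 'none' if it never appears.",
        "answer": magic_number if present else "none",
    }
```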