Jana Jung (@janajung) Bsky

Thanks a lot for raising this! In our case, we specifically selected the downstream tasks based on empirical findings in humans that link the questionnaires and their constructs to the corresponding behaviors.

1 month ago

Do Psychometric Tests Work for Large Language Models? Evaluation of Tests on Sexism, Racism, and Morality

Psychometric tests are increasingly used to assess psychological constructs in large language models (LLMs). However, it remains unclear whether these tests -- originally developed for humans -- yield...

📄 Paper: arxiv.org/abs/2510.11254

A very big thank you to my amazing collaborators @marlutz.bsky.social, @indiiigo.bsky.social, and @mstrohm.bsky.social!

3/3

1 month ago

For all 3 constructs we looked at (sexism, racism, and morality), the correlations between test scores and behavior in a related downstream task are only weakly positive, or even negative.

📢 Our results call for LLM-specific evaluations instead of applying tests originally developed for humans.

2/3

1 month ago

Are you using survey-style questionnaires designed for humans to measure characteristics of LLMs?

In our #EACL2026 paper, we evaluate both the reliability and validity of such tests and find that their scores do not reflect real-world model behavior. In fact, they can be deceptive!

🧵1/3

1 month ago

🚨New paper alert🚨

🤔 Ever wondered how the way you write a persona prompt affects how well an LLM simulates people?

In our #EMNLP2025 paper, we find that using interview-style persona prompts makes LLM social simulations less biased and more aligned with human opinions.
🧵1/7

5 months ago

Thrilled to talk about how seemingly small decisions in silicon sampling can have a large impact on simulated survey responses 👀 Join us on Oct 29th! 👈

6 months ago

👋 #ACL2025NLP 🇦🇹 @marlutz.bsky.social and I are presenting our poster on demographic representativeness of LLMs today!

🕦 10:30-12:00
📍 Hall X5 (board 1 or 14 according to different sources 🧐)

Here’s the paper on the ACL Anthology: aclanthology.org/2025.finding...

Drop by!

8 months ago

Chair for Data Science in the Economic and Social Sciences at University of Mannheim having lots of fun at #ic2s2 @janajung.bsky.social @wanlo.bsky.social @indiiigo.bsky.social @jrupprec.bsky.social @maximiliankreutner.bsky.social and Stefano Balietti

8 months ago

Poster Session 1 is live in the Atrium! Explore the work and cast your daily vote—just scan the QR code to submit your favorite poster ID. Everyone has one vote per day. #ic2s2

9 months ago

LLMs can generate synthetic survey responses, e.g. for imputation, but how reliable are they? 📋

At #IC2S2, I'll be sharing our research on the robustness of AI-generated responses to perturbations and whether they mirror human survey biases. 🤖
Come by my poster on Tuesday between 1:30 and 3:30 p.m.

9 months ago

Really excited to also present this work at #IC2S2 next week in Norrköping! 🎉 I'd love to discuss how to produce LLM survey responses at my poster on Wed at 13:30 (Poster Session 2, Poster ID 68) 📊

9 months ago

Very excited to head to #IC2S2 next week! 🎉

In our project, we tested whether a psychological assessment can measure sexism in LLMs, and found that applying such tools to LLMs is not as straightforward as it seems.

Find me and my poster at Poster Session 1 (Tue 12:30-14:30) — hope to see you there

9 months ago
