
Posts by Ziling Cheng


How can we use models of cognition to help LLMs interpret figurative language (irony, hyperbole) in a more human-like manner? Come to our #ACL2025NLP poster on Wednesday at 11AM (exhibit hall - exact location TBA) to find out! @mcgill-nlp.bsky.social @mila-quebec.bsky.social @aclmeeting.bsky.social

8 months ago 3 2 0 0

What do systematic hallucinations in LLMs tell us about their generalization abilities?

Come to our poster at #ACL2025 on July 29th at 4 PM in Level 0, Halls X4/X5. Would love to chat about interpretability, hallucinations, and reasoning :)

@mcgill-nlp.bsky.social @mila-quebec.bsky.social

8 months ago 2 2 0 0

A blizzard is raging through Montreal when your friend says "Looks like Florida out there!" Humans easily interpret irony, while LLMs struggle with it. We propose a rhetorical-strategy-aware probabilistic framework as a solution.
Paper: arxiv.org/abs/2506.09301 to appear @ #ACL2025 (Main)
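For intuition only: below is my own toy Bayesian sketch of the general idea, not the framework from the paper. It treats the rhetorical strategy (literal vs. ironic) as a latent variable and marginalizes over it when inferring what the speaker means; every probability in it is invented.

```python
# Toy illustration (not the paper's model): a listener infers the state of the
# world behind "Looks like Florida out there!" by marginalizing over latent
# rhetorical strategies. All priors and likelihoods are made-up numbers.

states = ["warm and sunny", "blizzard"]

# Prior over states given the observed context (a Montreal blizzard).
prior = {"warm and sunny": 0.05, "blizzard": 0.95}

# Latent rhetorical strategies and their prior probabilities.
strategies = {"literal": 0.5, "ironic": 0.5}

# P(utterance | state, strategy): a literal speaker says it when it is warm;
# an ironic speaker says it when the opposite of its literal content holds.
likelihood = {
    ("warm and sunny", "literal"): 0.9,
    ("blizzard", "literal"): 0.05,
    ("warm and sunny", "ironic"): 0.05,
    ("blizzard", "ironic"): 0.9,
}

# Posterior over states: P(s | u) is proportional to P(s) * sum_r P(r) * P(u | s, r)
unnorm = {
    s: prior[s] * sum(p_r * likelihood[(s, r)] for r, p_r in strategies.items())
    for s in states
}
z = sum(unnorm.values())
posterior = {s: v / z for s, v in unnorm.items()}
print(posterior)  # nearly all mass on "blizzard": the remark is read as ironic
```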

9 months ago 15 7 1 4

🙏 Huge thanks to my collaborators @mengcao.bsky.social, Marc-Antoine Rondeau, and my advisor Jackie Cheung for their invaluable guidance and support throughout this work, and to friends at @mila-quebec.bsky.social and @mcgill-nlp.bsky.social 💙 7/n

10 months ago 3 0 0 0

🧠 TL;DR: These irrelevant-context hallucinations show that LLMs go beyond mere parroting 🦜; they do generalize, based on contextual cues and abstract classes. But not reliably. They're more like chameleons 🦎, blending with the context even when they shouldn't. 6/n

10 months ago 8 0 1 1

🔍 What's going on inside?
With mechanistic interpretability, we found:
- LLMs first compute abstract classes (like "language") before narrowing to specific answers
- Competing circuits inside the model: one based on context, one based on query. Whichever is stronger wins (toy probe sketch below). 5/n
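Rough, hypothetical illustration of how one might watch that competition (not the paper's actual setup; GPT-2, the prompt, and the candidate answers are stand-ins I chose): a logit-lens-style readout that projects each layer's last hidden state through the unembedding and tracks the context-based candidate against the query-based one.

```python
# Hypothetical probe (my own sketch, not the paper's method): a logit-lens-style
# readout that tracks how a query-based answer ("Portuguese") and a context-based
# answer ("Japanese") compete across layers when irrelevant context is prepended.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# Irrelevant context about Honda, then a query whose answer class is "language".
prompt = "Honda is a company founded in Japan. The official language of Brazil is"
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# First sub-token id of each candidate (note the leading space for GPT-2's BPE).
candidates = [" Portuguese", " Japanese"]
cand_ids = {c: tok.encode(c)[0] for c in candidates}

# Project every layer's last-position hidden state through the final layer norm
# and the tied unembedding matrix, then compare the candidates' probabilities.
for layer, h in enumerate(out.hidden_states):
    h_last = model.transformer.ln_f(h[:, -1, :])
    logits = h_last @ model.transformer.wte.weight.T
    probs = torch.softmax(logits, dim=-1)[0]
    print(layer, {c.strip(): round(probs[i].item(), 4) for c, i in cand_ids.items()})
```

If the class-then-instance picture holds, both language names should gain probability in the middle layers before one of them wins out near the top.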

10 months ago 3 0 1 0

Sometimes this yields the right answer for the wrong reasoning ("Portuguese" from "Brazil"); other times, it produces confident errors ("Japanese" from "Honda"). 4/n

10 months ago 1 0 1 0

Turns out, we can. They follow a systematic failure mode we call class-based (mis)generalization: the model abstracts the class from the query (e.g., languages) and generalizes based on features from the irrelevant context (e.g., Honda → Japan). 3/n
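To make "predict" concrete, here is a cartoon of the failure mode (my own toy illustration with hand-picked lookup tables, not anything from the paper): keep the query's answer class (a language), but fill it with the member tied to the irrelevant context entity.

```python
# Toy illustration (not from the paper): predict the class-based hallucination by
# keeping the query's answer class (language) while instantiating it from the
# irrelevant context entity's associated country. Lookup tables are hand-picked.
context_entity_to_country = {"Honda": "Japan", "IKEA": "Sweden"}
country_to_language = {"Japan": "Japanese", "Sweden": "Swedish", "Brazil": "Portuguese"}

def predicted_shift(context_entity: str, query_country: str = "Brazil") -> tuple[str, str]:
    """Return (correct answer, hallucination predicted under the irrelevant context)."""
    correct = country_to_language[query_country]
    hallucinated = country_to_language[context_entity_to_country[context_entity]]
    return correct, hallucinated

print(predicted_shift("Honda"))  # ('Portuguese', 'Japanese')
print(predicted_shift("IKEA"))   # ('Portuguese', 'Swedish')
```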

10 months ago 6 0 1 0

These examples show that answers, even to the same query, can shift under different irrelevant contexts. Can we predict these shifts? 2/n

10 months ago 9 0 1 0

Do LLMs hallucinate randomly? Not quite.

Our #ACL2025 (Main) paper shows that hallucinations under irrelevant contexts follow a systematic failure mode, revealing how LLMs generalize using abstract classes + context cues, albeit unreliably.

📎 Paper: arxiv.org/abs/2505.22630 1/n

10 months ago 46 18 1 3