A blizzard is raging through Montreal when your friend says "Looks like Florida out there!" Humans easily interpret irony, while LLMs struggle with it. We propose a rhetorical-strategy-aware probabilistic framework as a solution.
Paper: arxiv.org/abs/2506.09301 to appear @ #ACL2025 (Main)
Posts by McGill NLP
"Build the web for agents, not agents for the web"
This position paper argues that rather than forcing web agents to adapt to UIs designed for humans, we should develop a new interface optimized for web agents, which we call Agentic Web Interface (AWI).
arxiv.org/abs/2506.10953
Excited to share the results of my recent internship!
We ask:
What subtle shortcuts are VideoLLMs taking on spatio-temporal questions?
And how can we instead curate shortcut-robust examples at large scale?
We release: MVPBench
Details:
Exciting work on hallucinations from @ziling-cheng.bsky.social
Incredibly proud of my students @adadtur.bsky.social and Gaurav Kamath for winning a SAC award at #NAACL2025 for their work on assessing how LLMs model constituent shifts.
Congratulations to Mila members @adadtur.bsky.social, Gaurav Kamath and @sivareddyg.bsky.social for their SAC award at NAACL! Check out Ada's talk in Session I: Oral/Poster 6. Paper: arxiv.org/abs/2502.05670
Presenting ✨ CHASE: Generating challenging synthetic data for evaluation ✨
Work w/ fantastic advisors Dima Bahdanau and @sivareddyg.bsky.social
Thread 🧵:
Overview figure for paper, showing creation of constituent movement data, in addition to three step experimentation: "Model Shifting Preference", "Motivating Factors of Model Preference", "Human-Model Preference Correlation"
Super excited to finally announce our NAACL 2025 main conference paper "Language Models Largely Exhibit Human-like Constituent Ordering Preferences"!
We examine constituent ordering preferences between humans and LLMs; we present two main findings… 🧵
At McGill we have an NLP lab that works on a lot of things, from human-AI collaboration, to evaluation, to low resource NLP (me).
@emnlpmeeting.bsky.social just happened in Miami, and my colleagues just presented six papers there:
Thank you for trying again! I haven't found a solution to the search issue yet and might contact support soon. Will let you know once we're indexed!
The Causal Influence of Grammatical Gender on Distributional Semantics
Featuring @karstanczak.bsky.social
arxiv.org/abs/2311.18567
Benchmarking Vision Language Models for Cultural Understanding
Featuring @karstanczak.bsky.social
arxiv.org/abs/2407.10920
Does This Summary Answer My Question? Modeling Query-Focused Summary Readers with Rational Speech Acts
By @cesare-spinoso.bsky.social
arxiv.org/abs/2411.06524
It turns out we had even more papers at EMNLP!
Let's complete the list with three more 🧵
From Local Concepts to Universals: Evaluating the Multicultural Understanding of Vision-Language Models
By @meharbhatia.bsky.social
aclanthology.org/2024.emnlp-m...
Social Bias Probing: Fairness Benchmarking for Language Models
By @karstanczak.bsky.social
aclanthology.org/2024.emnlp-m...
From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP
By @mariusmosbach.bsky.social
aclanthology.org/2024.emnlp-m...
Our lab members recently presented 3 papers at @emnlpmeeting.bsky.social in Miami
From interpretability to bias/fairness and cultural understanding -> 🧵
Hello, could you add us? Great initiative!