MaiNLP lab, LMU Munich (@mainlp) Bsky

MaiNLP talks/posters/events at EACL

MaiNLP is happy to be part of @eaclmeeting.bsky.social with several papers, talks, a panel, and a workshop ☀️ Looking forward to seeing you in Rabat! #EACL2026

4 weeks ago 7 2 0 1

We are honoured to welcome Prof Barbara Plank (@barbaraplank.bsky.social) from @mainlp.bsky.social @cislmu.bsky.social as our keynote speaker.

LoResLM @eaclmeeting.bsky.social

5 months ago 7 2 0 0

Principled Personas: Defining and Measuring the Intended Effects of Persona Prompting on Task Performance Expert persona prompting -- assigning roles such as expert in math to language models -- is widely used for task improvement. However, prior work shows mixed results on its effectiveness, and does not...

📢 New paper accepted at @eaclmeeting.bsky.social 2026:

Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions

with
@mhedderich.bsky.social
@amodarressi.bsky.social
Hinrich Schuetze
& Benjamin Roth.

Preprint: arxiv.org/abs/2512.12775

2 months ago 3 2 1 1

In our new paper, "A Comprehensive Evaluation of Multilingual Chain-of-Thought Reasoning: Performance, Consistency, and Faithfulness Across Languages", we go beyond final-answer accuracy to analyze multilingual reasoning along three dimensions: performance, consistency, and faithfulness.

4 weeks ago 5 1 1 1

✨New paper✨

We find script (e.g. Cyrillic, Latin) to be a linear direction in the activation space of Whisper, enabling transliteration at test-time by adding such script directions to the activations — producing e.g. Cyrillic Japanese transcriptions.

3 months ago 10 5 1 0

VarDial @ EACL 2026, with important dates (see next post for text version). Photo CC-0.

VarDial 2026 will be colocated with @eaclmeeting.bsky.social! We're looking forward to your papers on NLP for similar languages, varieties and dialects :)

Deadline: Dec 19 (Jan 2 for pre-reviewed ARR papers)
sites.google.com/view/vardial...

6 months ago 14 10 1 1

Group photo at NeurIPS 2025 San Diego

4 months ago 9 1 0 0

Congrats to Pingjun, @beiduo.bsky.social, Siyao, Marie, and @barbaraplank.bsky.social for receiving the SAC Highlights award!

5 months ago 5 1 0 0

Congrats to our team member Diego Frassinelli on the SAC Highlights award!

5 months ago 1 0 0 0

Awesome! We're also creating one currently and have included yours as a starter :)

8 months ago 2 0 1 0

Paper title ("A multi-dialectal dataset for German dialect ASR and dialect-to-standard speech translation") and a map of the German state Bavaria showing where the Franconian, Bavarian, and Alemannic dialect groups are spoken

At #Interspeech2025 I'm going to present Betthupferl, a dataset for German dialect ASR & dialect-to-standard speech translation! We analyze differences between dialectal & Standard German transcriptions, benchmark ASR models, and examine shortcomings of current ASR models & evaluation metrics.

8 months ago 16 4 1 1

UPDATE: Our poster presentation got moved to Tuesday, 16:00–17:30 (session 10)! #ACL2025NLP

8 months ago 3 1 0 0

Unsure which presentations to attend at #ACL2025? 🛎️🗣️

8 months ago 4 2 0 0

👥 @boleima.bsky.social, Yuting Li, Wei Zhou, Ziwei Gong, @janetlauyeung.bsky.social, Katja Jasinskaja, @annefriedrich.bsky.social, Julia Hirschberg, Frauke Kreuter, @barbaraplank.bsky.social

8 months ago 0 0 0 0

👥 @boleima.bsky.social, Berk Yoztyurk, @carohaensch.bsky.social, @xinpeng.bsky.social, Markus Herklotz, Frauke Kreuter, @barbaraplank.bsky.social, @assenmacher.bsky.social

8 months ago 1 0 0 0

📝 Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Experimental Setups Matter
🔎 263 languages, 10 similarity measures, 3 NLP tasks
👥 @verenablaschke.bsky.social Masha Fedzechkina @maartjeterhoeve.bsky.social
🔗 arxiv.org/abs/2501.14491
📁 Findings – long

8 months ago 0 0 0 0

📝Do LLMs Give Psychometrically Plausible Responses in Educational Assessments?
🔎Analyzing how human-like LLMs are when taking reading, history, and economics tests
👥 @saeub.bsky.social, Diego Frassinelli, @barbaraplank.bsky.social
🔗 arxiv.org/abs/2506.09796
📁BEA workshop - Long

8 months ago 2 1 1 0

📝 GerMedIQ: A Resource for Simulated and Synthesized Anamnesis Interview Responses in German
🔎 We release a novel German anamnesis question-response dataset with human-simulated and LLM-augmented responses.
👥 @JHofenbitzer et al.
🔗 github.com/Jhofenbitzer...
📁SRW - Long

8 months ago 0 0 1 0

📝Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set
🔎Do LLMs encode and generalize discourse knowledge across languages?
👥 @florian-eichin.com @janetlauyeung.bsky.social @mhedderich.bsky.social @barbaraplank.bsky.social
🔗 arxiv.org/abs/2503.10515
📁Main - Long

8 months ago 3 1 1 1

📝LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks
🔎We present a large-scale study of whether LLM judgments can be reliably used as proxies for human judgments
👥Anna Bavaresco et al.
🔗 arxiv.org/abs/2406.18403
📁Main - Short

8 months ago 0 0 1 0

📝 What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
👥 @mhedderich.bsky.social Anyi Wang @raoyuan.bsky.social @florian-eichin.com Jonas Fischer @barbaraplank.bsky.social  
🔗 arxiv.org/abs/2504.158... 
📁Main - Long

8 months ago 2 1 1 0

📝A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI
👥 @beiduo.bsky.social Siyao Peng @annakorhonen.bsky.social @barbaraplank.bsky.social
🔗 arxiv.org/abs/2412.13942
📁ACL25 Findings-Long

8 months ago 0 0 1 0

📝Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
🔎We study the relationship between circuits for highly compositional and functionally related tasks
👥@pmondorf.bsky.social Sondre Wold @barbaraplank.bsky.social
🔗 arxiv.org/abs/2410.01434
📁Main-Long

8 months ago 0 0 1 0

📝Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges
🔎We review existing datasets for evaluating LLMs’ pragmatic capabilities, outlining key challenges and promising future directions
🔗 arxiv.org/abs/2502.12378
📁Main - Long

8 months ago 0 0 2 0

📝Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study
🔎This study evaluates LLMs in generating German public opinions using open-ended survey data
🔗 arxiv.org/abs/2412.13169
📁Main - Long

8 months ago 0 0 2 0

Headed to ACL? MaiNLP & our most recent work will be there too👥📄
Come see what we’ve been working on!

8 months ago 14 5 1 2

Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models A fundamental question in interpretability research is to what extent neural networks, particularly language models, implement reusable functions through subnetworks that can be composed to perform mo...

📄 [ACL 2025 main] Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models (doi.org/10.48550/arX...)

9 months ago 5 2 1 0

LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks There is an increasing trend towards evaluating NLP models with LLMs instead of human judgments, raising questions about the validity of these evaluations, as well as their reproducibility in the case...

📄 [ACL 2025 main] LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks (doi.org/10.48550/arX...)

9 months ago 10 4 1 0

Correlations between transfer results per experiment (parsing, POS tagging, topic classification with different input representations) and similarity measures. The results vary a lot across experiments and measures – some are described in the next posts.

At #ACL2025NLP I'll present our analysis of the effect of linguistic similarity on cross-lingual transfer! We looked at how 10 similarity measures correlate w/ transfer results btwn 263 languages across 3 NLP tasks. Different similarity measures matter for diff. experiments (no one-size-fits-all)!

9 months ago 21 1 1 1

🤔 Can LLMs read between the lines?

Another of our #ACL2025 papers surveys resources on how LLMs handle pragmatic phenomena such as implicatures and deixis. We map out a new landscape for both LLMs and linguistics in pragmatics research.

📄 arxiv.org/abs/2502.12378
🧠💬 #LLMs #Pragmatics

9 months ago 16 4 1 1
