
Posts by CIS, LMU Munich


Hi all!
We are curating SLAyiNG, a dataset of queer slang. To ensure the quality of the final data, we are asking the community for help with annotation.
Sign up at: docs.google.com/forms/d/e/1F...
If you have further inquiries, feel free to contact either me or @leahirlimann.bsky.social directly 🌈

3 weeks ago 2 2 1 0
The Second Workshop on Language Models for Low-Resource Languages

Talk @ LoResLM: Beyond the standard: NLP for low-resource language varieties
by @barbaraplank.bsky.social
Sun 29 Mar - S. Le Lamrissa 14:00-15:00
loreslm.github.io/program

4 weeks ago 0 0 0 0

VarDial: Workshop on NLP for similar languages, varieties and dialects
Yves Scherrer, Noëmi Aepli, @verenablaschke.bsky.social, Tommi Jauhiainen, Nikola Ljubešić, Preslav Nakov, Jörg Tiedemann, Marcos Zampieri
Sun 29 Mar - S. Le Chellah 9:00-12:30
bsky.app/profile/vere...

4 weeks ago 1 0 1 0
TeachingNLP @ EACL 2026 The Seventh Workshop on Teaching NLP will be co-located with the 2026 Conference of the European Chapter of the Association for Computational Linguistics in Rabat, Morocco. The one-day workshop will c...

Panel discussion on teaching NLP @ TeachingNLP
@ivanhabernal.bsky.social, Mausam, Hinrich Schütze, @barbaraplank.bsky.social

Sun 29 Mar - S. La Palmeraie @ 9:30-10:30
sites.google.com/view/teachin...

4 weeks ago 0 1 1 0
AfricaNLP 2026 About the Workshop

Talk @ AfricaNLP: The emergence of multilingual representations: Tracing linguistic capabilities during language model pretraining
by @barbaraplank.bsky.social
Sat 28 Mar - S. Le Lixus @ 11:20-12:00
sites.google.com/view/african...

4 weeks ago 0 0 1 0
Controlling Reading Ease with Gaze-Guided Text Generation The way our eyes move while reading can tell us about the cognitive effort required to process the text. In the present study, we use this fact to generate texts with controllable reading ease. Our me...

Controlling reading ease with gaze-guided text generation
[Poster]: Fri 27 Mar - Poster Hall @ 9:00-10:30
@saeub.bsky.social , Darja Jepifanova,
@Diego Frassinelli, @barbaraplank.bsky.social
arxiv.org/abs/2601.17781

4 weeks ago 0 0 1 0

If probable, then acceptable? Understanding conditional acceptability judgments in large language models
[Oral]: Thu 26 Mar - S. Le Lixus @ 09:00-10:30
Jasmin Orth, @pmondorf.bsky.social , @barbaraplank.bsky.social
aclanthology.org/2026.eacl-lo...

4 weeks ago 0 0 1 0

A Comprehensive Evaluation of Multilingual Chain-of-Thought Reasoning: Performance, Consistency, and Faithfulness Across Languages
[Poster]: Wed 25 Mar - Poster Hall @ 16:30-18:00
@raoyuan.bsky.social, Yihong Liu, Hinrich Schütze, @mhedderich.bsky.social
bsky.app/profile/raoy...

4 weeks ago 0 0 1 0

When meanings meet: Investigating the emergence and quality of shared concept spaces during multilingual language model training
[Oral]: Wed. 25 Mar - S. La Palmeraie @ 16:30-18:00
@fkoerner.bsky.social , @mxij.me , Anna Korhonen , @barbaraplank.bsky.social
aclanthology.org/2026.eacl-lo...

4 weeks ago 1 0 1 0

Too open for opinion? Embracing open-endedness in large language models for social simulation
[Oral]: Wed. 25 Mar - S. Walil @ 14:30-16:00
@boleima.bsky.social, @yongcao.bsky.social, Indira Sen, Anna-Carolina Haensch, Frauke Kreuter, @barbaraplank.bsky.social, @danielhers.bsky.social

4 weeks ago 0 0 1 0

Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions
[Oral]: Wed. 25 Mar - S. Le Riad @ 11:30-13:00
@pedrohluzaraujo.bsky.social, @mhedderich.bsky.social, @amodarressi.bsky.social, Hinrich Schütze, Benjamin Roth
bsky.app/profile/pedr...

4 weeks ago 1 0 1 1

Going to Rabat for #EACL2026? So are we! 🇲🇦
We are bringing a packed schedule of papers, talks, and workshops.
Check out our lineup below and come say hi! 👋 🧵
#NLProc @eaclmeeting.bsky.social

4 weeks ago 4 2 1 0

📢 Life update 📢

After a wonderful time at @ai2.bsky.social, I've joined @cislmu.bsky.social at @lmu.de as a tenure-track assistant professor in NLP. Thrilled to be back in Europe and to start a lab in Munich's flourishing AI ecosystem! 🎉

1 month ago 29 1 2 0
Paper title ("A multi-dialectal dataset for German dialect ASR and dialect-to-standard speech translation") and a map of the German state Bavaria showing where the Franconian, Bavarian, and Alemannic dialect groups are spoken


At #Interspeech2025 I'm going to present Betthupferl, a dataset for German dialect ASR & dialect-to-standard speech translation! We analyze differences between dialectal & Standard German transcriptions, benchmark ASR models, and examine shortcomings of current ASR models & evaluation metrics.

8 months ago 16 4 1 1

I’ll be at @icmlconf.bsky.social next week presenting NoLiMa!
Poster on Tue July 15, 4:30–7pm (E-2312).

Happy to grab a coffee and chat about long-context, memory, research, or just to catch up.

I’ll be in Toronto for a couple of days after the conference, let me know if you’re around!

9 months ago 4 2 1 0

New paper: How does pretraining on programming languages + English shape LLMs' concept space?
🔍 Do LLMs use English or a programming language as a kind of pivot language?
🧠 Are neurons language-specific or shared across programming languages and English?
🔗 arxiv.org/abs/2506.01074

10 months ago 6 1 1 0

📄 Collapse of Dense Retrievers

Accepted to #ACL2025 main conference 🎉🎉

In this paper we uncover major vulnerabilities in dense retrievers like Contriever, showing they favor:
📌 Shorter docs
📌 Early positions
📌 Repeated entities
📌 Literal matches
...all while ignoring the answer's presence!

11 months ago 9 2 1 1

🗨️ Beyond “noisy” text: How (and why) to process dialect data
🔎 Keynote talk at WNUT @ NAACL
👥 @verenablaschke.bsky.social
📍 Workshop on noisy and user-generated text (May 3)
The full workshop programme is here: noisy-text.github.io/2025/
bsky.app/profile/vere...

11 months ago 2 1 0 0

📝 Privacy-Preserving Federated Learning for Hate Speech Detection
🔎 We present a federated learning system with differential privacy and fine-tuned ALBERT models for low-resource hate speech detection.
👥 Ivo Júnior, @htyeh1, Axel Wisiorek, @HinrichSchuetze
📍 SRW - Long

11 months ago 1 0 1 0

📝 Linguistic Features in German BERT: The Role of Morphology, Syntax, and Semantics in Multi-Class Text Classification
🔎 Analysis of linguistic features used by German BERT in a classification task.
👥 Henrike Beyer (University of Dundee), Diego Frassinelli
📍 SRW - Short

11 months ago 0 0 1 0
XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples Recent studies indicate that leveraging off-the-shelf or fine-tuned retrievers, capable of retrieving relevant in-context examples tailored to the input query, enhances few-shot in-context learning of...

📝 XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples
🔎 A simple yet effective method to retrieve cross-lingual few-shot examples for multilingual in-context learning
👥 @lpq29743, @andre_t_martins, @HinrichSchuetze
🔗 arxiv.org/abs/2405.05116
📍 Findings - Short

11 months ago 0 0 1 0
Dialetto, ma Quanto Dialetto? Transcribing and Evaluating Dialects on a Continuum There is increasing interest in looking at dialects in NLP. However, most work to date still treats dialects as discrete categories. For instance, evaluative work in variation-oriented NLP for English...

📝 Dialetto, ma Quanto Dialetto? Transcribing and Evaluating Dialects on a Continuum
🔎 We predict speech-to-text model performance on dialect continua with geostatistics.
👥 Ryan Soh-Eun Shim, Barbara Plank
🔗 arxiv.org/abs/2410.14589
📍 Findings - Long

11 months ago 0 0 1 0
A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models Recent studies have highlighted the potential of exploiting parallel corpora to enhance multilingual large language models, improving performance in both bilingual tasks, e.g., machine translation, an...

📝 A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
🔎 An investigation of the impact of parallel corpora, ... on the performance of multilingual LLMs.
👥 @lpq29743, @andre_t_martins, @HinrichSchuetze
🔗 arxiv.org/abs/2407.00436
📍 Findings - Long

11 months ago 1 0 1 0

🥳 We are happy to share that CIS will be presenting 6 papers and talks at #NAACL2025!
Find out about each of them below in the 🧵

11 months ago 10 0 1 1

On my way to #NAACL2025 where I'll give a keynote at the noisy text workshop (WNUT), presenting some of the challenges & methods for dialect NLP + also discussing dialect speakers' perspectives!

🗨️ Beyond “noisy” text: How (and why) to process dialect data
🗓️ Saturday, May 3, 9:30–10:30

11 months ago 27 7 1 1