I’m very happy to share that my latest paper on the 𝐚𝐜𝐪𝐮𝐢𝐬𝐢𝐭𝐢𝐨𝐧 𝐨𝐟 𝐯𝐞𝐫𝐛 𝐦𝐞𝐚𝐧𝐢𝐧𝐠, tested on models trained on child-directed (CDL) vs adult-directed (ADL) data, has been accepted for an 𝐨𝐫𝐚𝐥 𝐩𝐫𝐞𝐬𝐞𝐧𝐭𝐚𝐭𝐢𝐨𝐧 at the upcoming edition of 𝐂𝐨𝐠𝐒𝐜𝐢, which will take place at the end of July in Rio de Janeiro. 💃
Posts by Jaap Jumelet
Our paper has been accepted to #EACL2026 main conference!
Together with @jumelet.bsky.social and @arianna-bis.bsky.social, we study the effect of target language typology on the difficulty of state-of-the-art neural machine translation.
arXiv preprint: arxiv.org/abs/2602.03551
1/6 Our findings ⬇️
👀 Look what 🎅 has brought just before Christmas 🎁: a brand new Research Master in Natural Language Processing at @facultyofartsug.bsky.social @rug.nl
Program: www.rug.nl/masters/natu...
Applications (2026/2027) are open! Come and study with us (you will also learn why we have a 🐮 in our logo)
🧑🔬I’m recruiting PhD students in Natural Language Processing at @unileipzig.bsky.social Computer Science, together with @scadsai.bsky.social!
Topics include, but aren’t limited to:
🔎Linguistic Interpretability
🌍Multilingual Evaluation
📖Computational Typology
Please share!
#NLProc #NLP
📢Out now in NEJLT!📢
In each of these sentences, a verb that doesn't usually encode motion is being used to convey that an object is moving to a destination.
Given that these usages are rare, complex, and creative, we ask:
Do LLMs understand what's going on in them?
🧵1/7
Screenshot of a figure with two panels, labeled (a) and (b). The caption reads: "Figure 1: (a) Illustration of messages (left) and strings (right) in toy domain. Blue = grammatical strings. Red = ungrammatical strings. (b) Surprisal (negative log probability) assigned to toy strings by GPT-2."
New work to appear @ TACL!
Language models (LMs) are remarkably good at generating novel well-formed sentences, leading to claims that they have mastered grammar.
Yet they often assign higher probability to ungrammatical strings than to grammatical strings.
How can both things be true? 🧵👇
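The apparent tension can be made concrete with a toy surprisal calculation. This is a minimal sketch with hypothetical probabilities (not numbers from the paper): a rare but grammatical string can receive a lower probability (higher surprisal) than an ungrammatical string, while a minimal-pair evaluation, which compares matched sentences differing only in the grammatical contrast, can still come out in the model's favor.

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal in bits: the negative log2 of a string's probability."""
    return -math.log2(prob)

# Hypothetical string probabilities, for illustration only.
p_grammatical = 2 ** -40    # a rare but well-formed sentence
p_ungrammatical = 2 ** -30  # an ill-formed string of frequent tokens

# The ungrammatical string gets the higher probability (lower surprisal)...
print(surprisal(p_grammatical))    # 40.0 bits
print(surprisal(p_ungrammatical))  # 30.0 bits

# ...yet a minimal-pair comparison holds everything constant except the
# grammatical contrast, so the model can still prefer the good variant:
pair = {"good": 2 ** -25, "bad": 2 ** -27}
prefers_grammatical = pair["good"] > pair["bad"]  # True
```

The point of the sketch: raw probability comparisons across unrelated strings and controlled minimal-pair comparisons measure different things, so both observations can hold at once.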
I'm in Suzhou to present our work on MultiBLiMP, Friday @ 11:45 in the Multilinguality session (A301)!
Come check it out if you're interested in multilingual linguistic evaluation of LLMs (there will be parse trees on the slides! There's still use for syntactic structure!)
arxiv.org/abs/2504.02768
Accepted papers at the main conference and Findings
Accepted papers at TACL and workshops
With only a week left until #EMNLP2025, we are happy to announce all the works we 🐮 will present 🥳 - come and say "hi" to our posters and presentations during the Main conference and the co-located events (*SEM and workshops). See you in Suzhou ✈️
For more information check out the website, paper, and datasets:
Website: babylm.github.io/babybabellm/
Paper: arxiv.org/pdf/2510.10159
We hope BabyBabelLM will continue as a 'living resource', fostering more efficient NLP methods and opening new avenues for cross-lingual computational linguistics!
Next to our training resources, we also release an evaluation pipeline that assesses different aspects of language learning.
We present results for various simple baseline models, but hope this can serve as a starting point for a multilingual BabyLM challenge in future years!
To deal with data imbalances, we divide languages into three Tiers. This better enables cross-lingual studies and makes it possible for low-resource languages to be a part of BabyBabelLM as well.
With a fantastic team of international collaborators we have developed a pipeline for creating LM training data from resources that children are exposed to.
We release this pipeline and welcome new contributions!
🌍Introducing BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data!
LLMs learn from vastly more data than humans ever experience. BabyLM challenges this paradigm by focusing on developmentally plausible data
We extend this effort to 45 new languages!
As kids (in Breda) we often played "1 keer tets", where you could let a football bounce at most once; I had no idea that that was a Brabant dialect word.
Happening now at the SIGTYP poster session! Come talk to Leonie and me about MultiBLiMP!
I'll be in Vienna only from tomorrow, but today my star PhD student Marianne is already presenting some of our work:
BLIMP-NL, in which we create a large new dataset for syntactic evaluation of Dutch LLMs, and learn a lot about dataset creation, LLM evaluation and grammatical abilities on the way.
Congrats and good luck in Canada!
Proud to introduce TurBLiMP, the first benchmark of minimal pairs for Turkish, a free-word-order, morphologically rich language!
Pre-print: arxiv.org/abs/2506.13487
The fruit of an almost year-long project by amazing MS student @ezgibasar.bsky.social, in collab w/ @frap98.bsky.social and @jumelet.bsky.social
I don't understand why there isn't more of an outcry about this:
Recruiting American scientists is being paid for by denying Dutch academics an inflation correction on their salaries.
1/2
Ohh cool! Nice to see the interactions-as-structure idea I had back in 2021 is still being explored!
My paper with @tylerachang.bsky.social and @jamichaelov.bsky.social will appear at #ACL2025NLP! The updated preprint is available on arxiv. I look forward to chatting about bilingual models in Vienna!
“Child-Directed Language Does Not Consistently Boost Syntax Learning in Language Models”
I’m happy to share that the preprint of my first PhD project is now online!
🎊 Paper: arxiv.org/abs/2505.23689
"A well-delivered lecture isn’t primarily a delivery system for information. It is an ignition point for curiosity, all the better for being experienced in an audience."
Marvellous defence of the increasingly maligned university experience by @patporter76.bsky.social
thecritic.co.uk/university-a...
Interested in multilingual tokenization in #NLP? Lisa Beinborn and I are hiring!
PhD candidate position in Göttingen, Germany: www.uni-goettingen.de/de/644546.ht...
PostDoc position in Leuven, Belgium:
www.kuleuven.be/personeel/jo...
Deadline 6th of June
BlackboxNLP, the leading workshop on interpretability and analysis of language models, will be co-located with EMNLP 2025 in Suzhou this November! 📆
This edition will feature a new shared task on circuits/causal variable localization in LMs, details here: blackboxnlp.github.io/2025/task
Close your books, test time!
The evaluation pipelines are out, baselines are released & the challenge is on
There is still time to join, and we are excited to learn from you about pretraining and human-model gaps
*Don't forget to run fastEval on checkpoints
github.com/babylm/evalu...
📈🤖🧠
#AI #LLMS
Pleased to announce our paper was accepted at ICLR 2025 as a Spotlight! I will present our poster on Saturday April 26, 3-5pm, Poster #241. Hope to see you there!
arxiv.org/abs/2409.19151
Sharply written and I fully agree, but it is a bit bitter that the message sits behind a 450-euro paywall :') (thanks for the screenshots!)
✨ New Paper ✨
[1/] Retrieving passages from many languages can boost retrieval augmented generation (RAG) performance, but how good are LLMs at dealing with multilingual contexts in the prompt?
📄 Check it out: arxiv.org/abs/2504.00597
(w/ @arianna-bis.bsky.social @Raquel_Fernández)
#NLProc
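One way to picture "multilingual contexts in the prompt" is the prompt-assembly step of a RAG pipeline: retrieved passages in several languages are concatenated ahead of the question. This is a minimal sketch under assumed conventions (the function name, the `lang`/`text` fields, and the prompt template are all illustrative, not the paper's code):

```python
def build_rag_prompt(question: str, passages: list[dict]) -> str:
    """Assemble a RAG prompt from retrieved passages in several languages.

    Each passage dict carries illustrative fields:
    'lang' (an ISO language code) and 'text' (the retrieved passage).
    """
    context_lines = [
        f"[{i + 1}] ({p['lang']}) {p['text']}"
        for i, p in enumerate(passages)
    ]
    return (
        "Answer the question using the passages below.\n\n"
        + "\n".join(context_lines)
        + f"\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "Where is the Eiffel Tower?",
    [
        {"lang": "en", "text": "The Eiffel Tower is in Paris."},
        {"lang": "nl", "text": "De Eiffeltoren staat in Parijs."},
    ],
)
```

The interesting question the paper asks starts exactly here: once passages in different languages share one context window, how well does the LLM actually use them?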
That is definitely possible indeed, and a potential confounding factor. In RuBLiMP, a Russian benchmark, they defined a way to validate this based on LM probs, but we left that open for future work. The poor performance on low-res langs shows they're definitely not trained on all of UD though!