BPE-knockout just got outperformed by an algorithm that modifies BPE tokenisers in a feedback loop to make them absorb more and more constraints. It doesn't even need more data to do that. It uses the tokeniser itself as a dataset. 🧵
Posts by Miryam de Lhoneux
Right, still, this is insane
(If I were a reviewer for TX and received such a DR, I would immediately withdraw from the reviewer pool)
yikes, if X is what I think it is, TX is a sinking ship
If you're reviewing ARR papers and want a tool to help you spot potential hallucinated references, I cooked this up for the ACL SACs and thought I would share it with the broader community github.com/davidjurgens...
I'm not there unfortunately but @wpoelman.bsky.social and Thomas Bauwens are
LAGoM will present several papers at #EACL2026 in Rabat next week! Our work at this year’s conference spans tokenisation, multilingual evaluation, and model design.
📢I'm organizing a BoF session at #EACL2026 called Tokenization & Beyond, aiming to gather researchers exploring tokenization and alternatives such as byte-level and pixel-based approaches. Sign up using the form if you're interested! #NLProc @eaclmeeting.bsky.social
📣DEADLINE EXTENDED!📣
Need a few more days to perfect your paper for #EAMT2026? You got it.
We have pushed the submission deadline back!
🗓️ New Deadline: 25th March 2026 23:59 CEST
Breathe, revise, and submit: easychair.org/my/conferenc...
And don't forget to anonymize your paper 👀🕵️‍♀️
⏳ ONE WEEK LEFT!
The #EAMT2026 submission deadline is closing next Wednesday (March 18).
Whether it's a deep-dive into LLM evaluations, low-resource MT, or a new user study, we want to see it. Let’s get those papers in! 🏃‍♂️💨
🔗 Submission info here: eamt2026.org/calls-for-pa...
👇
i'm doing it - i'm writing an MT proposal 👀
oof. I set a max load to 30 because in the last ARR cycle I had 60, and I had an AC for 15 submissions who did nothing at all (until 5 days past the meta-review ddl, they submitted meta-reviews for papers for which i already had an emergency AC who had written a meta-review)
good morning to you and to my ACs who submitted all their meta-reviews before the deadline
As someone with degrees in both, this is spot on 🎯
Today a bullet-points student told me that for another exam, the professor specifically said they wanted lists and not full sentences. For an open-book exam. I don't get it. So apparently colleagues are making the problem worse
Hi #NLP Bluesky, the Multilinguality track at #ACL2026NLP @aclmeeting.bsky.social needs emergency reviewers. If you can complete one or two reviews before February 15, please reach out. Thank you!
I think something is particularly wrong this cycle. I've heard multiple stories of people being assigned more than their load, even a case of someone whose load was supposed to be 0. Don't know if it's a bug or intentional
thx, already saved this to my zotero and plan to read it soon! :) looks like very interesting work!
I'm glad I'm not in the French-speaking system just for that
A photograph of sunny Copenhagen in the summer!
📢 I am hiring a highly-motivated Ph.D student at the University of Copenhagen to work on tokenization-free NLP.
Read our previous work on this topic: aclanthology.org/2025.emnlp-m...
aclanthology.org/2023.emnlp-m...
openreview.net/forum?id=FkS...
Apply by March 8: employment.ku.dk/phd/?show=1563
New EACL paper (with @mdlhx.bsky.social)! We tested if comparing perplexity of parallel data across languages is fair. Turns out: it depends. We show the choice of test set (even with consistent meaning) can flip conclusions about which language is easier to model.
Paper: arxiv.org/abs/2601.10580
👀
Today, the ACL Anthology switched to a new system for how author pages work. From now on, ORCID iDs will be the main mechanism for matching papers to the correct author. 🧵⤵️
Reminder that we have an alt-ARR slack workspace where ACs and SACs can support each other through the sometimes confusing process of the ARR cycle! Post or DM me a good email address for a Slack invitation and I will add you. #ACL2026
I wasn't being serious hehe. I think it's just inevitable. My exams are open-book. I say it repeatedly in class, I say it in an info PDF about the exam, it's written on the official course page. I still have students who write on their exam paper "I didn't know it was open book"
"Think about it step by step"
I think you also need to write, before each question, "read the instructions first". Maybe even have them rewrite the instructions before answering the questions
Master's and "advanced" master's
That's what I do, and yet every year I have students who still write lists
Congratulations Dr.!!!! 🎉