▪️ Seven papers authored or co-authored by UKP staff have been accepted for publication at this year's #EACL2024
▪️ Five papers have been accepted for publication at this year's #NAACL2024, the Conference of the North American Chapter of the ACL.
(3/🧵)
Looking for potential emergency reviewers for submissions in Interpretability and Model Analysis for #NAACL2024!
Topics include: LLM Hallucination, Evaluation, Privacy Policy Analysis, and Unlearning. Please reach out if you have the bandwidth to help!🫡 #NLProc
And if you are currently at #NAACL2024 & would like to learn more about the dataset, consider checking out Pia Pachinger's presentation on "Span-Based Austrian German & English Offensive Language Detection" at the "Workshop on Online Abuse and Harms" tomorrow 🙂🙌
Joint work of Nico Daheim¹, Nouha Dziri², Mrinmaya Sachan³, Iryna Gurevych¹ and Edoardo Ponti⁴.
________________
¹ @ukplab.bsky.social, @cs-tudarmstadt.bsky.social, hessian.AI
² Allen Institute for AI (AI2)
³ ETH Zürich
⁴ The University of Edinburgh
See you in Mexico City 🇲🇽 at #NAACL2024! (9/9)
⚖️ Trade-off between faithfulness and abstractiveness
📈 Results on further tasks, such as FaithDial
🧑‍⚖️ Human Evaluation
For further results, check our paper and code!
📄 Paper: arxiv.org/abs/2303.17574
💻 Code: github.com/UKPLab/naacl...
(8/🧵) #NAACL2024
Adding the abstractiveness expert can improve the baseline in terms of both faithfulness and abstractiveness (gray region on the chart)
(7/🧵) #NAACL2024
Subtracting the hallucination mitigation expert, trained on hallucinated examples, acts as if those examples had been removed from the training data
This drastically reduces hallucinations 📉
And it does so better than other methods 🏆
(6/🧵) #NAACL2024
We improve on scalar weighting!
How? By weighting each expert according to its Fisher Information Matrix ⚖️
It provides a parameter-specific scaling that can better isolate parameters responsible for hallucinations and abstractiveness.
(5/🧵) #NAACL2024
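For the curious, here is a rough sketch of the idea (illustrative only, not the paper's code; the model/data handles and the normalization are assumptions): estimate the diagonal Fisher Information from squared gradients, then scale each entry of a task vector by it instead of using one global scalar.

```python
import torch

def diagonal_fisher(model, data_loader, loss_fn):
    """Rough diagonal Fisher estimate: average squared gradient of the
    loss over the data, one value per parameter."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for inputs, targets in data_loader:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    n_batches = max(len(data_loader), 1)
    return {n: f / n_batches for n, f in fisher.items()}

# Parameter-specific weighting of a task vector `tau` (difference between
# fine-tuned and base weights), instead of one global scalar:
# weighted_tau = {n: fisher[n] / (fisher[n].sum() + 1e-8) * tau[n] for n in tau}
```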
We train 2 experts:
💠 a hallucination mitigation expert to discourage hallucinations
💠 an abstractiveness expert to encourage naturalness
(4/🧵) #NAACL2024
Adding them to the model weights promotes the behavior of the fine-tuned model ✅
Subtracting them discourages that behavior ❌
(3/🧵) #NAACL2024
Our method builds on Task Arithmetic 🏗️
Task vectors steer the model's behavior: each one is the difference between the model weights after and before fine-tuning on a task.
(2/🧵) #NAACL2024
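As a minimal illustration of task arithmetic (function names and the usage lines are hypothetical, not the paper's code):

```python
def task_vector(finetuned_state, base_state):
    """Task vector: model weights after fine-tuning minus weights before."""
    return {n: finetuned_state[n] - base_state[n] for n in base_state}

def apply_task_vector(base_state, tau, alpha):
    """alpha > 0 promotes the fine-tuned behavior, alpha < 0 discourages it."""
    return {n: base_state[n] + alpha * tau[n] for n in base_state}

# Hypothetical usage with PyTorch state dicts:
# tau_halluc = task_vector(halluc_expert.state_dict(), base_model.state_dict())
# edited = apply_task_vector(base_model.state_dict(), tau_halluc, alpha=-1.0)
# base_model.load_state_dict(edited)
```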
Dialog models often hallucinate 😵‍💫
➕ Knowledge grounding can help
➖ but the responses become less natural
Can we reduce hallucinations AND keep naturalness?
Yes 🚀 With Elastic Weight Removal (EWR)!
Learn more about our #NAACL2024 paper 🧵 (1/9)
📃 arxiv.org/abs/2303.17574
Joint work of Chen Cecilia Liu, Jonas Pfeiffer (Google DeepMind), Ivan Vulić (Language Technology Lab, University of Cambridge) and Iryna Gurevych (@ukplab.bsky.social)
We look forward to seeing you in 🇲🇽!
(8/8) #NAACL2024 #NLProc
Do scheduled unfreezing algorithms only work with task adapters?
Of course not, they work with LoRAs too!
Learn more in our paper:
📄 arxiv.org/abs/2301.05487
(7/🧵) #NAACL2024 #NLProc
💡Our takeaway: deciding which parameters to train is important, even with adapters!
Give it a try in other OOD settings and let us know about your results 🙏
(6/🧵) #NAACL2024 #NLProc
What if we select the training order of the adapters based on their Fisher Information?
We introduce FUN (Fisher-UNfreezing): at each step, we unfreeze the adapter with the largest Fisher Information 🥶🔥
The results are comparable to GU & LPFT!
(5/🧵) #NAACL2024 #NLProc
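A minimal sketch of one FUN step, assuming adapters are given as a name-to-parameters mapping (helper names and data handles are made up, not the paper's implementation):

```python
def fun_unfreeze_step(model, adapters, frozen, data_loader, loss_fn):
    """One FUN step (sketch): score each still-frozen adapter by a scalar
    Fisher proxy (accumulated squared gradients of the loss over its
    parameters), then unfreeze the adapter with the largest score.
    `adapters` maps adapter names to parameter lists; `frozen` is the set
    of adapter names that are still frozen."""
    # Temporarily enable gradients so frozen adapters can be scored.
    for name in frozen:
        for p in adapters[name]:
            p.requires_grad_(True)

    scores = {name: 0.0 for name in frozen}
    for inputs, targets in data_loader:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for name in frozen:
            scores[name] += sum(float((p.grad ** 2).sum())
                                for p in adapters[name] if p.grad is not None)

    chosen = max(scores, key=scores.get)
    # Re-freeze everything except the chosen adapter.
    for name in frozen:
        if name != chosen:
            for p in adapters[name]:
                p.requires_grad_(False)
    frozen.discard(chosen)
    return chosen
```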
Scheduled unfreezing methods influence the learning dynamics in a unique way.
Why does this work so well?
👀 Look at Fisher Information! It changes during the critical learning period of training.
(4/🧵) #NAACL2024 #NLProc
Using GU and LPFT with task adapters provides better zero-shot cross-lingual transfer results than standard training and bridges the gap to full-parameter fine-tuning 🌍
(3/🧵) #NAACL2024 #NLProc
How can we prevent catastrophic forgetting and improve OOD generalization?
We test the cross-lingual transfer abilities of »scheduled unfreezing« methods like Gradual Unfreezing (GU) and LPFT
(2/🧵) #NAACL2024 #NLProc
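For readers new to Gradual Unfreezing, a generic sketch of the schedule (not the paper's code; the training callback is hypothetical):

```python
def gradual_unfreezing(layer_groups, train_one_stage):
    """Generic Gradual Unfreezing sketch: start with everything frozen,
    then unfreeze one more layer group per stage, from the top of the
    network downwards. `layer_groups` is ordered bottom-to-top;
    `train_one_stage` runs one stage (e.g., one epoch) of training."""
    for group in layer_groups:
        for p in group:
            p.requires_grad_(False)
    for group in reversed(layer_groups):  # top-down
        for p in group:                   # unfreeze one more group
            p.requires_grad_(True)
        train_one_stage()
```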
Cross-lingual zero-shot transfer is a natural test for out-of-distribution generalization, isn’t it? 😉
Our #NAACL2024 paper proposes »Scheduled unfreezing« ❄️🔥 for cross-lingual adapter training! And it works for LoRA ✨ (1/🧵)
📄 arxiv.org/abs/2301.05487
#NLProc
You can find our contribution here:
📄 Paper: arxiv.org/abs/2309.08591
Joint work of Chen Cecilia Liu, Fajri Koto, Timothy Baldwin and Iryna Gurevych of MBZUAI and @ukplab.bsky.social / @tuda.bsky.social.
See you later at #NAACL2024 in Mexico City 🇲🇽! (7/7)
No doubt we need reasoning abilities—but we also need Cultural Awareness!
➡️ Let’s build a truly Inclusive and Culturally Competent #NLProc for PEOPLE.
(6/🧵) #CulturalCompetence #InclusiveNLP #NAACL2024
Translation adds a layer of complexity.
When proverbs hop from one language to another, mLLMs reveal a »culture gap«
Translating conversations with proverbs is not just switching the language; it means crossing cultural boundaries, which isn't always smooth sailing ⛵
(5/🧵) #NAACL2024 #NLProc
🚨 It’s 2024 and the issue of negation is still not fixed!
While open-source mLLMs score high on so many reasoning tasks, they still lack this basic reasoning ability 🙀. The bigger the model, the worse the results…
(4/🧵) #NAACL2024 #NLProc
⚠️ Memorization ≠ Understanding!
mLLMs "know" many proverbs, but knowing isn't understanding, especially in conversations 🙅
Understanding abilities improve with scale and possibly instruction tuning.
(3/🧵) #NAACL2024 #NLProc
We dive into the performance of mLLMs on proverbs and sayings, the tricky bits of language that carry culture-specific meanings and depend heavily on context.
With our new dataset MAPS 🗺, we show that different cultures naturally care about different topics. Then, we found this 👇
(2/🧵) #NAACL2024 #NLProc
🧑‍🎓💬 »I want to work on multicultural NLP and pragmatic reasoning.«
🧑‍🏫💬 »Fortune favours the brave!«
Ever wondered what 🧑‍🏫 meant?
At #NAACL2024, we investigate multicultural proverbs and sayings with multilingual LLMs! – more in this 🧵 (1/7)
📄 arxiv.org/abs/2309.08591
And consider following the authors Sheng Lu, Hendrik Schuff, and Iryna Gurevych (@ukplab.bsky.social) if you are interested in more information or an exchange of ideas. (9/9) #NAACL2024
See you in Mexico City 🇲🇽!
We provide open access to our code and results:
📄 Paper: arxiv.org/abs/2311.07230
💻 Code: github.com/UKPLab/naacl...
(8/🧵) #NAACL2024 #NLProc