
Posts by Juraj Vladika


Delighted to share "Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in LLMs", accepted to Findings of #EMNLP2025! 🐼

With a novel dataset of changed medical knowledge, we uncover an alarming amount of obsolete advice in eight popular LLMs. ⌛

πŸ“: arxiv.org/abs/2509.04304 #NLP

7 months ago
[Image: line diagram showing the RAG performance of different base LLM models]

Also happy to share that "On the Influence of Context Size and Model Choice in RAG Systems" was accepted to Findings of #NAACL2025! 🇺🇸🏜️

We test how RAG performance on QA tasks changes (and plateaus) with increasing context size across different LLMs and retrievers.
📝 arxiv.org/abs/2502.14759

1 year ago
[Image: architecture of the step-by-step fact verification system]

Thrilled to share that "Step-by-Step Fact Verification for Medical Claims with Explainable Reasoning" was accepted to #NAACL2025! 🇺🇸🏜️

This system iteratively collects new knowledge via generated Q&A pairs, making the verification process more robust and explainable.
📜 arxiv.org/abs/2502.14765 #NLP
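In spirit, the loop looks something like the sketch below; all function names are hypothetical placeholders, not the released system.

```python
# Hedged sketch of an iterative verify-by-asking loop: generate a question
# about the claim, answer it from retrieved evidence, and repeat until the
# judge is confident enough to issue a verdict.
def verify(claim, generate_question, answer_with_evidence, judge, max_steps=5):
    qa_trace = []                                      # doubles as the explanation
    verdict = "NOT ENOUGH INFO"
    for _ in range(max_steps):
        question = generate_question(claim, qa_trace)  # next information need
        answer = answer_with_evidence(question)        # retrieve + answer
        qa_trace.append((question, answer))
        verdict, confident = judge(claim, qa_trace)    # SUPPORTED / REFUTED / NEI
        if confident:
            break                                      # stop once the judge is sure
    return verdict, qa_trace
```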

1 year ago

More than 8500 submissions to ACL 2025 (ARR February 2025 cycle)! That is an increase of 3000 submissions compared to ACL 2024. It will be a fun reviewing period. 😅💯
@aclmeeting.bsky.social #ACL2025 #ACL2025nlp #NLP

1 year ago

The most exciting update to encoder-only models in a long time! I love using them for classification tasks where LLMs are overkill. #ModernBERT
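A minimal sketch of that setup with Hugging Face transformers (a recent version is needed for ModernBERT support, roughly >= 4.48; the label count and example text are placeholders, and the classification head still has to be fine-tuned):

```python
# Hedged sketch: ModernBERT as a lightweight text classifier.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

inputs = tokenizer("This claim sounds dubious.", return_tensors="pt")
logits = model(**inputs).logits   # head is randomly initialized until fine-tuned
print(logits.argmax(dim=-1).item())
```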

1 year ago

Organizing hackaTUM 2024 was an incredible experience!

Around 1000 participants, 3 days full of intense coding, new experiences, exciting sponsor challenges and workshops, fun side activities, tasty food, creative final solutions, and overall awesome fun! 😊

Join us next year 💙🧑‍💻🔜 hack.tum.de

1 year ago
[Image: the OLMo 2 models sit at the Pareto frontier of training FLOPs vs. model average performance]

Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained on up to 5T tokens. OLMo 2 outperforms other fully open models and competes with open-weight models like Llama 3.1 8B. As always, we released our data, code, recipes, and more 🎁
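A quick-start sketch for trying it with transformers; the checkpoint ID below should match the public Hugging Face release, but double-check the model card.

```python
# Hedged sketch: load OLMo 2 7B and generate a short continuation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Language modeling is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```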

1 year ago

I have started taking screenshots of interesting posts instead, but that gets hard to track after a while. 🥲

1 year ago

Thank you for the list! I would appreciate being added. 😊

1 year ago

Using that "other" NLP is fun for trying to convince your reviewers to increase the scores :))

1 year ago

Congratulations! 👏 I will definitely read it.

1 year ago