We are reopening the interviews for this PhD position. Please help me spread the word to reach the right candidates!
Posts by Paolo Papotti
[Figure: main architecture]
We introduce
- Query planning as constrained optimization over quality constraints and cost objective
- Gradient-based optimization to jointly choose operators and allocate error budgets across pipelines
- KV-cache–based operators to turn discrete physical choices into a runtime-quality continuum
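As a toy illustration of the budget-allocation idea (my own simplified formulation, not Stretto's actual optimizer), one can minimize a total cost that grows as operators are forced to be more accurate, subject to an end-to-end error budget. Here `costs[i] / e[i]` is an assumed cost model and the projected gradient loop is only a sketch:

```python
import numpy as np

def allocate_error_budgets(costs, total_error, steps=5000, lr=1e-6):
    """Toy sketch: minimize sum(costs[i] / e[i]) -- an operator gets
    cheaper when it is allowed more error -- subject to the end-to-end
    constraint sum(e) == total_error, via projected gradient descent."""
    costs = np.asarray(costs, dtype=float)
    e = np.full(len(costs), total_error / len(costs))  # uniform start
    for _ in range(steps):
        grad = -costs / e**2                       # gradient of costs_i / e_i
        e = e - lr * grad                          # descent step
        e = e - (e.sum() - total_error) / len(e)   # project onto sum(e) = total
        e = np.clip(e, 1e-8, None)                 # keep budgets positive
    return e
```

Under this cost model the closed-form optimum is e_i ∝ sqrt(costs_i), so expensive operators receive larger error budgets; the loop recovers that allocation.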
Co-authors: Gabriele Sanmartino, Matthias Urban, Paolo Papotti, Carsten Binnig
This is the first outcome of our collaboration with Technische Universität Darmstadt within the @agencerecherche.bsky.social / @dfg.de ANR/DFG #Magiq project - more to come!
[Figure: plots of results]
Empirically, Stretto delivers 2x-10x faster execution 🔥 across various datasets and queries compared to prior systems that meet quality guarantees.
🚀 New: The Stretto Execution Engine for LLM-Augmented Data Systems.
LLM operators create a runtime ↔ accuracy trade-off in query execution. We address it with a novel optimizer that provides end-to-end quality guarantees, and with new KV-cache–based operators for efficiency.
arxiv.org/abs/2602.04430
Details👇
Happy Fontaines D.C. fan since the last album (2024). But the real treat was discovering the earlier ones!
I'd also like to test it, thanks!
I agree. Here is another trick for input context that we recently published:
bsky.app/profile/papo...
These results point toward models that decide which retrieved document to trust, turning “context engineering” from a static prompt recipe into a dynamic decoding policy.
Amazing work from Giulio Corallo in his industrial PhD at SAP!
Key insight: 𝐄𝐯𝐢𝐝𝐞𝐧𝐜𝐞 𝐚𝐠𝐠𝐫𝐞𝐠𝐚𝐭𝐢𝐨𝐧 𝐡𝐚𝐩𝐩𝐞𝐧𝐬 𝐚𝐭 𝐝𝐞𝐜𝐨𝐝𝐢𝐧𝐠 𝐭𝐢𝐦𝐞: the model can effectively “switch” which document drives each token - without cross-document attention!
📈 Results: PCED often matches (and sometimes beats) long-context concatenation, while dramatically outperforming the KV-merge baseline on multi-doc QA/ICL.
🚀 Systems win: ~180× faster time-to-first-token vs long-context prefill using continuous batching and Paged Attention.
Instead of concatenating docs into one context (slow, noisy attention), training-free PCED:
● Keeps each document as its own 𝐞𝐱𝐩𝐞𝐫𝐭 with independent KV cache
● Runs experts in 𝐩𝐚𝐫𝐚𝐥𝐥𝐞𝐥 to get logits
● Selects next token with a 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥-𝐚𝐰𝐚𝐫𝐞 𝐜𝐨𝐧𝐭𝐫𝐚𝐬𝐭𝐢𝐯𝐞 𝐝𝐞𝐜𝐨𝐝𝐢𝐧𝐠 rule integrating scores as a prior
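A minimal numpy sketch of the decoding step (my own illustration: a simple score-weighted mixture stands in for the paper's retrieval-aware contrastive rule, and all names are assumptions). Each "expert" produces next-token logits from its own document context, and the retrieval scores act as a prior over experts:

```python
import numpy as np

def pced_next_token(expert_logits, retrieval_scores, tau=1.0):
    """Sketch: pick the next token by combining per-document expert
    logits, weighting experts by their retrieval scores.

    expert_logits: (num_docs, vocab) next-token logits, one row per doc.
    retrieval_scores: (num_docs,) similarity scores from the retriever.
    """
    # Turn retrieval scores into a prior over experts (softmax).
    prior = np.exp(retrieval_scores / tau)
    prior /= prior.sum()
    # Per-expert next-token distributions (row-wise softmax).
    probs = np.exp(expert_logits - expert_logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    # Mixture weighted by the retrieval prior; greedy pick.
    mixed = prior @ probs
    return int(mixed.argmax())
```

With two experts preferring different tokens, flipping which document has the higher retrieval score flips which token is emitted - the score decides which expert "drives" the step.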
🛑 𝐒𝐭𝐨𝐩 𝐭𝐡𝐫𝐨𝐰𝐢𝐧𝐠 𝐚𝐰𝐚𝐲 𝐲𝐨𝐮𝐫 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥 𝐬𝐜𝐨𝐫𝐞𝐬.
RAG uses embedding scores to pick the Top-K, then treats all retrieved chunks as equal.
Parallel Context-of-Experts Decoding (PCED) uses retrieval scores to move evidence aggregation from attention to decoding.
🚀 180× faster time-to-first-token!
New PhD position on Tool-Augmented LLMs for Enterprise Data AI 🚨
Starting in early 2026 under my academic supervision and hosted by the fantastic team at AILY LABS in Madrid or Barcelona
Details in the link - please ping me with any questions!
www.linkedin.com/jobs/view/43...
Vol:18 No:12 → Accelerating Tabular Inference: Training Data Generation with TENET
👥 Authors: Enzo Veltri, Donatello Santoro, Jean-Flavien Bussotti, Paolo Papotti
📄 PDF: https://www.vldb.org/pvldb/vol18/p5303-veltri.pdf
Can We Trust the Judges? That is the question we asked when validating factuality evaluation methods via answer perturbation. Check out the results at the #EvalLLM2025 workshop at #TALN2025
Blog: giovannigatti.github.io/trutheval/
Watch: www.youtube.com/watch?v=f0XJ...
Play: github.com/GiovanniGatt...
Kudos to my amazing co-authors Dario Satriani, Enzo Veltri, Donatello Santoro! Another great collaboration between Università degli Studi della Basilicata and EURECOM 🙌
#LLM #Factuality #Benchmark #RelationalFactQA #NLP #AI
Structured outputs power analytics, reporting, and tool-augmented agents. This work exposes where current LLMs fall short and offers a clear tool for measuring progress on factuality beyond single-value QA. 📊
We release a new factuality benchmark with 696 annotated natural-language questions paired with gold factual answers expressed as tables (avg. 27 rows × 5 attributes), spanning 9 knowledge domains, with controlled question complexity and rich metadata.
Our new paper, "RelationalFactQA: A Benchmark for Evaluating Tabular Fact Retrieval from Large Language Models", measures exactly this gap.
Wider or longer output tables = tougher for all LLMs! 🧨
From Llama 3 and Qwen to GPT-4, no LLM exceeds 25% accuracy on our stricter measure.
Ask any LLM for a single fact and it’s usually fine.
Ask it for a rich list and the same fact is suddenly missing or hallucinated because the output context got longer 😳
LLMs exceed 80% accuracy on single-value questions, but accuracy drops linearly with the number of output facts.
New paper, details 👇
and a special thanks to
@tanmoy-chak.bsky.social for leading this effort!
More co-authors here on bsky
@iaugenstein.bsky.social
@preslavnakov.bsky.social
@igurevych.bsky.social
@emilioferrara.bsky.social
@fil.bsky.social
@giovannizagni.bsky.social
@dcorney.com
@mbakker.bsky.social
@computermacgyver.bsky.social
@irenelarraz.bsky.social
@gretawarren.bsky.social
It’s time we rethink how "facts" are negotiated in the age of platforms.
Excited to hear your thoughts!
#Misinformation #FactChecking #SocialMedia #Epistemology #HCI #DigitalTruth #CommunityNotes
arxiv.org/pdf/2505.20067
Community-based moderation offers speed & scale, but also raises tough questions:
– Can crowds overcome bias?
– What counts as evidence?
– Who holds epistemic authority?
Our interdisciplinary analysis combines perspectives from HCI, media studies, & digital governance.
Platforms like X are outsourcing fact-checking to users via tools like Community Notes. But what does this mean for truth online?
We argue this isn’t just a technical shift — it’s an epistemological transformation. Who gets to define what's true when everyone is the fact-checker?
🚨 𝐖𝐡𝐚𝐭 𝐡𝐚𝐩𝐩𝐞𝐧𝐬 𝐰𝐡𝐞𝐧 𝐭𝐡𝐞 𝐜𝐫𝐨𝐰𝐝 𝐛𝐞𝐜𝐨𝐦𝐞𝐬 𝐭𝐡𝐞 𝐟𝐚𝐜𝐭-𝐜𝐡𝐞𝐜𝐤𝐞𝐫?
New paper: "Community Moderation and the New Epistemology of Fact Checking on Social Media"
with I. Augenstein, M. Bakker, T. Chakraborty, D. Corney, E. Ferrara, I. Gurevych, S. Hale, E. Hovy, H. Ji, I. Larraz, F. Menczer, P. Nakov, D. Sahnan, G. Warren, G. Zagni
🌟 New paper alert! 🌟
Our paper, "Retrieve, Merge, Predict: Augmenting Tables with Data Lakes", has been published in TMLR!
In this work, we created YADL (a semi-synthetic data lake), and we benchmarked methods for augmenting user-provided tables given information found in data lakes.
1/
Thanks for the amazing work to the whole team!
Joint work between Università degli Studi della Basilicata (Enzo Veltri, Donatello Santoro, Dario Satriani) and EURECOM (Sara Rosato, Simone Varriale).
#SQL #DataManagement #QueryOptimization #AI #LLM #Databases #SIGMOD2025