
Posts by Cornelius Wolff

Yes, sometimes I feel bold and press the "allow all commands in this session" button and just hope that Claude doesn't shoot half of my code base out of existence in an attempt to fix an error itself introduced...

2 months ago 0 0 1 0

Excited to introduce the SQaLe dataset for text-to-SQL training/tuning at scale. SQaLe is the first large-scale text-to-SQL dataset grounded in real schemas, covering a broad range of realistic queries (e.g. number of joins, operations).

SQaLe provides 517K high-quality question/schema/query triples 🀩.
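A question/schema/query triple of this kind might look as follows — a minimal illustrative sketch in Python; the field names, table names, and query are hypothetical, not the actual SQaLe format:

```python
# Hypothetical example of a text-to-SQL triple: a natural-language
# question, the database schema it is grounded in, and the target query.
# (Field and table names are illustrative, not taken from SQaLe.)
triple = {
    "question": "How many orders did each customer place?",
    "schema": {
        "customers": ["id", "name"],
        "orders": ["id", "customer_id", "created_at"],
    },
    "query": (
        "SELECT c.name, COUNT(o.id) AS n_orders "
        "FROM customers c JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.name"
    ),
}

print(triple["question"])
```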

5 months ago 9 4 1 0

Excited to be at ACL! Join us at the Table Representation Learning workshop tomorrow in room 2.15 to talk about tables and AI.

We also present a paper showing the sensitivity of LLMs in tabular reasoning to e.g. missing values and duplicates, by @cowolff.bsky.social at 16:50: arxiv.org/abs/2505.07453

8 months ago 6 3 1 0

πŸ§ͺ Paper link: arxiv.org/pdf/2505.07453
πŸ“… I’m presenting Thursday, July 31st at the TRL workshop

I’ll be around all week, so if you’re also interested in tabular learning/understanding and insight retrieval, feel free to reach out to me. I would be happy to connect! (4/4)

8 months ago 0 0 0 0

Turns out:
πŸ”Ή BLEU/BERTScore? Not reliable for evaluating tabular QA capabilities
πŸ”Ή LLMs often struggle with missing values, duplicates, or structural alterations
πŸ”Ή We propose an LLM-as-a-judge method for a more realistic evaluation of LLMs' tabular reasoning capabilities (3/4)

8 months ago 0 0 1 0

The paper's called:
"How well do LLMs reason over tabular data, really?" πŸ“Š

We dig into two important questions:
1️⃣ Are general-purpose LLMs robust with real-world tables?
2️⃣ How should we actually evaluate them? (2/4)

8 months ago 1 1 1 0

Headed to Vienna for ACL and the 4th Tabular Representation Learning Workshop! πŸ‡¦πŸ‡Ή
Super excited to be presenting my first PhD paper there πŸ“„ (1/4)

8 months ago 1 0 1 0

Huge thanks to @madelonhulsebos.bsky.social for all the support on getting this work off the ground on such short notice after I started my PhD πŸ™
And I am excited to keep building on this research!
πŸ“„ Paper link: arxiv.org/pdf/2505.07453

10 months ago 1 0 0 0

What did we find?
Even on simple tasks like look-up, LLM performance drops significantly as table size increases.
And even on smaller tables, results leave plenty of room for improvement, highlighting major gaps in LLMs' understanding of tabular data and the need for more research on this topic.

10 months ago 1 0 1 0

Furthermore, we extended the existing TQA-Benchmark with common data perturbations such as missing values, duplicates, and column shuffling.
Using this dataset and the LLM-as-a-judge, we tested response accuracy on basic reasoning tasks like look-ups, subtractions, averages, etc.
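The perturbations described above could be sketched roughly as follows — a minimal plain-Python illustration of the idea (the helper names and the toy table are my own, not the benchmark's actual implementation):

```python
import random

# Toy table: a list of row dicts (contents are illustrative).
table = [
    {"city": "Vienna", "pop": 1.9},
    {"city": "Graz", "pop": 0.3},
]

def add_missing_values(rows, col, frac, seed=0):
    """Blank out a column's value in roughly `frac` of the rows."""
    rng = random.Random(seed)
    out = [dict(r) for r in rows]
    for r in out:
        if rng.random() < frac:
            r[col] = None
    return out

def add_duplicates(rows, n, seed=0):
    """Append n randomly chosen duplicate rows."""
    rng = random.Random(seed)
    return rows + [dict(rng.choice(rows)) for _ in range(n)]

def shuffle_columns(rows, seed=0):
    """Reorder the columns, keeping cell values intact."""
    rng = random.Random(seed)
    cols = list(rows[0])
    rng.shuffle(cols)
    return [{c: r[c] for c in cols} for r in rows]
```

One can then ask the same look-up or averaging question over the clean and perturbed versions of the table and compare accuracy.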

10 months ago 0 0 1 0

But merely measuring whether an answer from an LLM is actually correct turned out to be surprisingly tricky.
πŸ” The standard metrics? BLEU, BERTScore?
They fail to capture the correctness of outputs in this space.
So we introduced an alternative:
An LLM-as-a-judge to assess responses more reliably.
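The LLM-as-a-judge idea can be sketched in a few lines — here the judge call is a stub standing in for a real model, and the prompt wording is illustrative, not the paper's actual prompt:

```python
# Minimal sketch of LLM-as-a-judge evaluation. The `judge` function is a
# stub; in practice it would call an actual LLM and return its reply.
JUDGE_PROMPT = (
    "Question: {q}\n"
    "Gold answer: {gold}\n"
    "Model answer: {pred}\n"
    "Does the model answer convey the same fact as the gold answer? "
    "Reply YES or NO."
)

def judge(prompt: str) -> str:
    # Stub standing in for a real LLM call.
    return "YES"

def is_correct(q: str, gold: str, pred: str) -> bool:
    reply = judge(JUDGE_PROMPT.format(q=q, gold=gold, pred=pred))
    return reply.strip().upper().startswith("YES")

# Exact-match or n-gram metrics would score this pair near zero,
# even though the answer is clearly correct:
print(is_correct("What is the average population?", "1.1",
                 "The average is 1.1 million."))
```

The point of the example: "1.1" and "The average is 1.1 million." share almost no surface overlap, so string-similarity metrics penalize a correct answer, while a judge model can recognize semantic equivalence.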

10 months ago 0 0 1 0

Tables are everywhere, and so are LLMs these days!
But what happens when the two meet? Do LLMs actually understand tables when they encounter them, for example, in a RAG pipeline?
Most benchmarks don’t test this well. So we decided to dig deeper. πŸ‘‡

10 months ago 0 0 1 0

"Can LLMs really reason over tabular data, really?"
That’s the title and central question of my first paper in my new role as a PhD student, which has been accepted to the 4th Table Representation Learning Workshop @ ACL 2025! arxiv.org/pdf/2505.07453

🧡Here’s what we found:

10 months ago 2 1 1 0
Open positions | TRL Lab

Eager to contribute to democratizing insights from tabular data? We have 2 new PhD openings! ✨

1) Fundamental Techniques in Table Representation Learning
2) Reliable AI-powered Tabular Data Analysis Systems

⏰ Apply by: 30 June 2025
πŸ“… Start: Fall/Winter 2025
πŸ”— Info: trl-lab.github.io/open-positions

10 months ago 4 3 0 0
Details about the seminar talk titled TabICL: A Tabular Foundation Model for In-Context Learning on Large Data by Marine Le Morvan

Excited to share the new monthly Table Representation Learning (TRL) Seminar under the ELLIS Amsterdam TRL research theme! It will recur every 2nd Friday.

Who: Marine Le Morvan, Inria (in-person)
When: Friday 11 April 4-5pm (+drinks)
Where: L3.36 Lab42 Science Park / Zoom

trl-lab.github.io/trl-seminar/

1 year ago 12 3 0 1