Training LLMs with verifiable rewards uses a 1-bit signal per generated response. This hides why the model failed.
Today, we introduce a simple algorithm that lets the model learn from any rich feedback and turns that feedback into dense supervision.
(1/n)
Posts by Frederike Lübeck
2 months ago
Joint work with Jonas Wildberger, Frederik Träuble, Maximilian Mordig, Sergios Gatidis, Bernhard Schölkopf, and @arkrause.bsky.social
9 months ago
Workshop manuscript studying the format shift from structured to unstructured data (openreview.net/pdf?id=Wd05q...)
AdaCVD: Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using LLMs (arxiv.org/pdf/2505.24655)
9 months ago
Clinical notes are messy, inconsistent, and unstructured—yet they hold some of the most valuable signals in real-world clinical practice.
Join us today at ICML at the Foundation Models for Structured Data workshop to see how we can make sense of these notes!
📍 West Ballroom D
9 months ago