#TALN2025

Can We Trust the Judges? That is the question we asked when validating factuality evaluation methods via answer perturbation. Check out the results at the #EvalLLM2025 workshop at #TALN2025.
Blog: giovannigatti.github.io/trutheval/
Watch: www.youtube.com/watch?v=f0XJ...
Play: github.com/GiovanniGatt...
