Can We Trust the Judges? This is the question we asked in validating factuality evaluation methods via answer perturbation. Check out the results at the #EvalLLM2025 workshop at #TALN2025
Blog: giovannigatti.github.io/trutheval/
Watch: www.youtube.com/watch?v=f0XJ...
Play: github.com/GiovanniGatt...
3
1
0
0