🚨 NeurIPS 2024 Spotlight
Did you know we lack standards for AI benchmarks, despite their role in tracking progress, comparing models, and shaping policy? 🤯 Enter BetterBench, our framework with 46 criteria to assess benchmark quality: betterbench.stanford.edu 1/x
1 year ago