AI labs are gaming the ARC benchmark, tuning LLMs and applying RL tricks to boost scores. See how Poetiq and GPT-OSS-120B are reshaping ARC-AGI-1 and why the metric is losing its edge. Curious? Dive in. #ARCbenchmark #Poetiq #GPTOSS120B
aidailypost.com/news/arc-ben...