An exam of 2,500 questions created by experts from 50 countries is exposing the real limits of #AI. Neither GPT‑5 nor the most recent models manage to pass it.
👉 buff.ly/wzQWJiT
#InteligenciaArtificial #HumanitysLastExam #Tecnología #MachineLearning #ChatGPT
#STEM expertise is drawn from already excellent knowledge bases in those fields, so if an LRM (credibility) and an LLM (eloquence) are available, you can get 95-98% out of a model like #GrokSuperheavy.
But it is still only at about 50% on #HumanitysLastExam, so basically you are trusting approximately Hitler.
Google just dropped Gemini 3 Flash – a leaner AI that hits frontier-model speed with half the parameters. It's crushing benchmark tests and even tackles the 'Humanity's Last Exam'. Curious? Dive into the details! #Gemini3Flash #FrontierModels #HumanitysLastExam
🔗 aidailypost.com/news/gemini-...
#humanityslastexam #basics
monthlyreview.org/2009/05/01/w...
“Humanity’s Last Exam” by safe.ai & scale.com is shaking up how we test #AI, with tough, real-world questions across multiple fields. AI already excels at most benchmarks. How long until it conquers this one too? Feels like it’s only a matter of time. Then what? #HumanitysLastExam #AIEthics #AISafety
An international research team has presented a new #AI test that exposes the limits of current #AI systems. Even the most advanced models fail 90 percent of the tasks in #HumanitysLastExam, for now. the-decoder.de/internationa...
Scale’s “Humanity’s Last Exam” feels like grim foreshadowing. As AI surpasses human benchmarks, are we heading toward a future where humanity’s relevance is questioned? The title isn’t just provocative; it’s a challenge to define our role. #AI #Future #AGI #HumanitysLastExam
scale.com/blog/humanit...