#HumanitysLastExam hashtag - Bluesky

@recortesblog.bsky.social

1 month ago

A 2,500-question exam created by experts from 50 countries is exposing the real limits of #IA (AI). Neither GPT‑5 nor the most recent models manage to pass it.

👉 buff.ly/wzQWJiT

#InteligenciaArtificial #HumanitysLastExam #Tecnología #MachineLearning #ChatGPT


Hublai (charismatic megafauna)

@hublai.bsky.social

1 month ago

#STEM expertise is drawn from already excellent knowledge bases in those fields, so if LRM (credibility) and LLM (eloquence) are available, you can get 95-98% out of a model like #GrokSuperheavy.

But it is still only at about 50% on #HumanitysLastExam, so basically you are trusting approximately Hitler.


AI Daily Post

@aidailypost.com

3 months ago

Google just dropped Gemini 3 Flash – a leaner AI that hits frontier-model speed with half the parameters. It's crushing benchmark tests and even tackles the 'Humanity's Last Exam'. Curious? Dive into the details! #Gemini3Flash #FrontierModels #HumanitysLastExam

🔗 aidailypost.com/news/gemini-...


@hrealities.bsky.social

1 year ago

Monthly Review | Why Socialism? Is it advisable for one who is not an expert on economic and social issues to express views on the subject of socialism? I believe for a number of reasons that it is.… Clarity about the aims and…

#humanityslastexam #basics

monthlyreview.org/2009/05/01/w...


Andy Tseng

@andytseng.bsky.social

1 year ago

Humanity's Last Exam Dataset

“Humanity’s Last Exam” by safe.ai & scale.com is shaking up how we test #AI, with tough, real-world questions across multiple fields. AI already excels at most benchmarks; how long until it conquers this one too? It feels like only a matter of time. Then what? #HumanitysLastExam #AIEthics #AISafety


Werner Keil

@wernerkeil.bsky.social

1 year ago

International research team develops "humanity's last exam" for AI systems

An international research team has presented a new #KI (AI) test that shows the limits of current #KISysteme (AI systems). Even the most advanced models fail 90 percent of the tasks in #HumanitysLastExam, for now. the-decoder.de/internationa...


Xerxes

@phantomloom.bsky.social

1 year ago

Scale AI and CAIS Unveil Results of Humanity’s Last Exam. Scale AI and the Center for AI Safety (CAIS) are proud to publish the results of Humanity’s Last Exam.

Scale’s “Humanity’s Last Exam” feels like grim foreshadowing. As AI surpasses human benchmarks, are we heading toward a future where humanity’s relevance is questioned? The title isn’t just provocative; it’s a challenge to define our role. #AI #Future #AGI #HumanitysLastExam

scale.com/blog/humanit...
