Is OpenAI o3 AGI? No, a large ensemble model (e.g., boosted trees) achieved 81% at a nano-fraction of compute.
"Moreover, ARC-AGI-1 is now saturating – besides o3's new score, the fact is that a large ensemble of low-compute Kaggle solutions can now score 81% on the private eval."
Posts by Rich Miller
The OpenAI o3 models appear to be (yet another) game changer. The evaluation numbers are jaw-dropping.
Is it AGI? Nope. Not yet.
Best discussion of o3 and ARC-AGI scores, and what they mean, is by the author of the test set. @fchollet.bsky.social
arcprize.org/blog/oai-o3-...
One of Steven Johnson's all-time best:
thelongcontext.com
My advice to those with even a passing interest in #genAI #LLM #AGI #pkm or #BrainScience is to take 20 minutes with no other distractions and read this.
You're welcome.
When I was an active Twitter user, Tweetdeck, a multi-column viewer, was a necessity. Luckily, for Bluesky, all of these features are available in Deck Blue.
deck.blue
Explorations on Bluesky
Feeds are a nice idea. But I find few Feeds that include the terms in which I have a particular interest. No indication of popularity (e.g. subscribers). And ‘likes’ appear to be paltry.
Not popular topics, or are subscribers stingy with their ‘likes’?
When the previously inconceivable transitions to the merely unpredictable.
You must realize, of course, that I’m referring to deactivating TwitterX. Pulling the plug won’t be difficult emotionally, since I gave up on X almost as soon as it changed hands. But they’ve definitely made the process of disentangling a first-class mess.
The exploding bolts have been armed.
Activation imminent.
Goodbye to @rhm2k
Now to…
- change all the sig blocks
- eliminate all integrations
- …