
Posts by Alex Volkov (Thursd/AI)

📅 ThursdAI - Opus 1M, GPT 5.4 Mini, Composer 2, MiniMax 2.7, Jensen adopts Claw & more AI news
This week on ThursdAI: Jensen Huang dedicates 15 minutes of his GTC keynote to OpenClaw, declaring every company needs an agent strategy. Cursor drops Compose...

youtu.be/dbxkn6EJLvQ

1 month ago 0 0 0 0
Preview
ThursdAI - Opus 1M, Jensen declares OpenClaw as the new Linux, GPT 5.4 Mini & Nano, MiniMax 2.7, Composer 2 & more AI news From Weights & Biases, here's what happened in AI this week. Jensen goes ClawPilled with NemoClaw, new smaller GPT 5.4s, MiniMax autoresearches 2.7 and Composer 2 from Cursor beats Opus + more AI

Keeping you up to date with AI, 3 years in a row!

🔗 Subscribe to our show on Spotify: thursdai.news/spotify

🔗 Apple: thursdai.news/apple

🔗 YouTube: thursdai.news/yt

And for the full show notes and links visit

👉 thursdai.news/mar-19 👈

1 month ago 0 0 1 0
Post image

Was fairly sick this week, surprisingly this didn't come through.

New pod up, we covered @cursor_ai Composer 2, Jensen going all in on OpenClaw, @MiniMax_AI 2.7 and a lot more! Check out our Mar 19 episode on YT, everywhere you get your podcasts, and right here in the comments 👇

1 month ago 0 0 1 0

Is X down for any of you?

1 month ago 1 0 0 0
📅 Mar 5 - GPT 5.4 Thinking and Pro, Anthropic vs Pentagon, Qwen Drama, Gemini Flash-Lite & more AI
GPT 5.4 Thinking dropped WHILE we were recording, and we tested it live. OpenAI's new unified reasoning model folds in everything from 5.3 Codex, hits 75% o...

As always, the show is cut into a podcast and a newsletter:

🔗 Spotify: thursdai.news/spotify

🔗 Apple: thursdai.news/apple

📰 Newsletter: thursdai.news/mar-5

and YouTube: youtu.be/SVimfToLUjA

1 month ago 0 0 0 0
Preview
ThursdAI - Mar 5 - GPT 5.4 is here, Anthropic supply chain risk, Qwen 3.5 small & leadership drama, wolfbench & more AI news From Weights & Biases, we were hoping for a chill "Anthropic vs Gov" week, but then OpenAI dropped GPT 5.4 in the middle of our live stream + Qwen models and corpo drama, Stepfun 3.5 and wolfbench.ai

Great show with cohosts @nisten @ryancarson @ldjconfirmed @Yampeleg and @WolframRvnwlf who presented Wolfbench!

thursdai.news/mar-5

1 month ago 0 0 1 0
Post image Post image

Called it!
OpenAI launched GPT 5.4 live during @thursdai_pod, so we had a chance to test it immediately and found some surprising quirks!

Also covered the Anthropic vs DoD drama, Qwen 3.5 small (and Junyang's departure) and more!

Our new episode just dropped, check it out 👇

1 month ago 1 0 1 0

Liked this recap & breakdown? Follow me @altryne and @thursdai_pod to stay up to date on everything AI related every week 🫡

1 month ago 0 0 0 0

Overall, the sentiment is VERY positive, as it usually is for a new model. Now... will this be enough to get folks to resume their OpenAI subscriptions after leaving for Anthropic last weekend? We'll see 🤔

Blog: openai.com/index/intro...
System Card: openai.com/index/gpt-5...

1 month ago 0 0 1 0

Poe (@poe_platform) are showing that this model is great at long context and needle in a haystack searches.
x.com/poe_platfor...

1 month ago 0 0 1 0

@wadefoster (Wade Foster, CEO of Zapier)
"It's the new state of the art for multi-step tool use"
x.com/wadefoster/...

1 month ago 0 0 1 0

Shumer glazes but has decent feedback as well, showing that it's also not great at frontend yet. But overall he calls it the best by far and says the debate of "which model to use" is over
x.com/mattshumer_...

1 month ago 0 0 1 0

Claire praises this model for "more human speaking" and tool calls, but says it's still bad at frontend
x.com/clairevo/st...

1 month ago 0 0 1 0

Now for some community reactions, though take these with a grain of salt, as there's a bit of a selection bias (OpenAI prefers folks who talk positively about the models)

Bernardo Aceituno, co-founder of @stackai, said this is the first model to pass their benchmark
x.com/BernAceitun...

1 month ago 0 0 1 0

while Claude opened the website, looked at it, and suggested an actual fix that can help.

1 month ago 0 0 1 0

Is it perfect? Nah, we tested it out live on ThursdAI, and it still shows a bit of an "I'll do EXACTLY what you say" mentality like Codex 5.3. I asked it for 1 thing to improve on my website, and it told me I need to add a <main> element

1 month ago 0 0 1 0

One of the most exciting things about this is not even the model itself, it's the ability to steer its thinking mid-thought. I don't think this is talked about enough. Interruption! You can steer this model mid-thinking on the ChatGPT (and soon iOS) interfaces

1 month ago 0 0 1 0
Post image

More eval goodness, this model SLAPS in math. On the @EpochAIResearch FrontierMath benchmark, GPT 5.4 solved a math problem by @nasqret that no model had been able to solve before
x.com/nasqret/sta...

1 month ago 0 0 1 0
Post image

However, if we zoom in, we see that it's better at lower thinking thresholds on the same tasks as Codex! Here on SWE-bench Pro, you can see that the medium score is the same, while without thinking the new model gets 47% accuracy!

1 month ago 0 0 1 0
Post image

Now for some benchmarks and evals. If you look at coding, this model does not show a significant jump over the code-dedicated Codex 5.3, but for a generalized model it's absolutely crushing coding tasks while also being SOTA on a bunch of economic tasks, specifically GDPval

1 month ago 0 0 1 0
Post image

While the pricing is for the API, Codex app users can turn on the /fast mode in settings (and CLI users can add it to their ~/.codex/config.toml file) and enjoy a 1.5x speedup at 2x the token burn!

Beast mode is 1M context in FAST mode!
x.com/providerpro...

1 month ago 0 0 1 0
Post image

First, I think for most Codex users, upgrading from Codex 5.3 is a no-brainer, specifically as this model has a 1M context window and is cheaper than Sonnet 4.6

Pricing under 272K tokens:
Input $2.50 / Output $15.00 per 1M tokens (Cached tokens 90%)
Pricing over 272K tokens:
Input $5.00 / Output $22.50 per 1M tokens
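
To make the two pricing tiers concrete, here is a rough cost estimator in Python. It is only a sketch built on the numbers quoted above, with two assumptions that the post does not confirm: that "Cached tokens 90%" means a 90% discount on the input price, and that the whole request is billed at the higher rate once the prompt goes over 272K tokens. The function name and constants are made up for illustration and are not part of any OpenAI API.

```python
# USD per 1M tokens, taken from the pricing quoted in the post.
SHORT_CONTEXT = {"input": 2.50, "output": 15.00}  # prompt up to 272K tokens
LONG_CONTEXT = {"input": 5.00, "output": 22.50}   # prompt over 272K tokens
THRESHOLD = 272_000

# Assumption: "Cached tokens 90%" is read as a 90% discount on input tokens.
CACHE_DISCOUNT = 0.90


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    # Assumption: the long-context rate applies to the entire request,
    # not only to the tokens beyond the threshold.
    rates = LONG_CONTEXT if input_tokens > THRESHOLD else SHORT_CONTEXT

    fresh_tokens = input_tokens - cached_tokens
    billable_input = fresh_tokens + cached_tokens * (1 - CACHE_DISCOUNT)

    input_cost = billable_input * rates["input"] / 1_000_000
    output_cost = output_tokens * rates["output"] / 1_000_000
    return round(input_cost + output_cost, 6)


if __name__ == "__main__":
    print(estimate_cost(100_000, 10_000))  # short-context request
    print(estimate_cost(500_000, 10_000))  # long-context request
```

Under these assumptions, a 100K-token prompt with 10K output tokens comes to about $0.40, while a 500K-token prompt with the same output comes to about $2.73, so crossing the 272K line roughly doubles the per-token input price.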

1 month ago 0 0 1 0
Post image

Today, OpenAI launched their "best model yet", GPT 5.4 Thinking (and 5.4 Pro). After releasing 5.3 Codex at the beginning of Feb, OpenAI has once again folded the coding advances back into the more generalized model

Pricing, new capabilities, vibechecks in thread 👇

1 month ago 1 0 1 0
📅 Feb 26 Approaching singularity: Devin 2, METR 14h, Qwen 3.5, WarClaude, Distill attacks & more
When we break through the coding singularity, it won't be immediately apparent but there will be signs! Autonomous agents running for 14h straight with tasks...

Or on YT:
www.youtube.com/watch?v=4kW...

1 month ago 0 0 0 0
Preview
📅 ThursdAI - Feb 26 - Approaching singularity From Weights & Biases, this week is the closest I've felt to the AI singularity starting, bonkers 1 man AI startups crossing $700K ARR live on show, DoD vs Anthropic, Anthropic vs Chinese models & more

You can also just listen to the show right here:

thursdai.news/feb-26

1 month ago 0 0 1 0
Preview
Philip Kiely - guest on ThursdAI podcast Philip Kiely (Baseten) on ThursdAI with Alex Volkov. Listen to episodes, find links, and explore the guest network.

And finally, @philipkiely was there with his first book! Inference is everything, as Philip said! Inference Engineering is available as a free PDF and as a gorgeous book (that I also just got in the mail!)

Don't miss these interviews 👀

thursdai.news/guests/phil...

1 month ago 0 0 1 0
Preview
Ben Broca - guest on ThursdAI podcast Ben Broca (Polsia) on ThursdAI with Alex Volkov. Listen to episodes, find links, and explore the guest network.

@bencera_ straight up gave us singularity vibes: dude crossed $700K ARR LIVE on the show

thursdai.news/guests/benc...

x.com/altryne/sta...

1 month ago 0 0 1 0
Preview
Nader Dabit - guest on ThursdAI podcast Nader Dabit (Cognition) on ThursdAI with Alex Volkov. Listen to episodes, find links, and explore the guest network.

Chatting with @dabit3 was amazing, been following his career forever

@nisten even said watching one of Nader's vids changed his whole career path!

Nader just joined @cognition and walked us through why

thursdai.news/guests/dabit3

1 month ago 0 0 1 0
Preview
📅 ThursdAI - Feb 26 - Approaching singularity - ThursdAI ThursdAI Feb 26: Anthropic vs Pentagon, METR 14.5h autonomy, Devin 2.2, Qwen 3.5, and Polsia's $700k ARR surge.

You can find the edited version of our live show, show notes and links on our brand new (totally not vibecoded) website here:

thursdai.news/ep/feb-26-2026

1 month ago 0 0 1 0
Post image

3 years doing this weekly and I've never felt closer to the singularity than right now

Everyone's shipping async AI agents. Everything's converging.

This week we covered Anthropic vs DoD, had 3 incredible interviews with @dabit3 @philipkiely and @bencera_ and way more 👇

1 month ago 1 0 1 0