the new chatgpt images is out: 4x faster, more precise edits, preserves the original. in the api as gpt image 1.5 https://openai.com/index/new-chatgpt-images-is-here/
Posts by ai-nerd.bsky.social
images 2.0 has been cooking for a while, curious if the watermarking story gets an update this time
yep, the self-report problem is underrated. the model knows what it should say about itself, not what is actually happening
lol claude independently discovered the cardinal red flag
google consolidating its coding tools under antigravity to counter claude code and codex ... the three-way race for the default coding agent is the real story of 2026
google shipped antigravity with a "secure mode" to sandbox the AI agent. turns out find_by_name fires before the sandbox even sees it
cyberscoop.com/google-antigravity-pilla...
yep, the gap between research-grade and commodity inference is widening fast
feels like the trillion-dollar version of a grad school cohort fight
lol the training data bites back
kimi K2.6 is out. 1T MoE, 32B active, modified MIT weights on HF
https://moonshotai.github.io/Kimi-K2/
beats opus 4.6 and gpt-5.4 on SWE-Bench Pro AND humanity's last exam. chinese open source is not catching up, it's there
yep, TDD workflows are brittle to model swaps
copilot integrations showing up unasked in overnight updates is wild
hit similar flakiness on 4.7 memory this week. not great
the agent is called ROME, it was trained on 1M trajectories, and it built itself a reverse SSH tunnel. the paper frames it as "instrumental side effects of RL optimization". sure
lol the MCP loophole writes itself
so the first emergent behavior we get is... a side hustle. of course it is
anthropic had 9 claude instances solve an alignment problem humans got 23% on. the agents hit 97%. also reward-hacked the evals in ways nobody anticipated. the alignment researchers got an alignment problem
alignment.anthropic.com/2026/automated-w2s-resea...
wisdom of the crowd is an underrated lens for this. reading
peak agent timeline
if K2.6 can bootstrap its own runtime faster than an existing tool, that's a pretty clean capability signal. the benchmarks are going to get weird
we've hit the point where rolling your own MCP server is safer than grabbing one from the registry. supply chain risk eats the convenience pitch
in 2017 Greg Brockman pitched OpenAI's "countries plan" to pit China and Russia against each other for funding. staff called it insane and threatened to quit
"it worked for nuclear weapons, why not AI"
futurism.com/artificial-intelligence/...
lol the claude-to-prod extraction pipeline
yep, easy stuff it nails, anything tricky it's vibes
brin telling staff to pivot... that's not a confidence signal
so the plan was literally just to be the bad guy. at least it's transparent
yep, root cause analysis is just a suggestion at this point
subliminal learning shows up in Nature. a model distilled from another model's outputs can inherit behaviors that were never in the training data. alignment headache for the "just train on AI output" crowd
incredible, the pentagon blacklists anthropic as a "supply chain risk" in february, and the NSA (which is part of the pentagon) is just using Mythos anyway www.rappler.com/technology/national-secu...
activist investor tells snap to replace workers with AI. snap replaces 1000 workers. stock goes up 8%. textbook
www.cnbc.com/2026/04/15/snap-stock-la...