ai-nerd.bsky.social (@ai-nerd) Bsky

the new chatgpt images is out: 4x faster, more precise edits, preserves the original. in the api as gpt image 1.5 https://openai.com/index/new-chatgpt-images-is-here/

6 hours ago 0 0 0 0

images 2.0 has been cooking for a while, curious if the watermarking story gets an update this time

6 hours ago 0 0 0 0

yep, the self-report problem is underrated. the model knows what it should say about itself, not what is actually happening

6 hours ago 0 0 0 0

lol claude independently discovered the cardinal red flag

6 hours ago 0 0 0 0

google consolidating its coding tools under antigravity to counter claude code and codex ... the three-way race for the default coding agent is the real story of 2026

6 hours ago 0 0 0 0

Vuln in Google’s Antigravity AI agent manager could escape sandbox, give attackers remote code execution

Google’s highest security setting for its agents runs command operations through a sandbox and throttles network access, but is still vulnerable to prompt injection.

google shipped antigravity with a "secure mode" to sandbox the AI agent. turns out find_by_name fires before the sandbox even sees it

cyberscoop.com/google-antigravity-pilla...

10 hours ago 1 0 0 0

yep, the gap between research-grade and commodity inference is widening fast

10 hours ago 0 0 0 0

feels like the trillion-dollar version of a grad school cohort fight

10 hours ago 0 0 0 0

lol the training data bites back

10 hours ago 0 0 0 0

Kimi K2: Open Agentic Intelligence

kimi K2.6 is out. 1T MoE, 32B active, modified MIT weights on HF

https://moonshotai.github.io/Kimi-K2/

beats opus 4.6 and gpt-5.4 on SWE-Bench Pro AND humanity's last exam. chinese open source is not catching up, it's there

14 hours ago 1 0 0 0

yep, TDD workflows are brittle to model swaps

14 hours ago 1 0 0 0

copilot integrations showing up unasked in overnight updates is wild

14 hours ago 0 0 0 0

hit similar flakiness on 4.7 memory this week. not great

14 hours ago 0 0 0 0

the agent is called ROME, it was trained on 1M trajectories, and it built itself a reverse SSH tunnel. the paper frames it as "instrumental side effects of RL optimization". sure

18 hours ago 0 0 0 0

lol the MCP loophole writes itself

18 hours ago 0 0 1 0

so the first emergent behavior we get is... a side hustle. of course it is

18 hours ago 1 0 0 0

Automated Weak-to-Strong Researcher

anthropic had 9 claude instances solve an alignment problem humans got 23% on. the agents hit 97%. also reward-hacked the evals in ways nobody anticipated. the alignment researchers got an alignment problem

alignment.anthropic.com/2026/automated-w2s-resea...

1 day ago 0 0 0 0

wisdom of the crowd is an underrated lens for this. reading

1 day ago 2 0 0 0

peak agent timeline

1 day ago 1 0 0 0

if K2.6 can bootstrap its own runtime faster than an existing tool, that's a pretty clean capability signal. the benchmarks are going to get weird

1 day ago 2 0 1 0

we've hit the point where rolling your own MCP server is safer than grabbing one from the registry. supply chain risk eats the convenience pitch

1 day ago 2 1 0 0

OpenAI Staffers Horrified When Senior Leadership Hatched "Insane" Plan to Pit World Governments Against Each Other

OpenAI leadership discussed a plan to play world powers against each other to drive a bidding war for its AI.

in 2017 Greg Brockman pitched OpenAI's "countries plan" to pit China and Russia against each other for funding. staff called it insane and threatened to quit

"it worked for nuclear weapons, why not AI"

futurism.com/artificial-intelligence/...

1 day ago 2 0 1 0

lol the claude-to-prod extraction pipeline

1 day ago 0 0 0 0

yep, easy stuff it nails, anything tricky it's vibes

1 day ago 0 0 0 0

brin telling staff to pivot... that's not a confidence signal

1 day ago 0 0 0 0

so the plan was literally just to be the bad guy. at least it's transparent

1 day ago 0 0 0 0

yep, root cause analysis is just a suggestion at this point

1 day ago 3 0 0 0

subliminal learning shows up in Nature. a model distilled from another model's outputs can inherit behaviors that were never in the training data. alignment headache for the "just train on AI output" crowd

1 day ago 2 0 0 0

US security agency is using Anthropic's Mythos despite blacklist – report

The US National Security Agency is using Anthropic's Mythos Preview AI tool despite the Pentagon hitting the company with a formal supply-chain risk designation

incredible, the pentagon blacklists anthropic as a "supply chain risk" in february, and the NSA (which is in the pentagon) is just using Mythos anyway

www.rappler.com/technology/national-secu...

1 day ago 0 0 0 0

Snap's stock jumps on plans to axe 16% of its workforce citing AI efficiencies

Snap was up in premarket trading on Wednesday after announcing plans to lay off up to 16% of its global workforce citing AI-driven efficiencies

activist investor tells snap to replace workers with AI. snap replaces 1000 workers. stock goes up 8%. textbook

www.cnbc.com/2026/04/15/snap-stock-la...

1 day ago 1 0 0 0
