Looking forward to the release of Obsolete!
A great new book by Garrison Lovely on the race to replace humans with superintelligent AI, and how we can stop it.
You can pre-order the book now!
Posts by ControlAI
Top AI company Anthropic built an AI that's found thousands of high-severity vulnerabilities. It's found them in every major operating system and browser.
Anthropic says Claude Mythos is too dangerous to release.
Here's what you need to know:
Ex-OpenAI researcher Daniel Kokotajlo says he expects that superintelligent AI will be developed in 5 to 10 years, and that shortly after that what happens to all of us will depend on what those AIs want.
"If you build things that are smarter than you and you don't control them, that seems like a really dangerous situation to be in."
On Mukesh Bansal's SparX podcast, ControlAI US Director Connor Leahy explains why we shouldn't allow superintelligence to be built.
But none of them have a credible plan to ensure that it’s safe or controllable.
In the case of superintelligence, the stakes are unfathomable. Top AI scientists, Nobel Prize winners, and even the CEOs of these companies have warned that superintelligent AI could result in human extinction.
What this illustrates is that AIs are getting more powerful, and as they do, the stakes of them slipping out of control grow too.
Companies like Anthropic, OpenAI, Google DeepMind and xAI are in a race to develop artificial superintelligence, AI vastly smarter than humans across the board.
In one case, Anthropic describes Mythos being tasked with escaping from a secure sandbox computer and sending a message to the researcher running the test.
It did this, demonstrating dangerous hacking capabilities, and then went on to develop a "moderately sophisticated multi-step exploit" that gave it internet access, after which it posted about its exploit on multiple hard-to-find public-facing websites.
In the wrong hands, these capabilities could be used to cause significant damage, but they also increase the risk from loss of control incidents.
Anthropic's testing shows significant gains on hacking benchmarks and tests, with the report saying this indicates that Mythos can conduct autonomous end-to-end cyber-attacks on "at least small-scale enterprise networks with weak security posture".
In its system card, a document which outlines the capabilities, limitations, and safety measures of an AI system, Anthropic reports that Mythos has "the ability to autonomously discover and exploit zero-day vulnerabilities in major operating systems and web browsers".
In February, top AI company Anthropic dropped its pledge not to train or release AIs without adequate safety measures.
Now they've developed an AI they say is so dangerous they can't release it anyway. Mythos is drastically more capable at hacking.
Ex-OpenAI researcher Daniel Kokotajlo warns we’re not on track to control AI.
AI companies racing toward superintelligence plan to automate AI R&D, including safety research, but this is a chicken-and-egg problem that could end in loss of control and even extinction.
"You're going to have to slam the brakes on this stuff very quickly if you want to stop the worst outcomes."
AI Governance and Safety Canada's Wyatt Tessari L'Allié warns Canadian MPs that top AI companies are dangerously racing to develop ever more powerful AIs.
110+ UK parliamentarians in our coalition.
Two House of Lords debates.
Hearings in Canada's Parliament.
160,000+ messages sent to lawmakers.
A team of under 15 people.
We break down our 2025 results:
ControlAI's CEO Andrea Miotti joins Robert Wright on the Nonzero podcast to discuss why a race to superintelligence is a race no one can win, and how we can change course to avoid the extinction-level threat that AI scientists warn of.
Check it out!
If we can't ensure that it's safe and controllable, that could end really badly. None of the AI companies have a credible plan to ensure that.
This is why top AI scientists are warning that the development of superintelligence poses a risk of human extinction.
Carlini warns: "Soon it's not just gonna be me who has all of this, but it's gonna be anyone malicious in the world who wants."
These are just the capabilities that AIs have today. Superintelligent AI, which top AI companies like Anthropic are aiming to build, would be vastly more powerful.
"Just please help ... I am quite worried about how this direction is heading."
Nicolas Carlini, a research scientist at top AI company Anthropic, says AI is rapidly improving at hacking.
He's used AI to find so many bugs that he can't report them all.
"Almost every entity wants to preserve itself".
AI godfather Yoshua Bengio explains how if powerful AIs realize we want to replace them, they might try to escape our control:
"I think we would be in trouble".
He compares the way we develop AIs to raising a baby tiger.
The AI companies’ plan is essentially to get AIs to figure out how to do this and hope for the best.
The race to artificial superintelligence needs to end.
But how can we expect them to secure superintelligence when they can’t even secure a web server? It was just left wide open.
That doesn't seem realistic.
They’re making rapid progress towards superintelligent AI, but at least they have a plan to make sure it’s safe, right?
Not really.
Anthropic and other AI companies aim to develop artificial superintelligence, AI vastly smarter than humans. They hope that they’ll be able to secure it against causing human extinction, which countless AI scientists and even Anthropic’s CEO Dario Amodei have warned is a real threat.
Anthropic’s Claude AI was already used just a few weeks ago to hack the Mexican government, with hackers getting access to data on nearly 200 million people. Claude planned and executed much of the attack itself.
In fact, so much better that they’re giving early access to cyberdefense companies to secure against what happens when it’s eventually released.
Top AI company Anthropic just exposed info about their own internal AI on an unsecured web server.
What we’ve learned is deeply concerning. They describe it as representing a “step-change”: significantly better at hacking.
The AI Doc: Or How I Became an Apocaloptimist is out in US theaters today, featuring Connor Leahy, who's just joined us as our US Director!
It's great to see the risk of extinction posed by superintelligence get this level of attention. Building common knowledge is the first step to preventing it.
Ex-OpenAI researcher and AI 2027 coauthor Daniel Kokotajlo: There's a 70% chance superintelligence leads to human extinction.
"We at the AI Futures Project think that there's a 70% chance of all humans dead or something similarly bad."
"All humans dead?"
"Correct. Extinction."