
Posts by The Midas Project Watchtower

Anthropic | Updated its RSP noncompliance reporting and anti-retaliation policy

The changes largely seem like an improvement over the former policy, and more frontier AI companies ought to release similar guidance for their employees.

A full diff is available on our website at www.themidasproject.com/watchtower/a...

3 weeks ago

Company: Anthropic
Date: March 24th, 2026
Change: Updated its RSP noncompliance reporting and anti-retaliation policy.

3 weeks ago
Anthropic | Updated its Frontier Compliance Framework without public announcement

Read the full diff on our website: www.themidasproject.com/watchtower/a...

1 month ago

But a broader point, of which we are confident, is that companies should take SB 53's "clear and conspicuous" requirement far more seriously.

Updates to their safety frameworks ought to be as legible and as well justified as companies can make them.

1 month ago

On the other hand, narrowing the scope for AI R&D from measuring a model's general autonomous capabilities to a few specific fields may weaken it.

1 month ago

Do these changes make the policy stronger? This is up for debate.

Clearly, adding thresholds for harmful manipulation is an improvement over not having any.

1 month ago
Redline of changes in FCF


Luckily, we have our own copy. We've published a diff of the changes on our website (themidasproject.com/watchtower/a...). Below are the key sections that were modified:

1 month ago

Making matters worse, unlike with the RSP, Anthropic doesn't publish past versions of the FCF, so there is no easy way to compare the text.

While the new document includes a brief changelog at the end, the previous December 2025 framework has essentially been overwritten and erased from the trust center.

1 month ago

Instead of a public announcement, Anthropic slipped the updated file into its trust center and left it off its update log.

Does quietly overwriting a PDF (even with a changelog at the bottom) really satisfy the legal requirement for clear and conspicuous disclosure?

1 month ago
SB 53 text


Under California’s SB 53, AI developers must "clearly and conspicuously publish" material modifications to their safety frameworks within 30 days.

The statute includes this language to guarantee public oversight when companies alter their binding commitments.

1 month ago
FCF changelog


That being said, if you click through to read the document, you will find a changelog at the bottom revealing that a new version has been uploaded (although it doesn’t provide many concrete details of what’s new in V2, or why).

1 month ago

Anthropic seemingly did not announce the update.

Clicking the link in the Dec. blog post that first revealed the policy sends you to their trust center, where the new document lives with no obvious mention of the change (including no mention in the trust center's update log!).

1 month ago

Company: Anthropic
Date: March 2nd

You probably didn’t notice, but a few weeks ago, Anthropic quietly updated its legally binding safety framework, the Frontier Compliance Framework (FCF). We took a look at what changed. 🧵

1 month ago
Google | Updated their Frontier Safety Framework

On the whole, it's good that Google is continuing to update its risk management policies, and they seem to treat the issue with much more seriousness than some competitors.

Read the full diff at our website: www.themidasproject.com/watchtower/g...

6 months ago

Remember that in 2024 Google promised to *define* specific risk thresholds, not explore illustrative examples.

6 months ago

Additionally, as pointed out by Zach Stein-Perlman of AI Lab Watch, the CCLs for misalignment, which used to be a concrete (albeit initial) approach, are now described as "exploratory" and "illustrative."

6 months ago

Similarly, for ML R&D, models that "can" accelerate AI development no longer require RAND SL 3; only models that have actually been used for this purpose count. But this is a strange ordering: shouldn't the safeguards precede the deployment (and even the training) of such a model?

6 months ago

But it's weakened in other ways.

Critical capability levels, which previously focused on capabilities (e.g. "can be used to cause a mass casualty event"), now seem to rely on anticipated outcomes (e.g. "resulting in additional expected harm at severe scale").

6 months ago

In their blog post, Google describes this as a strengthening of the policy.

And in some ways, it is: they define a new harmful manipulation risk category, and they even soften the v2 caveat that they would only follow their commitments if every other company did so as well.

6 months ago

Date: September 22, 2025
Company: Google
Change: Released v3 of their Frontier Safety Framework

6 months ago
Responsible AI: The mission of the Responsible AI and Human Centered Technology (RAI-HCT) team is to conduct research and develop methodologies, technologies, and best practices to ensure AI systems are built respons...

Old page: web.archive.org/web/20250206...

Current page: research.google/teams/respon...

1 year ago

Date: Feb 26 - March 6, 2025
Company: Google
Change: Scrubbed mentions of diversity and equity from the mission description of their Responsible AI team.

1 year ago

(Removed)

1 year ago

(Removed)

1 year ago

The smaller changes made to Anthropic's practices:

(Added)

1 year ago

The good news is that the details they provide on internal practices have changed very little (select screenshots included in rest of thread).

Now all they need to do is provide transparency on *all* the commitments they've made, and disclose when they choose to abandon any.

1 year ago

Most surprisingly, there is now no record of the former commitments on Anthropic's transparency center, a web resource they launched to track their compliance with voluntary commitments and which they describe as "raising the bar on transparency."

1 year ago

In fact, post-election, multiple tech companies confirmed their commitments hadn't changed.

Perhaps they understood that the commitments were not contingent on whatever way the political winds blow, but made to the public at large.

fedscoop.com/voluntary-ai...

1 year ago

While there is a new administration in office, nothing in the commitments suggested that the promise was (1) time-bound or (2) contingent on the party affiliation of the sitting president.

1 year ago
FACT SHEET: Biden-Harris Administration Secures Voluntary Commitments from Leading Artificial Intelligence Companies to Manage the Risks Posed by AI | The White House

The White House Voluntary Commitments, made in 2023, were a pledge to conduct pre-deployment testing, share information on AI risk management frameworks, invest in cybersecurity, implement bug bounties, and publicly report capabilities and limitations.

bidenwhitehouse.archives.gov/briefing-roo...

1 year ago