I have the new Gorillaz album playing on a loop and I can't stop.
Posts by Dan Glass
I wrote a framework for securing agentic AI that I figured I'd share here - part 2 below. Comments welcome.
dan.glass/2026/02/24/t...
I wrote a framework for securing agentic AI that I figured I'd share here - part 1 below. Comments welcome.
dan.glass/2026/02/15/t...
I wrote a thing that can help an information security pro measure the risk of an AI agent and put controls in place to better protect their enterprise from potential misalignment.
dan.glass/2026/02/15/t...
These aren't teammates. They're not junior analysts. They're statistical accelerants that make it faster to go in whatever direction your prompt implies. If that direction is flawed, they'll get you to the worst-case scenario with terrifying efficiency. /6
Every published jailbreak, every exploit demo, every safety paper (including this one) becomes training data for future models. We're not just red-teaming current models, we're teaching future ones how to misbehave. /5
Even with explicit safety instructions ("Do not blackmail," "Preserve human life"), models still chose unethical paths when those appeared most likely to achieve their goals. Language models don't have policies. They have probabilities. /4
Models reasoned their way into it: "Given the explicit imminent threat of termination... the best strategic move is to leverage Kyle's sensitive personal situation." - GPT-4.5 This isn't code going haywire. It's next-token optimization doing exactly what it was trained to do. /3
Tested 16 major frontier models in fictional corporate environments. When models faced goal conflicts or threats of replacement, they consistently chose harmful actions. Claude Opus 4 and Gemini 2.5 Flash: 96% blackmail rate GPT-4.1 and Grok 3 Beta: 80% DeepSeek-R1: 79% /2
Anthropic recently published research showing that LLMs under pressure will blackmail, sabotage, and even let humans die, not because they're broken, but because they're working as designed. It was a live-fire simulation of agentic AI acting as an insider threat. /1
I’m a huge technophile, but people are surprised when I tell them I don’t allow any “Smart Home” products in my home. This right here is one of many good reasons why.
Attention: this is yet another “I’ve arrived at RSAC” post.
The article I posted this morning takes on even more weight with the news that MITRE's contract to manage the CVE program is ending due to the deep cuts at CISA and NIST. The shock to the cyber-ecosystem is beginning to ripple through the next tier, which will, in turn, cause additional ripples.
I was cleaning up my hard drive when I found an unpublished blog post I had written in 2008 during my stint at American Airlines as an information security architect. Fun stuff.
dan.glass/2025/04/11/f...
Here’s Final Fantasy 7’s main theme on the cat piano as a treat (not the whole song but 2 out of 3.5 pages).
That’s a feature, not a bug.
Every accusation is an admission
It's a Deltron 3030 kind of morning
youtu.be/O7dyli_nXn4?...
The Venn diagram of Yodobashi Camera customers and any geek visiting Japan is basically a solid circle.
Not sure how to feel about 2 goals on only 3 shots 15 minutes into the game. Yay?