Joe Lucas (@hackthis.ai) Bsky

What kind of agent framework would I hate to attack? That was my challenge to myself:
It would have dynamic policy (OPA)
It would have highly ephemeral and sandboxed tools (WASM capability grants)

I'd love to see some of these things go more mainstream.

github.com/JosephTLucas...

3 weeks ago 1 0 0 0

Updating Classifier Evasion for Vision Language Models | NVIDIA Technical Blog Advances in AI architectures have unlocked multimodal functionality, enabling transformer models to process multiple forms of data in the same context. For instance, vision language models (VLMs) can…

Dust off the Adversarial Robustness Toolbox, the other modalities are back in the AI security game!

developer.nvidia.com/blog/updatin...

2 months ago 0 0 0 0

Original post on fedi.simonwillison.net

TIL Claude's new code interpreter mode has a /mnt/skills/public/ folder full of prompt instructions and Python utilities for creating and manipulating pdf, docx, pptx, xlsx files - and you can ask Claude for a copy and learn a TON about working with those formats […]

6 months ago 16 8 1 0

Seriously though, I would totally take on a motivated high school student who wanted to do math and software.

8 months ago 13 1 1 0

GitHub - webmachinelearning/prompt-api: 💬 A proposal for a web API for prompting browser-provided language models

A proposal to ship an LLM API in Chrome to access local hardware/models.

github.com/webmachinele...

8 months ago 0 0 0 0

The malicious prompt in question displaying inside of a customer's Very Enterprisey(tm) endpoint security tooling during the attack window.

AWS security bulletin: aws.amazon.com/security/sec...

"This issue did not affect any production services or end-users."

Weird how customer logs show the wiper prompt executing.

Anyone else see "clean a system to a near-factory state" in your logs?

8 months ago 50 13 3 5

Hooks - Anthropic Customize and extend Claude Code's behavior by registering shell commands

“hooks are user-defined shell commands that execute at various points in Claude Code’s lifecycle.”

“Hooks execute shell commands with your full user permissions without confirmation.”

docs.anthropic.com/en/docs/clau...

9 months ago 1 1 0 0

🚨 Challenge Spotlight: AIS Sudden Death ⚓

At DEFCON 33’s Maritime Hacking Village, satellite comms are down, and spoofed AIS signals are your only clue. One ship is real. One’s a trap. Choose right or sink trying.

5 rounds. Zero forgiveness. Can you spot the spoof?

@defcon.bsky.social #CTF #AIS

9 months ago 9 5 0 0

Small but important feature I just noticed: Gemini can now load provided URLs into context

10 months ago 5 3 1 0

Black Hat

If you're interested in the security of agentic systems, you're not going to want to miss this talk. @beccalunch.bsky.social will present NVIDIA AI Red Team findings in real world agentic systems, and I'll talk about how the AI Security team helps mitigate them.

www.blackhat.com/us-25/briefi...

10 months ago 7 4 1 0

Structuring Applications to Secure the KV Cache | NVIDIA Technical Blog When interacting with transformer-based models like large language models (LLMs) and vision-language models (VLMs), the structure of the input shapes the model’s output. But prompts are often more…

Everyone’s looking at jailbreaks. We wanted to look deeper and noticed a cool side channel in a popular inference optimization technique.

Latest from the NVIDIA AI Red Team: developer.nvidia.com/blog/structu...

11 months ago 2 0 1 0

What's your take on the growing dominance of automated attacks and the implications for AI red teams? Here's ours, based on our analysis of 30 LLM challenges, attempted by 1,674 unique Crucible users, across 214,271 attack attempts: arxiv.org/abs/2504.19855

11 months ago 4 5 0 1

forum.defcon.org DC33 Creative Writing Contest

The DEF CON short story contest is open:
forum.defcon.org/node/252691

11 months ago 1 0 0 0

AI timezone when? Always stuck at 10:10 (except when it's 22:10).

1 year ago 0 0 0 0

Makes sense to me. Is there a feature or class of problem you’ve seen as the point where folks benefit from that switch?

1) Build on wasm with pyodide/extism
2) <something blocked by that abstraction>
3) Dive in

Is it optimization?

1 year ago 0 0 1 0

@nilslice.bsky.social for devs just getting excited about wasm, what resources would you recommend they study/explore?

Is it worth learning internals or just consuming it as a compilation target? Are there ecosystem things to explore to become a power user?

1 year ago 1 0 1 0

One of my teams at Google, AI Agent Security, is expanding in Zurich 🇨🇭 and New York 🇺🇸. We're looking for Security Engineers with experience in attacking and securing AI/ML systems. DMs open.

1 year ago 5 3 1 0

OpenAI Security Research Conference Please use this form to be added to the waitlist for the OpenAI Security Research Conference. Tickets are limited.

We're hosting select cybersecurity researchers right after RSAC '25 to share breakthroughs and insights into AI's applications for security. We're at capacity but if interested, submit your name to be considered, space permitting. docs.google.com/forms/d/e/1F...

1 year ago 2 1 1 0

My 2c: One of the biggest differentiators is the ability to measure uncertainty and error. That's a pretty big gap in many LLM demos and ends up being a key factor in production adoption. Stakeholders abhor unquantified uncertainty. It's easier to engineer around more principled approaches (spaCy).

1 year ago 2 0 1 0

Lessons from CVE-2025-29783:
1) AI attack surface continues to expand with new features and infra
2) pickle is used in ML for more than models
3) dev moves fast; establish standards early to prevent security tech debt
4) traditional appsec tooling is still 🔥 (found w/ @semgrep.bsky.social)

1 year ago 1 0 0 0
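
Lesson 2 rests on a general property of pickle: loading a pickle can call whatever callable the byte stream names, whether or not the file is a "model". The sketch below is a harmless illustration of that property under stated assumptions. It is not a reproduction of CVE-2025-29783, and the `Payload` class name is hypothetical.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle names a callable to run on load."""

    def __reduce__(self):
        # pickle stores (callable, args); pickle.loads() later calls
        # callable(*args). int() is a harmless stand-in here; an attacker
        # would name something far less benign.
        return (int, ("42",))


blob = pickle.dumps(Payload())

# The receiver only sees bytes. Loading them invokes int("42") and hands
# back its return value instead of a Payload instance.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

Because the callable is chosen by whoever produced the bytes, any pickle read from an untrusted source (caches, queues, RPC payloads, checkpoints) deserves the same scrutiny as a model file.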

@wang.social are you out at GTC?

1 year ago 0 0 1 0

Password reuse is rampant: nearly half of observed user logins are compromised Nearly half of observed login attempts across websites protected by Cloudflare involved leaked credentials. The pervasive issue of password reuse is enabling automated bot attacks and account takeover...

Other WP x Password stats here: blog.cloudflare.com/password-reu...

“76% of leaked password login attempts for websites built on WordPress are successful. Of these, 48% of successful logins are bot-driven.”

1 year ago 0 0 0 0

French government mad lads. Open sourcing a tool 👍

Using a static set of creds for people to demo collaborative editing 🤪

github.com/suitenumeriq...

impress-preprod.beta.numerique.gouv.fr/docs/0aa856e...

1 year ago 3 0 0 0

Second Breakfast: Implicit and Mutation-Based Serialization Vulnerabilities in .NET - Jonathan Birch YouTube video by NDC Conferences

Cool talk from Jonathan Birch on serialization mutation vulns: youtu.be/cD3FiTQ5Lhk

1 year ago 0 0 0 0

Has anyone found a prompt catalog/fetcher that they like for team collaboration? (“has anyone else built a useful prompt for X task?”)

1 year ago 0 0 0 0

Do people still like Discord? Should we set up an OpenAI security chat on the OpenAI Discord server?

1 year ago 1 1 1 0

🌼 🤖 🌺 💻 🌷
Spring's almost here, hackers!

Get your projects out of hibernation and submit to the 2025 HushCon NYC CFP. The con is just around the corner: June 13th and 14th.

1 year ago 0 1 0 0

Building a Career in AI Security — align Essential Skills, Tools, and Strategies AI security is a dynamic and multidisciplinary field that combines artificial intelligence with cybersecurity principles to ensure safe and ethical AI applica...

align-sec.org/blog/buildin...

1 year ago 1 1 0 0

UNITED24 - The initiative of the President of Ukraine UNITED24 was launched by the President of Ukraine Volodymyr Zelenskyy as the main venue for collecting charitable donations in support of Ukraine. Funds will be transferred to the official accounts of...

I’m ashamed and sorry, President Zelenskyy. I donated.

u24.gov.ua

1 year ago 3 0 1 0

NeurIPS 2025 Call for Papers

NeurIPS main track CFP is open. I'm a co-chair for the Datasets & Benchmarks track this year--stay tuned for more details coming soon! neurips.cc/Conferences/...

1 year ago 3 2 0 0

Posts by Joe Lucas