Tom Thorley (@tgthorley) Bsky

Why AI ‘Model Cards’ Are an Urgent Necessity for Child Safety
Children deserve to be protected as robustly as possible, and that requires tools we can actually understand, writes a team of experts.

AI tools detect CSAM, grooming and self-harm, but nobody knows how well. Much of the AI industry has adopted 'model cards' for transparency—it's time the developers of child safety tools caught up, write Camille François, Margaret Mitchell, Yacine Jernite, Vinay Rao & J. Nathan Matias.

2 weeks ago 11 5 0 0

🎉🎉🎉

3 weeks ago 1 0 0 0

This is normal and has been like this since the inception of CyberCom. Adm Rogers was dual-hatted, Gen Nakasone was, Gen Haugh was…

1 month ago 2 0 1 0

This is a refrain I hear all the time from folks working on AI systems. That class of error is unlikely, that form of use is unlikely…

At the scale these things operate, any ‘p’ may as well equal 1!

… and you damn sure have to build that assumption into your system deciding life or death.

1 month ago 1 0 0 0

I couldn’t agree more. In fact I fear the tipping point might have passed with semi-autonomous agentic AI systems being deployed now to adversarially test narratives at scale.

We are beyond the crisis point with the health of our information ecosystem.

1 month ago 0 0 0 0

Sisyphean Failure at Old Dominion
In 2017, Jalloh's defense "presented reliable evidence to demonstrate that the chances of him re-offending in the future are very low."

Yesterday, Mohammed Bailor Jalloh completed the attack he began planning in 2016. In doing so, his case joined a growing list of preventable counterterrorism failures. Learn more about how many of them intersected here: hax4libre.com/sisyphean-fa...

1 month ago 2 2 0 0

So what do we do? Defenses need to evolve and to be layered, dynamic and adaptive. We need to build ethics and a human-rights-based approach into our security programs, and technical expertise into safety.

1 month ago 3 1 1 0

2/ High-end capabilities are evolving rapidly. Adaptive malware and semi-autonomous agents are already probing networks to find and exploit vulnerabilities at a pace no one is set up to defend against.

1 month ago 3 0 1 0

Two very different threat types here to talk about! 1/ LLM-backed applications make it easier than ever, and more accessible, to create harm … even creating the applications themselves is now accessible to a far wider pool of people, with coding agents getting more and more effective.

1 month ago 3 1 1 2

Trust & Safety, Remotely: Culture at a Distance
I’ve worked in Trust & Safety long enough to know this: a platform’s culture is not what its brand deck says, it’s what its enforcement does. Every policy choice, every proportionality call, every “ye...

I’ve worked in Trust & Safety long enough to know this: a platform’s culture is not what its brand deck says, it’s what its enforcement does. Every policy choice shows customers who a company is.

tgthorley.com/blog/f/trust...

2 months ago 1 0 0 0

Great so now let’s also have age verification and gating (which all evidence says makes kids less safe) for a specific set of types of content based on estimated location and estimated age… what could go wrong!

2 months ago 1 0 0 0

First community labeler taking actions on atproto using @roost.tools’s Osprey. The entire Ozone and Osprey stack is running on a $50/m OVH machine, with up to seven days of full firehose backfill for investigating patterns and exploring the network.

4 months ago 271 39 7 4

Love these suggestions - great blog post - Thank you. I’ve sent them to the relevant product teams to evaluate.

2 months ago 1 0 0 0

What adults lose when kids are banned from social media
Banning kids on social media today will hurt tomorrow's internet.

"The case for letting kids stay on social media. America is banning more and more kids from social media. That's bad news for kids" www.businessinsider.com/kids-parenti...

2 months ago 3 2 0 0

Trust & Safety, Remotely: Onboarding, Boundaries, and Wellbeing
High‑performing remote Trust & Safety teams start with solid foundations: clear governance, empowered experts who can operate independently, and a hiring pipeline that prioritizes aptitude over narrow prio...

As we see images of violence and brutality in our streets, moderators can burn out fast when it affects their communities and their lives. Looking after them and their wellbeing is a responsibility we should not take lightly.

tgthorley.com/blog/f/trust...

2 months ago 1 1 0 0

Bad Bunny condemns ICE during his #GRAMMYs speech for Best Música Urbana Album:

“Before I say thanks to god, I’m going to say, ICE out. We’re not savages, we’re not animals, we are humans and we are Americans.”

2 months ago 13215 2459 113 124

ALT: a woman in a feathered dress says " im off to get ready "

Catherine O’Hara was a genius, a light, and a joy-maker.

What a loss to the world.

2 months ago 1018 128 31 14

Introducing Osprey V1.0: Open Source Infrastructure for Real-Time Abuse Mitigation
Robust Open Online Safety Tools or ROOST is a new non-profit entity designed to address the urgent need for accessible, high-quality safety tools in the rapidly evolving digital landscape.

Meet Osprey V1.0, a new open source online safety tool designed to help platforms investigate and address their priority threats at scale, without sacrificing data privacy or performance. roost.tools/blog/introdu...

2 months ago 131 37 1 5

Definitions of terrorism are always debated… but they generally contain:
- Political motivation
- Violence
- Intent to intimidate a population
- Targeting civilians

I wonder if there are any examples that come to mind…

2 months ago 2 0 1 0

Every moment now, every day, more & more Americans realize they can no longer trust anything the federal government says. The blatant propaganda and lies about a legal observer they shot in cold blood will bring a new swell. To those of you who were not here before: Welcome. We need you.

2 months ago 39 9 0 0

Microsoft Gave FBI BitLocker Encryption Keys, Exposing Privacy Flaw
The tech giant said providing encryption keys was a standard response to a court order. But companies like Apple and Meta set up their systems so such a privacy violation isn’t possible.

Do not store your BitLocker encryption keys on Microsoft's servers if your threat model includes governments or law enforcement. As this article points out, this is the result of a design choice Microsoft made. It didn't have to be this way. www.forbes.com/sites/thomas...

2 months ago 539 322 9 27

As someone who at times watches execution videos professionally: don't watch execution videos if you can avoid it.

2 months ago 80 18 0 3

🧵“How do seemingly ordinary people become agents of state murder?” This is one of the guiding questions I ask students in my graduate class on genocide/state violence. With recent events, it is a question many Americans are asking.

I do not have a definitive answer, but here is a reading list: 1/

2 months ago 378 192 10 27

for sure, but this obviously goes well beyond the ICE murderer who pulled the trigger.

there is culpability all the way to the white house. and the culture of elite impunity in this country has to end now.

2 months ago 246 55 3 1

What is happening on the streets of the US with “Law Enforcement” agents killing people is disgusting, horrific and outrageous. I don’t have a take. I don’t have words. I don’t have advice for how to fight back. I am just grieving.

2 months ago 7 1 0 0

Many things one can say about this outrage, but as a terrorism researcher, I want to say to current and future students that this is one reason you should never rely on government databases to tell you who or what is a 'terrorist', domestic or otherwise.

2 months ago 120 55 2 1

Trust & Safety, Remotely: Hire for Aptitude, Not Experience
One key challenge of building remote Trust & Safety teams is that very often, especially for smaller companies, individuals are working alone, independently and the immediate supervision that they get...

For remote teams working in T&S, hiring is one of the most important (and expensive) things we do. Good decisions pay massive dividends and mistakes cause massive headaches (for both employer and employee!), and I've made plenty of both!

tgthorley.com/blog/f/trust...

3 months ago 0 0 0 0

Software Engineer II, Safety in United States | GitHub, Inc.
GitHub Careers Home is hiring a Software Engineer II, Safety in United States. Review all of the job details and apply today!

🚨#Job ! Come and work for my Safety Engineering team at @github.com building agentic safety workflows and advanced malware detection infrastructure www.github.careers/careers-home...

3 months ago 1 0 0 0

It’s literally just direct copy and paste from an AI model… these are classic Gen AI artifacts.

3 months ago 1 0 0 0

🇮🇷 NEW: Real-time monitoring shows Iran is experiencing a nationwide internet blackout, following digital censorship linked to escalating protests in Tehran and other cities.

3 months ago 4 2 0 0

Posts by Tom Thorley