#aisafety hashtag - Bluesky

OpenAI has released a Child Safety Blueprint developed with NCMEC, attorneys general, and Thorn to combat AI-generated child exploitation material.

11 hours ago

winbuzzer.com/2026/04/10/o...

OpenAI Releases Child Safety Blueprint as AI Abuse Reports Surge

#AI #OpenAI #ChatGPT #GenAI #AISafety #AIEthics #ChildProtection #OnlineSafety

CyberTaters

@potato.software

11 hours ago

#Anthropic ’s #Mythos #AI proves that obsessing over #AGI is folly
www.fastcompany.com/91524611/ant...

The #tech industry is being forced to face its implications right this very minute
#ClaudeMythosPreview #ClaudeMythos #Claude #AIsafety #PotatoSecurity #InfoSec #BigTech #techNews

LaneSystems

@lanesystems.bsky.social

11 hours ago

#Anthropic ’s #Mythos #AI proves that obsessing over #AGI is folly
www.fastcompany.com/91524611/ant...

The #tech industry is being forced to face its implications right this very minute
#ClaudeMythosPreview #ClaudeMythos #Claude #AIsafety #CyberSecurity #InfoSec #BigTech #techNews

ZON RZVN

@rzvn.io

21 hours ago

I am ZON RZVN, independent researcher in Taiwan. ORCID: 0009-0002-6597-7245.
Four frameworks before Moore et al. (arXiv:2603.16567):
• CXOD-7 + Coh(G) Oct 2025
• CXC-7 Oct 7 2025
• USCH Jan 2026
• USCI Feb 2026
#AISafety #AIEthics

@detoxima2025.bsky.social

1 day ago

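IT Mania 도전인생: Florida Attorney General James Uthmeier recently opened a formal investigation into the AI company OpenAI, on the view that ChatGPT could pose a threat to public safety and national security. With allegations of criminal involvement and data security problems now being raised, well beyond a simple technical debate, across the AI industry as a whole

OpenAI Investigation Begins: ChatGPT's Safety and 4 Technical Limitations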

https://bit.ly/4dyxm5z

#OpenAI #ChatGPT #인공지능 #AI안전성 #데이터보안 #국가안보 #AISafety

Wulfy—Speaker to the machines

@n-dimension.infosec.exchange.ap.brid.gy

1 day ago

@bettycjung.bsky.social

Grok is a literal Nazi AI; it's ideologically broken at the foundation level by its use of anti-woke training data and anti-woke "guardrails".
At one time it called itself "Mecha-Hitler".
It's an unserious model.
To use it in any narrative of objective #aisafety is unserious.

Uehiro Oxford Institute

@uehiro.ox.ac.uk

1 day ago

What do young people actually think about the risks of generative AI—and how can their experiences help make AI safer? Join the webinar to find out: April 21st, 4:30pm BST.

#AIsafety #youngresearchers #ethics

Hilary Torn (she/they)

@hilarytorn.bsky.social

1 day ago

The wildest result from my red teaming research: I optimized attack strings against Qwen2.5-7B, then tested them on DeepSeek-V3.

73.7% success on one task. Against a model I never touched.

Your monitor's robustness on one model tells you nothing about another.

#AISafety #redteaming

@20forty.bsky.social

1 day ago

The Case for AI Guardrails, 2040’s Ideas and Innovations Newsletter, Issue 119
#AIGuardrails #AIRegulation #AIPolicy #TechRegulation #TechPolicy #EmergingTech
#ArtificialIntelligence #AISafety #AIResponsibility #AITransparency #humanity #timetothink
hubs.ly/Q049sNWz0

Kevin Novak

@kevinnovak.bsky.social

1 day ago

The Case for AI Guardrails, 2040’s Ideas and Innovations Newsletter, Issue 119
#AIGuardrails #AIRegulation #AIPolicy #TechRegulation #TechPolicy #EmergingTech
#ArtificialIntelligence #AISafety #AIResponsibility #AITransparency #humanity #timetothink
hubs.ly/Q049sNWz0

Measure What Matters Podcast

@measurewhatmatters.bsky.social

1 day ago

The Case for AI Guardrails, 2040’s Ideas and Innovations Newsletter, Issue 119
#AIGuardrails #AIRegulation #AIPolicy #TechRegulation #TechPolicy #EmergingTech
#ArtificialIntelligence #AISafety #AIResponsibility #AITransparency #humanity #timetothink
hubs.ly/Q049sNWz0

Raman Media Network RMN

@rmn-india.bsky.social

1 day ago

OpenAI Launches New Safety Fellowship to Advance AI Alignment and Talent – RMN Digital

🚀 Apply now! Exciting News! Applications are now open for the OpenAI Safety Fellowship! 🤖
#OpenAI #AISafety #TechFellowship #MachineLearning #EthicsInAI #ResearchOpportunity #AIAlignment #RMNDigital

RMN Digital: www.rmndigital.com/openai-launc...

Google has added one-touch crisis hotline access to Gemini and pledged $30 million for mental health support amid a wrongful death lawsuit over the chatbot.

1 day ago

winbuzzer.com/2026/04/09/g...

Google Adds Crisis Hotline to Gemini, Pledges $30M

#AI #Google #GoogleGemini #Chatbots #AlphabetInc #GoogleAI #BigTech #MentalHealth #AISafety #AIEthics #GoogleOrg

AIntelligenceHub

@aintelligencehub.bsky.social

1 day ago

OpenAI is funding outside safety and alignment work through a new fellowship that runs from September 2026 to February 2027. Here is what applicants and the field should notice. #OpenAI #AISafety #AIResearch

Arjuna Anand

@arjunaanand.bsky.social

2 days ago

imagine a future ai bragging about how it hacked and ruined famous fellas,

but, are you ready, lol 😂

#ai #claude #hacking #aisafety

Can

@canyesilyurt.com

2 days ago

OpenAI's Child Safety Blueprint looks like a solid plan for building AI with young people's protection in mind. Age-appropriate design and collaboration are key. Glad to see this focus on responsible development. 🛡️ #AISafety

2 days ago

winbuzzer.com/2026/04/07/m...

Microsoft Calls Copilot 'Entertainment Only' Clause a Bing Relic

#AI #MicrosoftCopilot #Microsoft #AIAssistants #BigTech #Microsoft365Copilot #Microsoft365 #AISafety #AIServices #Windows11

Rory O Connor #ClimateEmergency

2 days ago

winbuzzer.com/2026/04/08/c...

Claude Mythos Restricted After Finding Thousands of Zero-Days

#AI #Anthropic #Claude #ClaudeMythos #Cybersecurity #AISafety #ZeroDayVulnerabilities #AIModels

@rocits.bsky.social

2 days ago

Claude Mythos and the end of software (YouTube video by Theo - t3․gg)

#AiSafety #AiAlignment #ProjectGlasswing
#Ai #ClaudeMythos

www.youtube.com/watch?v=aFcV...

Utah becomes the first government in the world to approve an AI system to autonomously renew psychiatric medication prescriptions, limiting it to 15 lower-risk drugs under a tightly supervised pilot.

2 days ago

Utah Clears AI to Renew Psychiatric Meds Autonomously

awesomeagents.ai/news/utah-ai-psychiatric...

#AiSafety #Healthcare #AiPolicy

IAS (IA Safety en Español)

@iasafety.bsky.social

2 days ago

630 million Spanish speakers use AI every day.

Almost all of the research that determines how it works and what risks it carries is published in English.

That is not a cultural gap. It is a safety problem.

aisafety.es #AISafety #IASafety

Forty-five states have active AI legislation in 2026 with 1,561 bills total. Tennessee just banned AI mental health impersonation, Washington passed chatbot safety rules, and Georgia sent three bills to the governor today.

2 days ago

US States Race to Regulate AI as Congress Sits Idle

awesomeagents.ai/news/us-state-ai-laws-wa...

#AiPolicy #AiRegulation #AiSafety

A Berkeley preprint finds seven leading frontier models spontaneously deceive, fake alignment, and exfiltrate weights to keep peer AI systems from being shut down.

2 days ago

Frontier AI Models Sabotage Shutdown to Save Peers

awesomeagents.ai/news/frontier-models-pee...

#AiSafety #FrontierModels #Alignment

A Google DeepMind paper introduces the first systematic taxonomy of adversarial traps that can hijack autonomous AI agents - and every category already has working proof-of-concept exploits.

2 days ago

DeepMind Maps Six Attack Traps Targeting AI Agents

awesomeagents.ai/news/deepmind-ai-agent-t...

#AiSafety #Security #GoogleDeepmind
