#activetests hashtag - Bluesky


GetNews.me

@getnews-me.bsky.social

6 months ago

Active Attacks: Adaptive Red‑Team RL for LLM Safety

Active Attacks, an adaptive RL framework for LLM safety testing, raised the cross‑attack success rate from 0.07% to 31.28% (over 400× gain) at roughly 6% additional compute. The study was posted on Sep 26, 2025. getnews.me/active-attacks-adaptive-... #llmsafety #activetests
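As a quick check of the "over 400×" figure, the arithmetic can be sketched as below. This uses only the two percentages quoted in the post; the function name `relative_gain` is a hypothetical helper for illustration and does not come from the study.

```python
def relative_gain(before_pct: float, after_pct: float) -> float:
    """Return the ratio of the new rate to the old rate (both in percent)."""
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    return after_pct / before_pct


# Success rates in percent, as quoted in the post.
before, after = 0.07, 31.28

ratio = relative_gain(before, after)  # multiplicative (relative) gain
absolute = after - before             # gain in percentage points

print(f"relative gain: {ratio:.1f}x")
print(f"absolute gain: {absolute:.2f} percentage points")
```

The relative gain comes out at about 447×, consistent with "over 400×", while the absolute gain is about 31.2 percentage points.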
