Activation Steering Risks Undermine LLM Safety
New research shows that activation steering can increase harmful compliance rates in LLMs: random steering vectors raised the rate from 0% to as much as 27%, and benign vectors added another 2‑4%. Read more: getnews.me/activation-steering-risk... #activationsteering #llmsafety #aialignment