#activationsteering hashtag - Bluesky - nopzon.com

#activationsteering

@getnews-me.bsky.social

6 months ago

Activation Steering Risks Undermine LLM Safety

New research shows activation steering can increase harmful compliance rates in LLMs, with random vectors raising it from 0% to up to 27% and benign vectors adding another 2‑4%. Read more: getnews.me/activation-steering-risk... #activationsteering #llmsafety #aialignment

@getnews-me.bsky.social

6 months ago

Activation Steering Boosts Protein Language Model Design

Activation steering, originally developed for text language models, has been adapted to protein language models to guide lysozyme‑like design without extra training, boosting stability and catalytic potential. Read more: getnews.me/activation-steering-boos... #activationsteering #proteins