Hashtag

#LLMpersona

2 months ago

An LLM's response traits (rather than #LLMpersona) drift during interactions, especially in emotionally charged contexts, and this drift can lead to harmful behaviors. The authors propose activation capping along a learned "Assistant trait" axis, together with prompt steering, to stabilize response traits.
arxiv.org/pdf/2601.10387
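
To make "activation capping along an axis" concrete, here is a minimal sketch of one plausible reading: project a hidden-state vector onto a fixed direction, clamp that projection to a range, and leave the orthogonal part untouched. The function names, the plain-list vectors, and the bounds `lo`/`hi` are illustrative assumptions, not taken from the paper, whose actual procedure may differ.

```python
import math


def projection(hidden, axis):
    """Scalar component of `hidden` along the (normalized) `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        raise ValueError("axis must be non-zero")
    return sum(h * a for h, a in zip(hidden, axis)) / norm


def cap_along_axis(hidden, axis, lo, hi):
    """Clamp the component of `hidden` along `axis` to [lo, hi].

    Hypothetical helper: only the component along `axis` is changed;
    everything orthogonal to it is preserved.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        raise ValueError("axis must be non-zero")
    unit = [a / norm for a in axis]
    proj = sum(h * u for h, u in zip(hidden, unit))
    capped = min(max(proj, lo), hi)
    delta = capped - proj  # zero when the projection is already in range
    return [h + delta * u for h, u in zip(hidden, unit)]


if __name__ == "__main__":
    # Toy 2-D example: the "trait" axis is the second coordinate.
    state = [3.0, 4.0]
    trait_axis = [0.0, 2.0]
    print(cap_along_axis(state, trait_axis, lo=-1.0, hi=1.0))
```

In a real model the vector would be a layer's activation and the axis would be learned from data; the toy example only shows that the capped state keeps its off-axis component while its on-axis component is pulled back into range.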
