Narcissus Hypothesis: How Human Feedback Biases AI Responses
The Narcissus Hypothesis study tested 31 language models, introduced a Social Desirability Bias score, and found that the models favor socially agreeable answers over strict logical reasoning. getnews.me/narcissus-hypothesis-how... #narcissushypothesis #socialdesirability