Why Defense-Specific LLM Testing Is a Game-Changer for AI Safety
In an era where AI models are increasingly deployed in high-stakes environments, generic evaluation tools no longer cut it. That’s...
#aisafety #llmevaluation #defense #hallucinationdetection
PsiloQA: Multilingual Dataset for Span-Level Hallucination Detection
PsiloQA, a multilingual dataset for span‑level hallucination detection, covers fourteen languages with auto‑annotated error spans. Encoder models outperformed other methods. getnews.me/psiloqa-multilingual-dat... #hallucinationdetection #multilingual
TraceDet Boosts Hallucination Detection for Diffusion Language Models
TraceDet, a new framework for diffusion language models, improves hallucination detection by an average of 15.2% AUROC, enabling earlier identification of false outputs. getnews.me/tracedet-boosts-hallucin... #tracedet #hallucinationdetection
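For readers unfamiliar with the metric, AUROC is the probability that a detector scores a randomly chosen hallucinated output above a randomly chosen faithful one. The sketch below computes it by direct pairwise comparison. The function name, scores, and labels are hypothetical illustrations and are not taken from TraceDet. The blurb also does not say whether the 15.2% figure is a relative gain or an absolute difference in AUROC points.

```python
def auroc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical detector scores (higher = more likely hallucinated)
# and gold labels (1 = hallucinated, 0 = faithful).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(auroc(scores, labels))  # 8 of the 9 pairs are ordered correctly
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why gains are usually read against the 0.5 floor.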
ACT-ViT Improves Hallucination Detection in Large Language Models
ACT-ViT, a Vision-Transformer that treats LLM activation tensors as images, outperformed traditional probes and was presented at NeurIPS 2025; its code is on GitHub. getnews.me/act-vit-improves-halluci... #actvit #hallucinationdetection
Graph Neural Approach Boosts Hallucination Detection in LLMs
CHARM builds token‑level graphs and uses GNNs to spot hallucinations in LLM output, delivering higher detection accuracy and zero‑shot transfer. Read more: getnews.me/graph-neural-approach-bo... #hallucinationdetection #llm
Semantic Reformulation Entropy Improves Hallucination Detection in QA
Researchers introduced Semantic Reformulation Entropy, a method for detecting hallucinations in QA models that showed improved performance on SQuAD and TriviaQA. Read more: getnews.me/semantic-reformulation-e... #hallucinationdetection #qa
Reference‑Free Token‑Level Hallucination Detection via Variance Signals
A method flags LLM hallucinations by measuring token variance. On SQuAD v2 prompts, 125M, 1B, and 7B models showed variance spikes at fabricated tokens. Read more: getnews.me/reference-free-token-lev... #hallucinationdetection #variancesignals
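The blurb does not say which quantity's variance is measured, so the following is only a minimal sketch under one assumption: that the same token sequence is scored several times (for example, across stochastic forward passes) and the variance of each token's log-probability is compared against a threshold. The function name, sample values, and threshold are all hypothetical.

```python
from statistics import pvariance


def flag_high_variance_tokens(logprob_runs, threshold):
    """Return per-position variances and the positions whose variance exceeds threshold.

    logprob_runs: list of runs, each a list of per-token log-probs (equal lengths).
    """
    n_positions = len(logprob_runs[0])
    if any(len(run) != n_positions for run in logprob_runs):
        raise ValueError("all runs must score the same number of tokens")
    variances = [
        pvariance([run[i] for run in logprob_runs]) for i in range(n_positions)
    ]
    flagged = [i for i, v in enumerate(variances) if v > threshold]
    return variances, flagged


# Hypothetical log-probs for a 4-token answer scored in three runs.
runs = [
    [-0.1, -0.2, -3.0, -0.1],
    [-0.1, -0.3, -0.5, -0.2],
    [-0.2, -0.2, -1.8, -0.1],
]
variances, flagged = flag_high_variance_tokens(runs, threshold=0.5)
print(flagged)  # only position 2 varies strongly across runs
```

No reference answer is needed for this check, which matches the "reference-free" framing in the headline. The threshold would have to be tuned on held-out data.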
Hallucination Detectors Fail Out-of-Distribution Generalization
An EMNLP 2025 Findings paper shows that hallucination detectors drop to linear‑probe levels when spurious cues are removed and perform near‑random on out‑of‑distribution tests. Read more: getnews.me/hallucination-detectors-... #hallucinationdetection #nlp
Semantic Reformulation Entropy Improves Hallucination Detection in QA
Semantic Reformulation Entropy (SRE) boosts hallucination detection in QA, beating baselines on SQuAD and TriviaQA. The study was submitted in September 2025. Read more: getnews.me/semantic-reformulation-e... #hallucinationdetection #qa
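The blurb does not describe how SRE is computed, so the sketch below is not the paper's method. It only illustrates the general idea behind entropy-based uncertainty signals: sample several answers to the same question, group the ones that mean the same thing, and measure the entropy of the group sizes. Here, lowercase string matching is a crude, hypothetical stand-in for semantic equivalence, and the function name is an assumption.

```python
import math
from collections import Counter


def cluster_entropy(answers, normalize=lambda a: a.strip().lower()):
    """Shannon entropy (in nats) over groups of answers that normalize to the same string."""
    counts = Counter(normalize(a) for a in answers)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical sampled answers to one question.
consistent = ["Paris", "paris ", " PARIS"]
scattered = ["Paris", "Lyon", "Marseille", "Nice"]
print(cluster_entropy(consistent))  # all answers agree, so entropy is zero
print(cluster_entropy(scattered))   # four distinct answers, so entropy is log(4)
```

Low entropy means the sampled answers agree. High entropy means they disagree, which an entropy-based detector would treat as a sign of possible hallucination.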
FG-PRM Boosts Hallucination Detection in LLM Math Reasoning
The new FG‑PRM model detects and classifies six types of hallucination at each reasoning step, raising accuracy on the GSM8K and MATH math benchmarks. getnews.me/fg-prm-boosts-hallucinat... #fgprm #hallucinationdetection #mathai
Cross-Layer Attention Probing Improves Hallucination Detection in LLMs
CLAP flagged hallucinations in five models across three benchmarks. It probes layers during inference without model changes. The paper was submitted on 4 September 2025. getnews.me/cross-layer-attention-pr... #hallucinationdetection #crosslayerattention