Draw A Square With 3 Lines #Specifications #LlmEval #Assumptions #Alignment youtube.com/shorts/w906q...
LLM Metric Scores Self‑Supervised Speech Models Without Training
An LLM‑based metric scores speech models via the log‑likelihood of their token sequences, avoiding extra training. Accepted to the 2025 IEEE ASRU conference, it showed a high correlation with ASR benchmark results. getnews.me/llm-metric-scores-self-s... #asru2025 #llmeval
New R package: vitals (v0.1.0) brings LLM evals to R – ported from the popular Python framework Inspect.
Designed for ellmer users, it supports prompt testing, tool usage, dialog evals, and model-graded scoring.
Evaluate and improve your LLM products directly in R […]
Excited to co-organize the HEAL workshop at @acm_chi 2025!
HEAL addresses the "evaluation crisis" in LLM research and brings HCI and AI experts together to develop human-centered approaches to evaluating and auditing LLMs.
🔗 heal-workshop.github.io
#NLProc #LLMeval #LLMsafety