#llmeval hashtag - Bluesky

@ccahua.bsky.social

5 months ago

Draw a square with three lines. #education #maths #school #students #youtubeshorts (YouTube video by Raviraj Master)

Draw A Square With 3 Lines #Specifications #LlmEval #Assumptions #Alignment youtube.com/shorts/w906q...

GetNews.me

@getnews-me.bsky.social

5 months ago

LLM Metric Scores Self‑Supervised Speech Models Without Training

An LLM‑based metric scores speech models via the log‑likelihood of token sequences, avoiding extra training. Accepted to the 2025 IEEE ASRU conference, it showed high correlation with ASR benchmark results. getnews.me/llm-metric-scores-self-s... #asru2025 #llmeval

Harald Klinke

@hxxxkxxx.det.social.ap.brid.gy

7 months ago

New R package: vitals (v0.1.0) brings LLM evals to R – ported from the popular Python framework Inspect.
Designed for ellmer users, it supports prompt testing, tool usage, dialog evals, and model-graded scoring.
Evaluate and improve your LLM products directly in R […]

Jekaterina Novikova

@j-novikova-nlp.bsky.social

1 year ago

Excited to co-organize the HEAL workshop at @acm_chi 2025!
HEAL addresses the "evaluation crisis" in LLM research and brings HCI and AI experts together to develop human-centered approaches to evaluating and auditing LLMs.
🔗 heal-workshop.github.io
#NLProc #LLMeval #LLMsafety
