#AgentEvaluation hashtag - Bluesky

AI Daily Post

@aidailypost.com

3 weeks ago

LangSmith just dropped three new portable skills for its CLI, letting coding agents trace runs, curate datasets, and self‑evaluate. Perfect boost for AI engineering workflows. Curious? Dive in! #LangSmithCLI #CodingAgents #AgentEvaluation

🔗 aidailypost.com/news/langsmi...


MLflow

@mlflow.org

6 months ago

MLflow lets you create custom scorers for agent behavior: did it use the right tool, in the right order, with proper reasoning? Datasets can encode patterns + decisions, not just input–output. You’re testing how the agent thinks, not just what it outputs.

#AgentEvaluation #MLflow #opensource
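
To make the idea in the post above concrete, here is a minimal sketch of a behavior scorer written in plain Python. It deliberately does not use MLflow's own API: `ToolCall`, `used_required_tool`, and `tool_order_score` are hypothetical names made up for illustration, and the trace is assumed to be a simple ordered list of tool calls. In a real setup, logic like this would presumably be wrapped in whatever custom-scorer interface the evaluation framework provides.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """Hypothetical record of one tool invocation in an agent trace."""
    name: str
    arguments: dict = field(default_factory=dict)


def used_required_tool(calls: list[ToolCall], required: str) -> bool:
    """True if the agent called the required tool at least once."""
    return any(call.name == required for call in calls)


def tool_order_score(calls: list[ToolCall], expected_order: list[str]) -> float:
    """Fraction of expected tools found in the trace in the expected order.

    Uses a subsequence match: other tool calls may appear in between,
    but the expected tools must occur in the given relative order.
    """
    if not expected_order:
        return 1.0  # nothing to check, so trivially satisfied

    names = [call.name for call in calls]
    matched = 0
    position = 0
    for expected in expected_order:
        try:
            found_at = names.index(expected, position)
        except ValueError:
            break  # this expected tool never appears after the previous match
        matched += 1
        position = found_at + 1
    return matched / len(expected_order)


# Example trace: the agent searched, fetched a page, then summarized.
trace = [
    ToolCall("search", {"query": "agent evaluation"}),
    ToolCall("fetch_page", {"url": "https://example.com"}),
    ToolCall("summarize"),
]

in_order = tool_order_score(trace, ["search", "summarize"])
out_of_order = tool_order_score(trace, ["summarize", "search"])
```

The expected tool sequence lives alongside each dataset row rather than in the scorer, which is one way a dataset can encode decisions and not just input–output pairs.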
