Kicking off tonight’s PyData London with @johnsandall.bsky.social telling us about Death by RMSE: Cautionary Tales of Metrics Gone Wild
Choosing the wrong eval metric can make a model look great while delivering terrible real-world performance. It's always a trade-off!
Posts by John Sandall
2️⃣ Are they effectively using techniques like cohorting, backtesting to simulate past futures, and aligning eval metrics to downstream company KPIs?
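One way to "simulate past futures" is rolling-origin backtesting: train only on data before a cutoff, then evaluate on the window immediately after it, repeating with later cutoffs. A minimal index-only sketch (the function name and parameters here are my own illustration, not from the talk):

```python
def rolling_origin_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs for backtesting.

    Each split trains on everything before a cutoff and tests on the
    window immediately after it -- a simulated "past future".
    """
    for i in range(n_splits):
        test_start = n_samples - (n_splits - i) * test_size
        train_idx = list(range(0, test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx

# 10 time-ordered observations, 3 backtest windows of 2 points each.
for train, test in rolling_origin_splits(n_samples=10, n_splits=3, test_size=2):
    print(f"train on first {len(train)} points, test on indices {test}")
```

scikit-learn's `TimeSeriesSplit` implements the same idea with more options; the key property either way is that the training set never contains data from after the test window.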
3️⃣ Would it help to workshop it together? Check out coefficient.ai and we can book in a session — this is exactly what we do at Coefficient!
Some questions to ask your data team:
1️⃣ Are they using the right eval metrics as their "north star"? The wrong model evaluation metric can quietly sabotage your ML team's impact. Choosing between MAE, F1, RMSE and others isn't academic; it's business-critical.
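A minimal sketch of why the choice matters: two hypothetical models' residuals on the same test set, where MAE and RMSE disagree about which model is "better" (the numbers are invented for illustration):

```python
import math

def mae(errors):
    # Mean absolute error: weights all errors equally.
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    # Root mean squared error: penalises large errors disproportionately.
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# Hypothetical residuals from two models on the same test set.
model_a = [1, 1, 1, 1, 10]   # mostly tiny errors, one big miss
model_b = [3, 3, 3, 3, 3]    # consistently mediocre

print(mae(model_a), mae(model_b))    # 2.8 vs 3.0  -> MAE prefers A
print(rmse(model_a), rmse(model_b))  # ~4.56 vs 3.0 -> RMSE prefers B
```

Which metric is "right" depends on the downstream cost of errors: if one large miss is catastrophic for the business, RMSE's preference is the sensible one; if errors cost roughly the same per unit, MAE's is.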
The model is trained and the eval metrics look great, yet nothing changes: the KPIs stay flat and the business impact isn't there. Did you optimise for the wrong thing?
I'm speaking at @pydatalondon.bsky.social meetup tomorrow on "Death by RMSE: Cautionary Tales of Metrics Gone Wild"
www.meetup.com/pydata-londo...