
Posts by Kato Coaching

The "testing mindset" isn't a personality trait. It's trained. Developers often don't think about quality the way QA engineers do because they haven't had the same deliberate practice. That's fixable, but only if the culture reinforces it too.

6 days ago

Before you conclude AI isn't worth it, measure how long correction actually takes. One experiment, a real task, less than a day. 

Free guide: resources.kato-coaching.com/5-ai-experiments
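
For what it's worth, here is a minimal sketch of what that experiment log could look like. The task name and all numbers are invented for illustration; they are not from the guide.

```python
# Hypothetical log for a one-day AI correction experiment.
# Record how long the AI path takes end to end, including fixing
# its output, and compare against your usual manual estimate.

from dataclasses import dataclass

@dataclass
class ExperimentRun:
    task: str
    prompt_minutes: float           # time spent writing/refining prompts
    correction_minutes: float       # time spent fixing the AI's output
    manual_estimate_minutes: float  # what the task usually takes by hand

    @property
    def ai_total(self) -> float:
        return self.prompt_minutes + self.correction_minutes

    @property
    def worth_it(self) -> bool:
        return self.ai_total < self.manual_estimate_minutes

run = ExperimentRun(
    task="draft regression tests for checkout flow",  # illustrative task
    prompt_minutes=15,
    correction_minutes=50,
    manual_estimate_minutes=90,
)
print(f"AI path: {run.ai_total} min, manual: {run.manual_estimate_minutes} min, "
      f"worth it: {run.worth_it}")
```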

#SoftwareTesting #QA #AIinTesting

1 week ago
Can AI save the Easter Bunny?

The Easter Bunny has 10,000 eggs to hide across 47 gardens before sunrise. His team suggests using AI to plan the hiding spots.

Before handing a task to AI, run it through a simple framework. Here is how the Easter Bunny uses this framework to avoid a disaster. 🐇🥚
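
The post leaves the framework itself to the link, but a rough sketch of the kind of pre-flight check it describes might look like this. The criteria below are my guesses at plausible questions, not the actual framework.

```python
# Hypothetical pre-flight check: "is this task a good AI candidate?"
# The criteria are illustrative guesses, not the official framework.

def good_ai_candidate(task: dict) -> bool:
    checks = [
        task["output_is_verifiable"],       # can you check the result quickly?
        task["errors_are_cheap"],           # is a wrong answer recoverable?
        task["inputs_are_well_defined"],    # 10,000 eggs, 47 gardens: yes
        not task["needs_local_knowledge"],  # does it depend on facts AI can't see?
    ]
    return all(checks)

egg_planning = {
    "output_is_verifiable": True,   # a list of hiding spots can be reviewed
    "errors_are_cheap": True,       # a bad spot means one sad child, not a crash
    "inputs_are_well_defined": True,
    "needs_local_knowledge": True,  # which gardens have dogs? AI doesn't know
}
print(good_ai_candidate(egg_planning))  # False: local knowledge sinks it
```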

2 weeks ago
Preview: The AI Use Case Scorecard (Free)
Most AI output problems start before you open the tool. This free scorecard helps you check before you spend hours correcting output.

Getting the task type wrong makes the model choice irrelevant. The scorecard tells you what kind of solution actually fits before you commit to one. 

https://resources.kato-coaching.com/scorecard

#SoftwareTesting #AItools #QA

2 weeks ago

TIL: it's much easier to get Claude Code to play nicely with the Google Workspace MCP than it is to get Gemini to use it. What the hell? Yes, I am experimenting with a multi-agent setup, but man, it's painful.

2 weeks ago
How to Evaluate AI-Generated Test Scenarios

AI gave you 12 scenarios. Now what? The list is raw material, not a test plan, and your judgment is what turns it into one. New video on how to triage what AI gives you: https://youtu.be/0awmamL1l2c
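
As one possible shape for that triage (the buckets and criteria here are illustrative, not necessarily what the video recommends):

```python
# Hypothetical triage of AI-generated test scenarios.
# The judgment criteria are illustrative, not the video's method.

scenarios = [
    {"name": "valid payment splits evenly", "risk": "low", "already_covered": True},
    {"name": "split with zero participants", "risk": "high", "already_covered": False},
    {"name": "currency rounding on 3-way split", "risk": "high", "already_covered": False},
    {"name": "UI renders on tablet", "risk": "low", "already_covered": False},
]

keep, park, drop = [], [], []
for s in scenarios:
    if s["already_covered"]:
        drop.append(s["name"])  # duplicates existing coverage
    elif s["risk"] == "high":
        keep.append(s["name"])  # worth test design effort now
    else:
        park.append(s["name"])  # revisit if time allows

print("keep:", keep)
print("park:", park)
print("drop:", drop)
```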

#softwaretesting #AIinTesting

2 weeks ago

☝️☝️☝️

3 weeks ago
Preview: The Hard Part of AI Evals Isn't the Tooling - Kato Coaching
Additional thoughts on session three of the AI Evals and Analytics Playbook. The first part is here. Every major shift in how we build and ship software has been followed by a wave of tooling that automates the tractable part and leaves the actual problem to the practitioner. Agile gave us story pointing ceremonies and […]

Agile gave us JIRA. DevOps gave us pipelines. Neither answered the harder question. AI evals is doing the same thing: the tooling is excellent. The question underneath it remains. 

kato-coaching.com/the-hard-part-of-ai-eval...

#AITesting #QualityEngineering #SoftwareTesting

3 weeks ago
Preview: The AI Use Case Scorecard (Free)
Most AI output problems start before you open the tool. This free scorecard helps you check before you spend hours correcting output.

Most AI pilots produce impressions, not evidence. An acceptable error rate isn't a feeling. It's a number you write down before the experiment starts. 

https://resources.kato-coaching.com/scorecard
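
In practice, "a number you write down" can be as small as this sketch; every figure below is invented:

```python
# Define the acceptance bar BEFORE the pilot, then evaluate against it.
# All numbers here are invented for illustration.

MAX_ERROR_RATE = 0.10  # agreed up front: at most 10% of outputs need rework

outputs_reviewed = 120
outputs_needing_rework = 21

error_rate = outputs_needing_rework / outputs_reviewed
verdict = "pass" if error_rate <= MAX_ERROR_RATE else "fail"
print(f"error rate: {error_rate:.1%} -> {verdict}")  # 17.5% -> fail
```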

#SoftwareTesting #AItools #QA

3 weeks ago
Preview: The Verification Trap - Kato Coaching
Someone posted a thinking piece in a Slack channel I'm in last week, long and earnest and well-structured, arguing that quality engineering needs to evolve for the age of agentic AI, that we need to stop thinking about quality as testing and start treating it as a systemic property. I read it and felt a […]

"The profession keeps arriving at “quality is systemic” as though it’s a fresh revelation, every time a new technology makes the verification frame obviously inadequate. It happened with continuous delivery. [...] It is happening now with AI agents."

3 weeks ago
Preview: The AI Use Case Scorecard (Free)
Most AI output problems start before you open the tool. This free scorecard helps you check before you spend hours correcting output.

Every AI demo works because the vendor picked the use case. When you're back at your own codebase, the conditions are different. Define your problem before you evaluate the tool.

 https://resources.kato-coaching.com/scorecard

#SoftwareTesting #AItools #QA

3 weeks ago
How to write AI prompts that give you useful test scenarios
Most AI prompts for testing are vague, and you get vague output back. "Write test cases for the SplitPay feature" returns a generic list you already knew. ...

The vague prompt gets you a generic list. Two minutes of context - scope, users, risk angle - gets you scenarios worth thinking about. Short video on how to close that gap. 
https://youtu.be/55SLTvKq7rE
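
A sketch of the gap, using an invented template (the video's actual prompt wording may differ):

```python
# Vague prompt vs. context-rich prompt for the same feature.
# The template below is illustrative; the video's wording may differ.

vague = "Write test cases for the SplitPay feature."

contextual = """You are testing SplitPay, a feature that splits a restaurant
bill between 2 and 10 people.
Scope: the split calculation only, not payment processing.
Users: first-time users on mobile, often in a hurry.
Risk angle: rounding errors and uneven splits where totals don't add up.
List the 10 highest-risk test scenarios, one line each, naming the risk."""

for label, prompt in [("vague", vague), ("contextual", contextual)]:
    print(f"--- {label} ---\n{prompt}\n")
```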

#softwaretesting #AIinTesting

3 weeks ago
Preview: The anatomy of a metric - Kato Coaching
Session two closed with a question I couldn't answer. When a product scores 78% on a given metric, what tells you whether that's good enough to ship? I flagged it as something session three would probably address, and it did, though not in the way I expected. The question can't be meaningfully answered until you've […]

The AI evals field calls it a rubric. If you've done serious testing work, you probably know it by a different name. 
https://kato-coaching.com/the-anatomy-of-a-metric/
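
If "rubric" sounds abstract, the structure itself may not be: it's a metric plus written-down judgment levels, close to explicit pass/fail criteria in testing. A toy sketch, with the levels invented:

```python
# A rubric is a metric plus written-down judgment levels.
# The levels and thresholds below are invented for illustration.

def grade(score: float) -> str:
    if score >= 0.90:
        return "ship"
    if score >= 0.75:
        return "ship with known gaps, reviewed by a human"
    return "do not ship"

# The 78% from the linked post is only meaningful against levels like these.
print(grade(0.78))
```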

#SoftwareTesting #AITesting

4 weeks ago
Preview: The AI Use Case Scorecard (Free)
Most AI output problems start before you open the tool. This free scorecard helps you check before you spend hours correcting output.

When AI adoption is handed to IT, you get a deployment. Deciding what good output looks like and how to verify it belongs to the people doing the work.

resources.kato-coaching.com/scorecard

#QualityEngineering #TechLeadership

1 month ago
Preview: The AI Use Case Scorecard (Free)
Most AI output problems start before you open the tool. This free scorecard helps you check before you spend hours correcting output.

When teams say the AI tool isn't working, I ask about the process before the AI arrived. Usually the requirements were vague all along. The tool is just making that visible.

https://resources.kato-coaching.com/scorecard

#SoftwareTesting #AIinTesting #QA

1 month ago
Preview: The AI Use Case Scorecard (Free)
Most AI output problems start before you open the tool. This free scorecard helps you check before you spend hours correcting output.

Six months into an AI pilot with no success criteria is just six months of accumulated impressions. 
You can't pass or fail a hope. 

1 month ago
Post image

I am working on a food tracker for my very specific needs, and Claude just turned into every developer I ever worked with.

1 month ago
Preview: The AI evals field chose a flawed tool and stuck with it - Kato Coaching
Session one left me with two things I hadn't resolved. The first was a line the instructor said almost in passing: "the hard part is scalability, not automation." I wrote it down because it piqued something, but I couldn't quite work out what problem it was pointing at. The second was a question I kept […]

"The hard part is scalability, not automation." That line from session one of "AI evals and analytics" confused me. Session two explained it. 
Full write-up in my blog: kato-coaching.com/the-ai-evals-field-chose...

#AIEvals #SoftwareTesting #QA

1 month ago

Updated my website this week — it should finally be clear what I do and how to work with me. Courses, workshops, and 1:1 coaching for QA professionals. 

kato-coaching.com

1 month ago
Preview: The AI Use Case Scorecard (Free)
Most AI output problems start before you open the tool. This free scorecard helps you check before you spend hours correcting output.

If the output is slop regardless of how you phrase it, the problem isn't the prompt. It's the use case. 

Free scorecard: resources.kato-coaching.com/scorecard

1 month ago

Before I wrote today's post, I defined what good looked like: gives value, doesn't rely on outrage, sounds like me. Writing to a clear brief changes the experience. So does diagnosing a draft. Testers do this before they run anything.

1 month ago

The AI correction loop usually starts before the tool is opened, at the moment someone chose the wrong use case for it. Testers already know how to ask whether a tool suits a problem. That skill just hasn't been applied here yet. More soon.

1 month ago

The skills QA professionals already have (defining success criteria, testing behaviour, not trusting metrics at face value) are exactly what's missing from most AI integrations. I'm learning AI evals to understand why. 
Session one: kato-coaching.com/what-i-dont-understand-about-ai-evals-yet/

1 month ago

Ran my workshop "Deciding Fast" on Tuesday with a software team in Sweden. Everyone in one room sharing computers, no breakouts. It's built for remote, so I adapted. 6 of 8 rated it Good or Excellent. Best response to "what will you do differently?": "Set a clearer success condition."

#SoftwareTesting #AITesting

1 month ago
Preview: Anthropic Education Report: The AI Fluency Index
Anthropic's AI Fluency Index measures 11 observable behaviors across thousands of Claude.ai conversations to understand how people develop AI collaboration skills.

"Tell me what you're uncertain about." "Push back if my assumptions are wrong." Only 30% of people give instructions like these. The model defaults to confident and agreeable. 

https://www.anthropic.com/research/AI-fluency-index

#SoftwareTesting #AITesting #AILiteracy

1 month ago
Preview: Anthropic Education Report: The AI Fluency Index
Anthropic's AI Fluency Index measures 11 observable behaviors across thousands of Claude.ai conversations to understand how people develop AI collaboration skills.

The strongest predictor of AI fluency, per Anthropic's research: iteration. Treating the first response as a draft, not an answer. People who iterate are 5.6x more likely to question the model's reasoning. Familiar territory if you work in testing.
https://www.anthropic.com/research/AI-fluency-index

#SoftwareTesting #AITesting

1 month ago

Sounds really interesting! I hope you can share outside of that conference presentation. I'd love to hear more when you have it.

1 month ago

That’s a great approach. I presume you don’t want to spoil your punchline by telling us how it’s going?

2 months ago

Six months in and nobody can say whether the AI is actually working. Anecdotes, yes; evidence, no. That gap is the most common thing I see in QA teams right now. 
How do you measure whether the new licence is worth the money?
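
One blunt way to start answering that question, with every number invented:

```python
# Back-of-envelope check: does measured time saved cover the licence?
# Every figure here is invented; plug in your own measurements.

licence_cost_per_seat_month = 30.0
hours_saved_per_seat_month = 2.5  # measured, not guessed, ideally
loaded_hourly_rate = 60.0

value = hours_saved_per_seat_month * loaded_hourly_rate
print(f"value: ${value:.0f} vs cost: ${licence_cost_per_seat_month:.0f} "
      f"-> {'worth it' if value > licence_cost_per_seat_month else 'not worth it'}")
```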

#softwaretesting #QA #AItools

2 months ago

18% of testers I surveyed said their top AI frustration is not bad output. It is that the tools have no sense of test strategy.
The AI is not wrong. It is indiscriminate.

#AITesting #SoftwareTesting #TestStrategy

2 months ago