I just published Part 16 of my Trustworthy AI Agents series: Human-in-the-Loop Governance.
Safe agents require human approval, oversight, and override paths.
www.sakurasky.com/blog/missing...
#AgentOps #AIGovernance
Posts by Andrew Stevens
Published Part 15 in my Trustworthy AI Agents series: Agent-Native Observability.
To debug agents, you need more than logs - you need semantic traces, provenance, lineage, and divergence analysis.
www.sakurasky.com/blog/missing...
#AIEngineering #AgentOps #AIGovernance
A new post in my Trustworthy AI series, Part 14: Secure Memory Governance.
Agents are storing more state than ever; it’s time to secure the memory layer.
www.sakurasky.com/blog/missing...
#AIEngineering #AgentOps
I just published another part in my Trustworthy AI Agents series: Distributed Agent Orchestration.
Agents need a real control plane - routing, scheduling, failover, backpressure.
www.sakurasky.com/blog/missing...
Another post in my series on Trustworthy AI - Part 12: Resource Governance.
Deep dive into quotas, throttling, priority scheduling, loop detection, and backpressure for multi-agent systems.
www.sakurasky.com/blog/missing...
#AIEngineering #AgentOps #AIGovernance
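A minimal sketch of the throttling idea from the post (illustrative only; the class name and parameters here are my own, not from the blog): a token bucket that caps how often an agent can invoke a tool.

```python
import time

class TokenBucket:
    """Simple token-bucket rate limiter for agent tool calls."""
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        # refill proportionally to elapsed time, then spend one token if available
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# with refill disabled, only the initial burst of 3 calls is allowed
bucket = TokenBucket(capacity=3, refill_per_sec=0.0)
results = [bucket.allow() for _ in range(5)]
```

In a real multi-agent system the same gate would also feed backpressure signals upstream instead of silently dropping calls.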
Published Part 11 of my Trustworthy AI series.
Deep dive into agent lifecycle management: semantic versioning, immutable builds, CI/CD, safe deprecation, and registry-based governance.
www.sakurasky.com/blog/missing...
#AIEngineering #AgentOps #DevOps #AIGovernance
I just published Part 10 in my Trustworthy AI series.
Deep dive into secure multi-agent protocols: identity, signatures, encryption, nonces, schemas, versioning, and formal verification.
www.sakurasky.com/blog/missing...
#AIEngineering #Security #AgentOps
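A toy version of the nonce-plus-signature pattern the post covers (a stand-in sketch, not the post's code: real systems would use per-agent asymmetric keys rather than this shared demo key).

```python
import hashlib
import hmac
import json
import secrets

SHARED_KEY = b"demo-key"  # illustration only; use per-agent keys in practice

def sign_message(payload: dict) -> dict:
    # attach a fresh nonce, then sign the canonical JSON encoding
    msg = {"nonce": secrets.token_hex(16), **payload}
    body = json.dumps(msg, sort_keys=True).encode()
    msg["sig"] = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    return msg

def verify_message(msg: dict, seen_nonces: set) -> bool:
    unsigned = {k: v for k, v in msg.items() if k != "sig"}
    body = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, msg["sig"]):
        return False          # tampered or forged
    if msg["nonce"] in seen_nonces:
        return False          # replayed message
    seen_nonces.add(msg["nonce"])
    return True

seen = set()
m = sign_message({"from": "agent-a", "action": "read"})
```

The nonce set is what turns a signature check into replay protection: the same valid message is rejected the second time it arrives.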
Published Part 9 of the Trustworthy AI series.
Deep dive into formal verification for agents: invariants, state models, SMT solvers, and counterexample-driven replay.
Python examples included.
www.sakurasky.com/blog/missing...
#AIEngineering #AIDebugging #AIGovernance
I just published Part 8 in my Trustworthy AI series.
Deterministic replay for agent systems: trace capture, replay stubs, clock virtualization, and reproducible debugging.
www.sakurasky.com/blog/missing...
#AIEngineering #AIDebugging #LLMSystems #AgentOps #Observability
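A compact sketch of the record/replay stub pattern (my own illustration of the technique, not code from the post): record mode captures tool outputs; replay mode serves them back in order without touching the live tool.

```python
class ReplayableTool:
    """Wraps a tool function; records its outputs, or replays a saved trace."""
    def __init__(self, fn, mode="record", trace=None):
        self.fn = fn
        self.mode = mode
        self.trace = trace if trace is not None else []

    def __call__(self, *args):
        if self.mode == "record":
            out = self.fn(*args)
            self.trace.append((args, out))
            return out
        # replay: serve recorded output, flag divergence from the trace
        recorded_args, out = self.trace.pop(0)
        assert recorded_args == args, "divergence from recorded trace"
        return out

live = ReplayableTool(lambda x: x * 2, mode="record")
live(3)
live(5)
# replay never calls the real tool (note the dummy lambda)
replay = ReplayableTool(lambda x: 0, mode="replay", trace=list(live.trace))
```

Clock virtualization follows the same shape: wrap `time.time()` so recorded timestamps are replayed instead of re-read.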
Part 7 of my Trustworthy AI series is out.
I take a look at adversarial robustness for agent systems: sanitization, anomaly detection, context stripping, probe detection, and adversarial testing. Python examples included.
www.sakurasky.com/blog/missing...
#AIGovernance #AIEngineering #AgentOps
Published Part 6 of the Trustworthy AI series today.
Deep dive into kill switches, circuit breakers, and runtime safety for autonomous agents, with example Python walkthroughs.
Read: www.sakurasky.com/blog/missing...
#AIGovernance #AIEngineering #CloudSecurity #AgentOps #DevSecOps
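A bare-bones circuit breaker in the spirit of the post (a sketch of the general pattern, not the post's implementation; the threshold and reset behavior are simplified).

```python
class CircuitBreaker:
    """Opens after `threshold` consecutive failures; open = calls refused."""
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0
        self.open = False

    def call(self, fn, *args):
        if self.open:
            # fail fast instead of letting a misbehaving agent keep acting
            raise RuntimeError("circuit open: agent action blocked")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.open = True
            raise
        self.failures = 0  # any success resets the count
        return result

breaker = CircuitBreaker(threshold=2)

def flaky():
    raise ValueError("tool error")

for _ in range(2):
    try:
        breaker.call(flaky)
    except ValueError:
        pass
# breaker is now open; further calls fail fast
```

Production breakers usually add a half-open state that probes for recovery; a kill switch is the human-triggered version of the same interlock.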
Dropped a new post in the Trustworthy AI series today.
Deep dive on verifiable audit logs for agent systems: hash chains, Merkle trees, SPIFFE-backed signatures, and AWS anchoring. Practical and code-heavy.
www.sakurasky.com/blog/missing...
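The hash-chain core of a tamper-evident log fits in a few lines (my minimal sketch of the technique; the post's version adds Merkle trees, signatures, and external anchoring on top).

```python
import hashlib
import json

GENESIS = "0" * 64

def append_entry(log: list, event: dict) -> None:
    # each entry's hash covers the previous hash, chaining the log together
    prev = log[-1]["hash"] if log else GENESIS
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    log.append({"prev": prev, "event": event,
                "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify_chain(log: list) -> bool:
    prev = GENESIS
    for entry in log:
        body = json.dumps({"prev": prev, "event": entry["event"]}, sort_keys=True)
        if entry["prev"] != prev or \
           entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"agent": "a1", "action": "read"})
append_entry(log, {"agent": "a1", "action": "write"})
# editing any entry breaks every hash after it
```

Anchoring the latest hash somewhere external (as the post describes with AWS) is what stops an attacker from simply rebuilding the whole chain.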
New post in my "Missing Primitives for Trustworthy AI Agents" series: Policy-as-Code for AI agents.
If agents are making decisions at runtime, the guardrails have to live there too.
OPA, Rego, SPIFFE, and a Python example.
www.sakurasky.com/blog/missing...
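The shape of the idea, in plain Python rather than Rego (a hypothetical stand-in for the pattern; the post uses real OPA/Rego, and the rule table here is invented for illustration): decisions are data, evaluated at runtime, with default-deny.

```python
# hypothetical rule table standing in for a Rego policy:
# (role, action) -> allowed?
POLICY = {
    ("reader", "read"): True,
    ("reader", "write"): False,
    ("admin", "write"): True,
}

def authorize(role: str, action: str) -> bool:
    # default-deny, like a Rego policy with `default allow = false`
    return POLICY.get((role, action), False)
```

Keeping the table outside the agent's code is the point: policy changes ship without redeploying the agent.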
Google’s new whitepaper “Introduction to Agents and Agent Architectures” (Nov 2025) - from LLMs that generate outputs to agents that achieve outcomes.
Agents = model + tools + orchestration.
www.kaggle.com/whitepaper-i...
#AI #Agents #LLM #MLOps #AIEngineering
Context drift: how models break when a problem looks the same but isn’t.
New research shows LLMs often “remember” logic puzzles instead of re-reasoning them.
Change a few names or numbers, and performance collapses but confidence stays high.
🔗 arxiv.org/abs/2510.11812
A shift in AI: from systems that generate outputs to systems that model reality.
World models learn from video, sensors & robot data to understand space, time, & cause. The “physics” of the real world.
Robotics that predict reactions, games with real physics, and digital twins that reason.
Can WebAssembly replace containers at the edge?
A new paper benchmarks Wasm vs containers across the Edge–Cloud Continuum. Gains in cold starts & image size, but major I/O & latency trade-offs.
Read here arxiv.org/abs/2510.05118
#WebAssembly #EdgeComputing #Serverless #CloudNative
How do you trust an autonomous AI agent?
In our latest post, we look at workload identity as another missing primitive for trustworthy AI.
Read more on our blog: www.sakurasky.com/blog/missing...
#AI #AISecurity #SPIFFE #WorkloadIdentity #DevSecOps
"Grit" doesn't build a lasting tech services company. Deliberate structure does.
The choices matter:
Reusable IP > Individual heroes
Deep specialization > Chasing low rates
A balanced client portfolio > Relying on one huge account
These are what separate a true partner from a temporary vendor.
Your AI moat isn't the model. It's the data.
But a data moat requires serious engineering:
* Reliable Pipelines
* Clear Lineage
* Automated Quality Gates
* Strong Security
Without these, your proprietary data is a liability, not a defensible asset. Moats are built, not found.
#AI #DataEngineering
Breakthroughs excite investors. Smart innovation sustains organisations.
The hardest call in tech leadership? Knowing when to push a bold idea vs. double down on iteration.
Big wins need both.
#TechLeadership #Innovation #Cloud #Data #Security
Technical debt always gets paid. The only question is when, and who pays it.
Shortcuts show up as:
* Slower velocity
* Security risk
* Talent drain
Treat debt pay-down like security: non-negotiable, budgeted, and strategic.
The speed of next year depends on the cleanup you invest in today.
Are your AI agents actually secure?
In this instalment of our blog series on Trustworthy AI, we explain why true End-to-End Encryption (E2EE) is non-negotiable and provide a hands-on Python example to fix it.
www.sakurasky.com/blog/missing...
Your ping-pong table isn't culture.
For tech teams, real culture is a system built on psychological safety, a clear mission, and accountability.
It’s not a soft skill - it’s a core requirement for building reliable and secure systems.
#TechCulture #Leadership
A new paper on hallucination detection has a clever idea: probe all LLM layers at once, not just one (Cross-Layer Attention Probing).
Absolutely worth reading: arxiv.org/pdf/2509.09700
#AI #AIGovernance #LLM
This paper has a pattern for making LLMs reliable for structured data extraction: wrap the model with a domain ontology to define the rules and an automated correction loop to enforce them.
The study is tiny (only 50 test logs), but the architectural pattern is the takeaway.
arxiv.org/pdf/2509.00081
Shadow AI is the new shadow IT.
Teams are spinning up LLMs + pipelines outside governance.
The risks? Data leakage, privacy violations, compliance failures.
The challenge? People can build AI faster than you can regulate it.
#AI #Privacy #Compliance