It’s basically moving from 'it works on my cluster' to 'it’s ready for the product' in one step.
Posts by Rita Zhang
No Engine Lock-in: They can swap backends (vLLM, SGlang, LlamaCPP, etc.) without rewriting their deployment logic.
Unified Lifecycle: If the rest of the company’s stack is on K8s, AI Runway lets them use the same observability, security, and CI/CD tools as everyone else.
Zero-to-Inference: Instead of writing complex Slurm scripts or wrapping models in custom Flask containers, AI Runway provides a standard interface to deploy LLMs with production-grade scaling and routing out of the box.
The pitch for AI Runway to AI/ML teams isn't about 'changing how you work,' but about 'removing the wall between your model and the user.'
AI Runway is aimed at platform teams and AI users who want a Kubernetes-native way to deploy and operate inference workloads across multiple serving frameworks, with a simpler UX and tighter integration with the Kubernetes ecosystem.
Maintainer for AI Runway here 👋. Slurm on Kubernetes is useful if you want HPC-style job scheduling, but it does not provide a common Kubernetes inference API or a model-centric control plane.
Ralph on stage with a slide describing the GPU and node layer.
Why run inference on Kubernetes: better economics, more control, built for production.
Why AI Runway is the AI inference platform for K8s platform teams, AI engineers, and organizations.
One interface, many backends: no engine lock-in, capability-aware routing, community-extensible.
AI Runway as explained by @squillace.bsky.social at the #KubeCon Azure pre-day: deploy and manage large language models on Kubernetes. github.com/kaito-projec...
"Great to be at the #Dynamo #GTC2026 event seeing real-world adoptions and exciting to see Microsoft as one of the top external contributors to help strengthen the platform and grow the OSS community." @ritazh.bsky.social www.linkedin.com/posts/ritazh...
❄️Our holiday gift to you❄️ tcpdump for Kubernetes! In this blog we share how we've taken tcpdump to the next level with K8s context to make it easier than ever to debug networking issues - including with Wireshark! inspektor-gadget.io/blog/2025/12...
Kat, Rita, Maciej on stage
Ask the experts: Kubernetes Steering Committee at #KubeCon. Lots of good discussion about supporting contributors with @kat.lol @ritazh.bsky.social @soltysh.bsky.social.
Rita, Janet, and Federico on stage.
At the Kubernetes AI Conformance discussion at #KubeCon Maintainer Summit, @ritazh.bsky.social talks about the value of collaborating across open source communities. github.com/kubernetes-s...
If you, ALSO, are motivated by superfast, hardware-protected micro VMs, this starts in 5 minutes! www.youtube.com/live/tROp-nm...
New blog post: Scaling multi-node LLM inference with NVIDIA Dynamo and ND GB200 NVL72 GPUs on AKS (by Sachi Desai, @ritazh.bsky.social, & @sozercan.bsky.social) blog.aks.azure.com/2025/10/24/d...
For those Master's or Ph.D. students doing systems work, you might have a look at this internship with @microsoft.com's research team: www.linkedin.com/posts/pedroh...
Yes, you'll do some immediately important work, and it's really cool, too.
Interested in the upstream Kubernetes ecosystem? @lachie.bsky.social is hosting a new conversation series where he chats with an Azure colleague about exciting new topics in K8s! In the first episode, he discusses AI-aware Kubernetes infrastructure with Jack Francis! www.youtube.com/watch?v=7G-z...
Wassette: A bridge between Wasm and MCP by Simon Bisson (@sbisson.com) www.infoworld.com/article/4039...
I'm super excited for Wassette, so I put together a quick article on why it's important and a comparison with our current options for MCP.
Introducing Wassette: WebAssembly-based tools for AI agents by Yosh Wuyts opensource.microsoft.com/blog/2025/08...
Need to run really large LLMs that require the power of RDMA with InfiniBand?
Check out this blog to see how to get it working on Azure Kubernetes Service:
azure.github.io/AKS/2025/04/...
#AKS #Kubernetes #InfiniBand #AI #LLM #RDMA #Azure
Maintainers of the containerd project speaking at KubeCon EU
Maintainers of the containerd project presenting at KubeCon EU, providing an update on recent changes and releases
The @containerd.dev maintainer session is underway at #KubeCon EU in London with an awesome cross-section of contributors and maintainers from both the core project and subprojects like nerdctl and runwasi.
👋 Betty!!!
I will be there on behalf of sig auth and the k8s security response committee. See you there! #kubecon
Day 2 of #KubeCon begins tomorrow! On Day 1, we learned from keynotes & sessions & enjoyed the #Kubecrawl party. We hope you're ready to do it all again tomorrow! Don't forget to stop by the Project Pavilion during lunch to meet the Kubernetes community!
Standing room only for the talk on Policy as code by @ritazh.bsky.social, Joe Betz, Andy Sunderman, and Jim Bugwadia
Rita Zhang speaking at KubeCon
Rita introducing Gatekeeper
Chart about where Gatekeeper fills in gaps versus VAP/MAP
Gatekeeper instead of VAP: examples
OPA Gatekeeper (open-policy-agent.github.io/gatekeeper/) as explained by @ritazh.bsky.social at #KubeCon: tooling to help enforce your Kubernetes policies.
Huge fan of Headlamp moving into Kubernetes to improve the dashboard and developer experience for all K8s users! Kudos @ahrkrak.bsky.social and team! #kubecon #cloudnativecon
Greg on stage with a slide “Linux runs the world”
Fantastic to have Greg K-H on the #kubecon keynote stage!
Making Kubernetes easier to use. Better UX = happier humans with @headlamp_ui #KubeCon keynote. Now part of SIG UI @ahrkrak.bsky.social