
Posts by Suraj Deshmukh | सुरज देशमुख

skills/skills/skill-creator at main · anthropics/skills Public repository for Agent Skills. Contribute to anthropics/skills development by creating an account on GitHub.

Official™️ Claude Code skill from Anthropic that creates Claude Code skills:
github.com/anthropics/s...

2 weeks ago 2 1 1 0
Agentic Engineering Patterns - Simon Willison's Weblog

3/3

What stuck with me most: follow your curiosity. Whatever you build becomes the foundation for the next thing.
simonwillison.net/guides/agent...

#AgenticEngineering #AI #SoftwareEngineering #TDD #CodingWithAI #LLMAgents #BuildInPublic #DevProductivity #AITools #CuriousBuilder

3 weeks ago 2 0 0 0

2/n

Agentic engineering is structured: start with TDD, run tests before touching code, delegate to sub-agents for test execution, automate your manual tests, and move seamlessly from prototype to real implementation.

3 weeks ago 1 0 1 0

1/n

I just spent time reading @simonwillison.net’s guide on agentic engineering patterns and it shifted how I think about coding with AI.

The mindset isn’t “let the AI figure it out” — that’s vibe-coding™️.

3 weeks ago 1 0 2 0
Living dangerously with Claude I gave a talk last night at Claude Code Anonymous in San Francisco, the unofficial meetup for coding agent enthusiasts. I decided to talk about a dichotomy I’ve been struggling …

Living dangerously with Claude
simonwillison.net/2025/Oct/22/...

3 weeks ago 0 0 0 0
GitHub does dotfiles - dotfiles.github.io

GitHub has a guide to managing dotfiles:
dotfiles.github.io

1 month ago 0 0 0 0
Setting Up OpenClaw with Azure AI Foundry Learn how to configure OpenClaw to use Azure AI Foundry models, giving you a self-hosted AI assistant accessible from Telegram and other chat apps.

I just published a new guide on configuring #OpenClaw 🦀 to run with #Azure AI Foundry models. You control your data, so you get more privacy, and you can talk to it from #Telegram or the console!

Check it out here: suraj.io/post/2026/op...

1 month ago 2 0 0 0
Running Linux Containers Natively on macOS with Apple's Container CLI Learn how to use Apple's container CLI tool to run Linux containers as lightweight VMs on macOS with sub-second startup times

Apple has a new native container CLI for macOS! Run Linux containers without Docker Desktop—with sub-second startup times. 🚀

My guide covers setup, resource limits, and fixing macOS firewall blocks:
🔗 suraj.io/post/2026/us...

#macOS #Containers

1 month ago 2 0 0 0
goodreads — ClawHub Search for books, get book details and reviews, discover personalized recommendations, and manage reading lists on Goodreads — all through browser automation.

Try it:

/goodreads tell me about project hail mary by andy weir
/goodreads add the midnight library to my want to read shelf

Install with one command:
clawhub install goodreads

🔗 clawhub.ai/surajssd/goodreads
🐙 github.com/surajssd/openclaw-goodreads-skill

1 month ago 0 0 0 0

2/n: Since Goodreads deprecated their API in 2020, this skill uses browser automation under the hood. No API keys (though you do need to log in once), just the browser tool doing what you'd do manually!

1 month ago 0 0 1 0

1/n 📚 Made something for fellow book nerds using OpenClaw:

A Goodreads skill that lets your AI agent search for books, pull up details & reviews, get personalized recommendations, and manage your reading lists — all through natural language.

1 month ago 1 0 1 0
Deploying Kimi K2.5 on Azure: A Complete Guide to Running MoonshotAI's Model Learn how to deploy and configure Kimi K2.5 on Azure AI Foundry with this step-by-step guide.

Deploying #Kimi K2.5 on #Azure: A Complete Guide to Running MoonshotAI's Model suraj.io/post/2026/de...

2 months ago 0 0 0 0
Running Pydantic’s Monty Rust sandboxed Python subset in WebAssembly There’s a jargon-filled headline for you! Everyone’s building sandboxes for running untrusted code right now, and Pydantic’s latest attempt, Monty, provides a custom Python-like language (a subset of ...

Running Pydantic’s Monty Rust sandboxed Python subset in WebAssembly

simonwillison.net/2026/Feb/6/p...

2 months ago 2 2 0 0
Handy Handy is a cross platform, open-source, speech-to-text application for your computer

Thanks to @scott.hanselman.com for showing me Handy (handy.computer) — a free, open-source speech-to-text tool that runs locally on your machine. Push-to-talk, privacy-focused, and just works. Check it out!

2 months ago 42 13 2 0
Running Docker Commands on a Remote Machine via SSH Learn how to execute Docker commands on a remote machine from your local terminal using SSH and Docker contexts

Running Docker Commands on a Remote Machine via SSH suraj.io/post/2026/re...

#docker #ssh #remote #containers #cli #development #devops

2 months ago 0 0 0 0
Using Claude Code with GitHub-Hosted Anthropic Models Learn how to use Claude Code CLI with GitHub Models by proxying requests through litellm-proxy

Using Claude Code with GitHub-Hosted Anthropic Models suraj.io/post/2026/us... #claude #github-models #ai #litellm #anthropic

2 months ago 0 0 0 0
Meta’s Kubernetes-based Portable AI Research Environment - Shaun Hopper, Meta & Navarre Pratt YouTube video by CNCF [Cloud Native Computing Foundation]

Meta’s Kubernetes-based Portable AI Research Environment youtu.be/ts7bI51gRCo?...

4 months ago 1 0 0 0
LLMs on Kubernetes: Squeeze 5x GPU Efficiency With Cache, Route, Repea... Yuhan Liu & Suraj Deshmukh YouTube video by CNCF [Cloud Native Computing Foundation]

Our talk (me & Yuhan Liu) on improving LLM serving efficiency is on YouTube now!
youtu.be/2YCDvZokqnk?...

#vllm #kubernetes #kubecon

4 months ago 3 0 0 0
Infinite scale: The architecture behind the Azure AI superfactory - The Official Microsoft Blog Today, we are unveiling the next Fairwater site of Azure AI datacenters in Atlanta, Georgia. This purpose-built datacenter is connected to our first Fairwater site in Wisconsin, prior generations of A...

Infinite scale: The architecture behind the Azure AI superfactory

blogs.microsoft.com/blog/2025/11...

4 months ago 2 0 0 0
Trying out Gemini 3 Pro with audio transcription and a new pelican benchmark Plus what happens if AI labs train for pelicans riding bicycles?

Gemini 3, OpenAI KV cache, and much more
open.substack.com/pub/simonw/p...

4 months ago 1 0 0 0

They also let you offload the KV cache to local storage for 24 hours! And they only cache prompts that are 1024 tokens or longer!

4 months ago 0 0 0 0
OpenAI Platform Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.

OpenAI shared some details, from the user's point of view, on which KV cache features are available:
platform.openai.com/docs/guides/...

It is interesting to see that they cache for 10 minutes, and if no matching request arrives in that time they evict the hot caches from the GPU.
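
To make the numbers in this thread concrete, here is a rough Python sketch of the rules as I read them. It is a toy model, not OpenAI's API: the function names, the state labels, and the `extended` flag are all made up for illustration, and I am assuming a prompt of exactly 1024 tokens counts as cacheable.

```python
# Toy model of the caching rules quoted in this thread.
# All names here are hypothetical; nothing below calls a real OpenAI API.

MIN_CACHEABLE_TOKENS = 1024                 # threshold quoted in the thread (assumed inclusive)
IDLE_EVICTION_SECONDS = 10 * 60             # "cache for 10 min" with no matching request
EXTENDED_RETENTION_SECONDS = 24 * 60 * 60   # 24-hour offload to local storage


def is_cacheable(prompt_tokens: int) -> bool:
    """Prompts shorter than the threshold are never cached in this model."""
    return prompt_tokens >= MIN_CACHEABLE_TOKENS


def cache_state(prompt_tokens: int, idle_seconds: float, extended: bool = False) -> str:
    """Say where a cached prefix would live after `idle_seconds` without a hit."""
    if not is_cacheable(prompt_tokens):
        return "not cached"
    if idle_seconds <= IDLE_EVICTION_SECONDS:
        return "hot (GPU)"
    if extended and idle_seconds <= EXTENDED_RETENTION_SECONDS:
        return "offloaded (local storage)"
    return "evicted"


if __name__ == "__main__":
    print(cache_state(2048, idle_seconds=60))
    print(cache_state(2048, idle_seconds=3600))
    print(cache_state(2048, idle_seconds=3600, extended=True))
    print(cache_state(500, idle_seconds=0))
```

The practical takeaway under these assumptions: keep the shared prefix of your prompts long enough to cross the threshold, and send follow-up requests before the idle window closes if you want cache hits.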

4 months ago 1 0 1 0
Microsoft AI superfactory Microsoft unveiled its second Fairwater AI datacenter in Atlanta as part of a new AI superfactory working across states in nearly real time.

From Wisconsin to Atlanta: Microsoft connects datacenters to build its first AI superfactory

news.microsoft.com/source/featu...

4 months ago 0 0 0 0
Satya Nadella – How Microsoft thinks about AGI YouTube video by Dwarkesh Patel

Satya Nadella – How Microsoft thinks about AGI
youtu.be/8-boBsWcr5A?...

4 months ago 0 0 0 0
Keynote: How One Line of Code Freed 30,000 CPU Cores: Deep-Diving Fluent Bit at Petabyte... F. Ponce YouTube video by CNCF [Cloud Native Computing Foundation]

How One Line of Code Freed 30,000 CPU Cores: Deep-Diving Fluent Bit at Petabyte Scale www.youtube.com/watch?v=pbOv...

4 months ago 0 0 0 0
KubeCon + CloudNativeCon North America 2025: LLMs on Kubernetes: Squeeze 5x GPU Effic... View more about this event at KubeCon + CloudNativeCon North America 2025

Come see us (me & Yuhan Liu) tomorrow for our talk.

Specifically, Wednesday November 12, 2025 5:30pm - 6:00pm EST at Building B | Level 5 | Thomas Murphy Ballroom 1.

More info: sched.co/27FcQ #kubecon #vllm

5 months ago 0 0 0 0
Ray Direct Transport: RDMA Support in Ray Core (Part 1) Ray Direct Transport enables fast and direct GPU transfers in Ray via RDMA-backed transports. Using RDT, we can achieve up to 1000x faster GPU-GPU transfers than Ray’s native object store with a few l...

Announcing Ray Direct Transport: RDMA Support in Ray Core
www.anyscale.com/blog/ray-dir...

5 months ago 1 0 0 0

This has become a game of whack-a-mole now. Source: www.youtube.com/watch?v=AXN-...

I ran the following command in the macOS Terminal to get Chrome working with uBlock Origin:

```
open -a /Applications/Google\ Chrome.app --args --disable-features=ExtensionManifestV2Unsupported,ExtensionManifestV2Disabled
```

5 months ago 0 0 0 0
Building a tool to copy-paste share terminal sessions using Claude Code for web Plus Living dangerously with Claude, and prompt injection risks for ChatGPT Atlas

Building a tool to copy-paste share terminal sessions using Claude Code for web
open.substack.com/pub/simonw/p...

5 months ago 2 0 0 0
LMCache: An Efficient KV Cache Layer for Enterprise-Scale LLM Inference Today's LLM inference systems treat individual engines and queries independently for simplicity, but this causes significant resource inefficiencies. While there are proposals to avoid redundant compu...

LMCache: An Efficient KV Cache Layer for Enterprise-Scale LLM Inference
arxiv.org/abs/2510.09665

5 months ago 1 0 0 0