Nice example of a production #vLLM setup on Nebius with Terraform, managed K8s, inference, and observability all in one place.
This can be a reference stack that builders can use without reinventing the basics.
Full code on our repo:
github.com/CloudThrill/vllm-production-stack-terraform
Posts by CloudThrill
And our #1 blog post of 2025 on @Cloudthrill is… KV Cache explained (Like I'm 5)
Ever wondered what #KVCache really is in LLM inference?
Here's the simplest analogy for beginners plus an overview of popular KV cache optimization techniques!
cloudthrill.ca/kv_cache-exp...
Ranked #2 most-read in 2025 - #vLLM for Beginners (Key features)
Here's the most exhaustive list of vLLM features you wish you knew.
cloudthrill.ca/what-is-vllm...
Learn what makes #vllm the Rolls Royce of inference in production. #vLLM #AIForBeginners
Ranked #3: LLM Quantization, because cheaper inference is never just about INT8 vs FP16.
Here's everything you wish you knew about LLM quantization.
cloudthrill.ca/llm-quantization-all-you-need-to-know
Podcast (YouTube): "from […] to enterprise Quantization"
youtube.com/watch?v=XTE0oS7b6fM
This week, we're counting down CloudThrill's Top 3 most-read blog posts of 2025.
Three posts. Three lessons. First reveal drops this Monday.
#CloudThrill #LLM #AIInfrastructure #OpenSourceAI #AIEngineering
CloudThrill is a proud sponsor of the Tech Beats Unplugged podcast. New episode out with Michael (WebScale) Webster, breaking down the VMware-Broadcom chaos, Nutanix, and real exit strategies. Listen now.
This Terraform stack delivers a production-ready vLLM serving environment on @awscloud.bsky.social #EKS, supporting both CPU/GPU inference with operational best practices embedded in AWS Integration and Automation (aws-ia). A one-click deploy. Check the repo and blog below.
This Terraform stack delivers a production-ready vLLM serving environment on Google Cloud GKE, supporting both CPU/GPU inference with operational best practices embedded in the #Terraform #GKE module. A one-click deploy. Check the repo and blog below.
Check out what's cooking in #vLLM for 2026 and beyond, from the project leader himself, Simon Mo! #OpenSourceAI #LeadingTheway #RaySummit2025 #Anyscale
Still thinking of hosting your own AI backend?
Our FREE vLLM POC is still live - but not forever.
Apply now: cloudthrill.ca/ai-poc
Run AI assistants, RAG, or open models privately in the cloud:
✅ No external APIs
✅ No vendor lock-in
✅ Total data control
Your infra. Your models. Your rules.
In this 5-min read you'll learn:
✅ How embeddings work, in the simplest way possible
Chunk sizes, overlaps, and text splitters
Vector DBs and popular embedding models used today
Oh, and don't forget: our Private AI Inference campaign is still running, with a FREE POC: cloudthrill.ca/ai-poc
We're excited to share that CloudThrill has been awarded a ProServices prequalification with Public Services and Procurement Canada!
Work with a public agency? Let's talk about your challenges - we'd love to hear from you! cloudthrill.ca/contact-us
#CloudThrill #ProServices #GovernmentOfCanada
Check out our new vLLM Production Stack blog.
We cover:
✅ What is the vLLM production stack?
✅ Request flow & architecture breakdown
✅ Serving Engine, Request Router & KV-Cache Network
✅ Autoscaling & built-in fault tolerance
✅ One-click Helm install
#LLMs #Kubernetes #Cloudthrill #vLLM
Here's a full recap from the #CloudThrill team of the vLLM beginners series, broken into 3 parts. Share and enjoy!
Learn the key to easy, production-grade secret management on K8s.
Like this kind of stuff? Subscribe here: tinyurl.com/CloudThrillBlogs
Get your teams to level up their CI/CD skills with this GitHub Actions cert guide.
#NewBlog: final part of our #VLLM blog series
This time, we shift from theory to practice, covering #vLLM installs across platforms. Check our new blog, where we break it down in 5 sections. #TherYouGo
New week, new blog!
See you next Thursday! #Livestream #LLMs #Quantization
#NewBlog part 2 of our #VLLM blog series
What makes #VLLM the Rolls Royce of inference? Check our new blog, where we break it down in 5 performance-packed layers. #TherYouGo
#NewBlog GitHub Actions Azure deploy with OIDC!
Tens of millions of secrets were exposed on #GitHub last year, and tens of thousands of #Huggingface tokens leak every month!
Switch to secretless with pipeline identity now!
We show you how: cloudthrill.ca/github-actio...
#Azure #NHI #CICD #Terraform #ManagedIdentity
Want to learn about @VLLm? Start here.
As proud sponsors, we're excited to share the latest episode of #TechBeatsUnplugged! Tune in as Steve Giguere digs through every attack vector on your GitHub workflows and how to protect yourself from them.
New blog drop!!
Your AI workloads are nothing without secured credentials.
@CloudThrill is excited to announce its membership in the NVIDIA Inception Program!
Read full statement: cloudthrill.ca/cloudthrill-...
#NVIDIAInception Program for Startups!
#AI & #CyberSec heads in #Toronto!
Join us on Wednesday, May […]th from 5:30pm-8pm EST for another exciting #TAICO Meetup (Toronto AI and Cybersecurity Organization).
#Cloudthrill #ProudSponsor
www.meetup.com/taico-toront...
Check out our team's new article on how to ace your #CNCF Certified Kubernetes Administrator exam. #CKA
[…] Models for Dummies #cheatsheet
If you've opened #ChatGPT lately and thought:
"Wait… what's o[…]? And why are there so many models now?" You're not alone. Today #openAI finally answered.
https://platform.openai.com/docs/models/compare
Check out our team's new article on how to expose your web app demos through zero trust @openziti.bsky.social
Next week, learn more about
Bringing AI Inference Closer to your Devs: deploy AI Endpoints in […] from our own Oracle #ACE @clouddude.bsky.social
Check out the entire agenda: social.ora.cl/60170pPYS
Register for free: social.ora.cl/60120pPYa
@oracleace.bsky.social #AIInference #K8s #ollama