Next level: Thanos downsampling.
30d raw → 1y 5-min rollups → 5y hourly. Your long-retention S3 bill drops by 20-240x. Historical queries get faster.
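The three tiers map one-to-one onto retention flags on the compactor; downsampling itself happens automatically. A sketch (the bucket config path is illustrative):

```shell
thanos compact \
  --retention.resolution-raw=30d \
  --retention.resolution-5m=1y \
  --retention.resolution-1h=5y \
  --objstore.config-file=bucket.yaml
```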
Full FinOps playbook:
podostack.com/p/prometheu... 🛠️
Posts by Ilia Gusev
The fix is labeldrop:
metric_relabel_configs:
- action: labeldrop
regex: pod_template_hash
One line. Cuts your series count by 5x on a busy cluster. Doesn't require touching any application code.
The usual culprits: pod_template_hash (changes every deploy), request_id (unique per request), user_id, git_commit.
Each of these turns a reasonable metric into a cardinality bomb. Most of them should be dropped at scrape time, not stored.
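All four can go in one rule; extend the regex with whatever your own top-20 list shows:

```yaml
metric_relabel_configs:
  - action: labeldrop
    regex: (pod_template_hash|request_id|user_id|git_commit)
```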
Find the offenders in 5 seconds:
topk(20, count by (__name__)({__name__!=""}))
This ranks metrics by how many series they generate. The top 20 usually contains 80% of your total. Write down the names - that's your attack list.
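To confirm a specific label is the problem, count its distinct values (swap in whichever label you suspect):

```promql
count(count by (pod_template_hash) ({__name__!=""}))
```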
Cardinality = unique time series. Not bytes. Not datapoints. Series.
Every unique combination of labels is a new series. Add one label with 1000 values and your 50-series metric becomes 50,000 series. Prometheus loads all of them into RAM.
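The multiplication is worth writing out, because every extra label compounds it (numbers below are the ones from this post):

```python
# Each new label multiplies the series count by its number of distinct values.
base_series = 50        # a modest metric: 50 label combinations
label_values = 1_000    # one "harmless" high-cardinality label
print(base_series * label_values)  # 50000 series, all held in RAM
```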
Your Prometheus memory keeps climbing and nobody knows why.
You're not out of metrics. You're out of cardinality. And it's costing real money - CPU, RAM, S3.
podostack.com/p/prometheu...
The combo - weighted NodePools + broad instance flexibility + PDBs - turns spot from "dev only" into "run literally everything."
Full spot pattern with YAML:
podostack.com/p/karpenter... 🛠️
Pair this with PodDisruptionBudgets on critical workloads and Karpenter handles spot interruption gracefully:
1. AWS sends interruption notice
2. Karpenter cordons the doomed node
3. Drains within PDB limits
4. Provisions replacement
You don't even notice.
The subtle trick: instance flexibility.
Lock spot to one family (c6i only) and it's fragile: one pool exhausts, fallback fires.
Open it to categories c, m, r across generations and architectures, and Karpenter has dozens of pools to pick from. Interruption rates drop.
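A sketch of what that flexibility looks like as NodePool requirements, using Karpenter's well-known labels (tune the lists to your fleet):

```yaml
requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot"]
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r"]
  - key: karpenter.k8s.aws/instance-generation
    operator: Gt
    values: ["4"]
  - key: kubernetes.io/arch
    operator: In
    values: ["amd64", "arm64"]
```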
Your bill benefits when spot is plentiful (70% discount). Your uptime benefits when spot is scarce (seamless fallback). The pod doesn't know the difference.
One YAML pattern. Works for anything stateless.
Two NodePools:
spot-pool → weight: 100 (higher = higher priority)
ondemand-pool → weight: 50 (fallback)
Karpenter tries the highest-weight NodePool first. If spot capacity isn't available, it immediately falls through to on-demand. The pod never sits Pending.
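A minimal sketch of the two pools (v1 API; Karpenter orders NodePools by weight, highest first; nodeClassRef and other required fields omitted for brevity):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-pool
spec:
  weight: 100            # higher weight = tried first
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: ondemand-pool
spec:
  weight: 50             # fallback when spot is exhausted
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
```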
Teams avoid spot in production because "what if capacity runs out?"
The answer isn't "don't use spot." The answer is two NodePools and the weight field.
podostack.com/p/karpenter...
Five patterns. Real production scenarios. YAML you can ship today.
podostack.com/p/karpenter... 🛠️
What's inside:
- NodePool + EC2NodeClass: the responsibility split
- Spot-to-On-Demand fallback with weights
- TopologySpread: the DoNotSchedule trap
- SpotToSpot consolidation
- Why Descheduler is an anti-pattern with Karpenter
New Podo Stack just dropped.
This week: Karpenter Beyond Basics. Five patterns that separate a demo-grade setup from one that actually saves you money in production.
podostack.com/p/karpenter...
nodeSelector for the simplest cases. nodeAffinity for everything else. The five extra lines of YAML save you from Pending pods at 3 AM.
Full guide with YAML examples:
podostack.com/p/kubernete... 🛠️
Anti-pattern: using nodeSelector for zone spreading.
If us-east-1a runs out of capacity, your pods sit Pending. With preferredDuringScheduling you get zone preference without the deadlock. The scheduler does its best but doesn't block.
The killer combo: required + preferred together.
Required: "must be amd64 OR arm64"
Preferred: "prefer arm64 with weight 80"
The scheduler places pods on ARM when available (cheaper), falls back to AMD when ARM is full. One spec. Graceful degradation.
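That combo as a pod-spec fragment (standard nodeAffinity API; the weight is the one from above):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values: ["amd64", "arm64"]
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        preference:
          matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values: ["arm64"]
```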
nodeSelector is a hard match. Label exists = schedule. Label missing = Pending forever. No fallback. No preference. No nuance.
nodeAffinity gives you two modes:
- requiredDuringScheduling (hard rule, like nodeSelector but with operators)
- preferredDuringScheduling (soft preference with weights)
There's a moment in every Kubernetes journey where nodeSelector stops being enough.
You need GPU nodes for ML. You're migrating to ARM64 with a fallback. You want "prefer this pool, but don't crash if it's full."
nodeSelector can't do any of that.
podostack.com/p/kubernete...
Full deep dive on how Temporal kills the state hell - Workflow vs Activity, event sourcing replay, signals, retries, versioning in prod.
podostack.com/p/temporal-... 🛠️
The sleep isn't sleeping your process. It's a durable timer on the Temporal server.
Pod crashes, deployment rolls out, region fails over. The timer fires on schedule. The workflow picks up exactly where it left off.
No cron. No state flags. No in-flight migrations.
Temporal flips the model.
You write the workflow as one sequential function. workflow.Sleep(ctx, 24*time.Hour). A blocking receive on a signal channel. It looks like code that ignores failure.
The platform handles the durability.
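In the Go SDK that shape looks roughly like this; the activity and its names are hypothetical, and only the first two steps of the flow are sketched:

```go
package onboarding

import (
	"context"
	"time"

	"go.temporal.io/sdk/workflow"
)

// SendEmail is a hypothetical activity for this sketch.
func SendEmail(ctx context.Context, userID, kind string) error {
	return nil // a real implementation would call your mailer
}

// OnboardingWorkflow is one sequential function; Temporal persists its
// progress, so the 24h timer survives crashes, deploys, and failovers.
func OnboardingWorkflow(ctx workflow.Context, userID string) error {
	ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: time.Minute,
	})

	if err := workflow.ExecuteActivity(ctx, SendEmail, userID, "welcome").Get(ctx, nil); err != nil {
		return err
	}

	// A durable timer on the Temporal server, not an in-process sleep.
	if err := workflow.Sleep(ctx, 24*time.Hour); err != nil {
		return err
	}

	return workflow.ExecuteActivity(ctx, SendEmail, userID, "reminder").Get(ctx, nil)
}
```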
This is the state management hell.
The business logic gets smeared across your infrastructure. Debugging means grepping five systems. Changing "wait 3 days" to "wait 4 days" needs a migration for in-flight state.
Testing it? Mock time and half your stack.
Where does "wait 24 hours" actually live?
Cron polling a DB every minute? Delayed queue that drops jobs on broker restart? State flag column with a scheduler + retry table + dead letter queue?
Six lines of logic. Five systems. Zero sleep for the on-call engineer.
You've written this workflow before.
Register user → send email → wait 24h → remind → wait 3 days → bonus or nudge.
Easy in your head. A nightmare across cron, queues, and state flags.
podostack.com/p/temporal-...
Bonus trick: create a NEW index as invisible first, run it in prod on a replica, then flip visible if metrics improve.
A/B testing for index design, zero rollback cost.
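The whole experiment fits in three statements (MySQL 8.0+ syntax; table and index names are made up):

```sql
-- New index starts invisible: maintained on writes, ignored by the optimizer.
CREATE INDEX ix_user_created ON orders (user_id, created_at) INVISIBLE;

-- On the replica session, let the optimizer use it and compare plans:
SET SESSION optimizer_switch = 'use_invisible_indexes=on';

-- If metrics improve, promote it everywhere:
ALTER TABLE orders ALTER INDEX ix_user_created VISIBLE;
```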
Full guide:
podostack.com/p/invisible... 🛠️
The catches:
Primary keys and UNIQUE indexes can't go invisible - they enforce constraints, not just serve reads.
Writes still hit the index - so this tests the READ path, not the write path. To get the write amplification back, you still have to drop it.
The workflow:
1. Make the index invisible
2. Wait 24-48 hours, watch p95 and slow query log
3. If nothing broke - DROP INDEX for real
4. If something broke - flip it back
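Steps 1 and 3 in MySQL syntax, using the same illustrative names as below:

```sql
-- Step 1: the optimizer stops using it instantly; no rebuild, no locks.
ALTER TABLE orders ALTER INDEX ix_user_status INVISIBLE;

-- Step 3: only once the observation window comes back clean.
DROP INDEX ix_user_status ON orders;
```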
It turns "risky DDL" into "safe experiment."
Nothing is faster to roll back.
If something regresses, flip it back:
ALTER TABLE orders ALTER INDEX ix_user_status VISIBLE;
Also instant. No CREATE INDEX on a 50M row table. No locking. The index was never physically gone.