Just dropped a follow-up to my shell scripting guide with more techniques based on feedback from the orange site.
Turns out when you ask HN about shells, they don't just complain; they actually have solid advice. From printf vs echo to the mysteries of double brackets.
nochlin.com/blog/6-techn...
Posts by Jason Nochlin
Working on a follow-up to my shell script UX post. What's the most impressive shell script UX feature you've encountered in the wild?
Shell scripts don't have to be cryptic nightmares. The 6 techniques I use to make mine actually pleasant to work with:
nochlin.com/blog/6-techn...
"How Variable-Increment Counting Bloom Filters Work"
Enjoyed reading this post by @nochlin.com about VI-CBFs, a data structure for efficiently checking set membership, with support for deletions and a reduced false positive rate compared to regular CBFs.
nochlin.com/blog/variabl...
Thanks for the share @gunnarmorling.dev ☺️
Just wrote a blog post that covers two of my favorite, underutilized software engineering topics: bloom filter variants and using mathematical insights to optimize data structures.
The post summarizes a 2012 paper "The Variable-Increment Counting Bloom Filter"
nochlin.com/blog/variabl...
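A toy sketch of the VI-CBF idea from the post: each element bumps its k counters by a variable increment drawn from a set D, so a query can sometimes tell that a matching counter value couldn't have come from this element. The increment set, sizes, and hash construction here are illustrative choices, not the paper's recommended parameters.

```python
import hashlib

class VICBF:
    def __init__(self, m=64, k=3, D=(4, 5, 6, 7)):
        # D is chosen so the sum of any two increments (>= 8) exceeds
        # max(D); a counter equal to a D-value thus holds exactly one element.
        self.m, self.k, self.D = m, k, list(D)
        self.counters = [0] * m

    def _slots(self, item):
        # Derive k (counter index, increment) pairs from independent hashes.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            idx = int.from_bytes(h[:4], "big") % self.m
            inc = self.D[int.from_bytes(h[4:8], "big") % len(self.D)]
            yield idx, inc

    def add(self, item):
        for idx, inc in self._slots(item):
            self.counters[idx] += inc

    def remove(self, item):  # only remove items that were actually added
        for idx, inc in self._slots(item):
            self.counters[idx] -= inc

    def might_contain(self, item):
        for idx, inc in self._slots(item):
            c = self.counters[idx]
            if c == 0 or c < inc:
                return False  # definitely not present
            if c in self.D and c != inc:
                # exactly one element hashed here, and its increment
                # wasn't ours -> definitely not present
                return False
        return True  # possibly present; false positives still possible
```

The extra "couldn't have been us" checks are what cut the false-positive rate relative to a plain CBF, while the counters still support deletion.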
I'm not sure if there's a cleaner way; I'm pretty new to the tool
from dagster import job, op, schedule, Definitions, RetryPolicy
import requests

@op(retry_policy=RetryPolicy(max_retries=3, delay=60))  # 1 minute between retries
def check_api_state():
    response = requests.get("https://api.example.com/status")
    data = response.json()
    # Return None if conditions aren't met; the downstream op checks for this
    if data["status"] != "READY" or data["queue_size"] > 100:
        return None
    return data

@op
def process_data(context, api_state):
    # Skip the processing logic if api_state is None
    if not api_state:
        context.log.info("Conditions not met, skipping processing")
        return
    # Your processing logic here
    context.log.info(f"Processing data with state: {api_state}")

@job
def conditional_job():
    state = check_api_state()
    process_data(state)

@schedule(
    job=conditional_job,
    cron_schedule="0 * * * *"  # run hourly
)
def hourly_schedule():
    return {}

defs = Definitions(jobs=[conditional_job], schedules=[hourly_schedule])
Here's an example of a simple pipeline configured with Dagster
I recently evaluated orchestration tools and landed on Dagster b/c it seemed the simplest (can run on a SQLite DB) and is configured with code. Planning to use the open-source version and host it myself, but haven't implemented it yet.
1. Quickly prototype data transformation in Python using DuckDB + Python Functions
2. Use LLM to rewrite Python Functions to C
3. Enjoy a 200x performance improvement
@joshuawood.net thanks for the awesome sticker! Goes great with my jiu jitsu griffin 🔥