It turns out that the 'weakest' tic-tac-toe opening is actually the best one — if your opponent occasionally makes mistakes.
New post on what games can teach us about coordination cost in distributed systems: jhellerstein.github.io/blog/two-com...
Posts by Joe Hellerstein
📄 New blog post!
AI will soon write most distributed code.
Distributed code is where our worst bugs —Heisenbugs— live.
The real lever isn’t “test more,” it’s **aim better**: AI should target frameworks where correctness contracts are explicit and checkable.
jhellerstein.github.io/blog/codegen...
Let’s collaborate on democratizing insights from tabular data in Amsterdam! ✨
PhD directions: 1) fundamental techniques for tabular foundation models, 2) reliable mechanisms for AI-powered tabular data analysis.
Sharing w/ friends appreciated! ⬇️
The last blog post in my miniseries on CRDTs is up!
jhellerstein.github.io/blog/crdt-in...
Mix of pragmatism and formalism.
There's actually a small result in there that may be novel: Strong Eventual Consistency !=> Determinism. Curious to hear whether they've seen this result elsewhere.
posted today!
BTW I peeked at the automerge Rust? Collaborative editing is an example where one probably *has* to resort to unsafe behavior (you're the expert there!) so I'm mostly advocating for more encapsulation/comments in that case.
jhellerstein.github.io/blog/crdt-do...
Next blog post in the CRDT Series is up!
This one is for the developers... stay safe out there, folks.
jhellerstein.github.io/blog/crdt-do...
Good thread. Thoughtful as always.
Really early and well seen, definitely influenced me and my team! Hats off.
Depends what you want the “set of lists” semantics to mean. I’d think you likely want a 2P-map lattice of RGAs (2P-map would be like a 2P-set but with a lattice value associated with each unique item in adds). If you want more detail please comment in the blog so it’s easier for others to find it.
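As a rough sketch of the 2P-map idea described above: the map below keeps a lattice value per key in `adds` plus a set of removed keys, and merges by joining values key-wise and taking the union of removals. All names here (`TwoPMap`, `put`, `live`, `merge_value`) are hypothetical. A grow-only set under union stands in for the RGA value lattice, so unlike an RGA it does not preserve list order.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable, Set


@dataclass
class TwoPMap:
    """2P-map sketch: like a 2P-set, but each added key carries a lattice value."""
    merge_value: Callable  # join of the per-key value lattice (assumed commutative, associative, idempotent)
    adds: Dict[Hashable, object] = field(default_factory=dict)
    removes: Set[Hashable] = field(default_factory=set)

    def put(self, key, value):
        # Join with any value already stored, so puts never lose information.
        self.adds[key] = self.merge_value(self.adds[key], value) if key in self.adds else value

    def remove(self, key):
        # As in a 2P-set, a removed key can never come back.
        self.removes.add(key)

    def merge(self, other):
        out = TwoPMap(self.merge_value, dict(self.adds), self.removes | other.removes)
        for key, value in other.adds.items():
            out.put(key, value)
        return out

    def live(self):
        # Visible entries: added and not removed.
        return {k: v for k, v in self.adds.items() if k not in self.removes}


union = frozenset.union  # stand-in value lattice: grow-only set

a = TwoPMap(union)
a.put("todo", frozenset({"milk"}))
a.put("done", frozenset({"tea"}))

b = TwoPMap(union)
b.put("todo", frozenset({"eggs"}))
b.remove("done")

ab = a.merge(b)
ba = b.merge(a)
```

Both merge orders give the same state: `"todo"` holds the union of the two sides, and `"done"` stays hidden because one replica removed it.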
There are simple and helpful composites that can be written generically and reused safely. E.g. lattice pairs (free or lexical) and Map lattices. Helps to have a language with good support for generics (parameterized types).
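To make the point about generic composites concrete, here is a minimal sketch of a lexical pair and a map lattice written once and reused over any value type that exposes `merge`. The class names (`MaxInt`, `LexPair`, `MapLattice`) are illustrative, and the lexical pair assumes its first component is a totally ordered integer version.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Generic, Hashable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")  # any type with merge(other) returning the same type


@dataclass(frozen=True)
class MaxInt:
    """Integers under max: a simple join-semilattice used as a building block."""
    value: int

    def merge(self, other: MaxInt) -> MaxInt:
        return MaxInt(max(self.value, other.value))


@dataclass(frozen=True)
class LexPair(Generic[V]):
    """Lexical pair: the higher version wins outright; equal versions merge payloads."""
    version: int
    payload: V

    def merge(self, other: LexPair[V]) -> LexPair[V]:
        if self.version != other.version:
            return self if self.version > other.version else other
        return LexPair(self.version, self.payload.merge(other.payload))


@dataclass
class MapLattice(Generic[K, V]):
    """Map lattice: merge key-wise; a key present on only one side is kept as-is."""
    entries: Dict[K, V] = field(default_factory=dict)

    def merge(self, other: MapLattice[K, V]) -> MapLattice[K, V]:
        out = dict(self.entries)
        for key, value in other.entries.items():
            out[key] = out[key].merge(value) if key in out else value
        return MapLattice(out)


m1 = MapLattice({"x": MaxInt(1), "y": MaxInt(5)})
m2 = MapLattice({"x": MaxInt(3), "z": MaxInt(2)})
merged = m1.merge(m2)

newer = LexPair(2, MaxInt(1)).merge(LexPair(1, MaxInt(9)))
tied = LexPair(2, MaxInt(1)).merge(LexPair(2, MaxInt(4)))
```

Because the composites only rely on `merge`, they nest freely: a `MapLattice` of `LexPair` values needs no new code.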
(Catching up to my LI feed).
Next blog post is out! This is the first real post in a short series on CRDTs, an idea that has some currency in the distributed programming community, but one that comes with a number of sharp edges. Be careful out there!
jhellerstein.github.io/blog/crdt-tu...
Blog relaunch! Bye-bye WordPress, hello GitHub.
If you're into SW dev, cloud, databases, distributed systems, automatic codegen ... or data and CS in general... check it out.
As a warmup, I'm starting with a series of posts on CRDTs. Intro post up now: jhellerstein.github.io/blog/crdt-in...
Wow! @arvind.bsky.social giving an awesome keynote including discussion of VegaExpress and GoFish interactive vis libraries from his group. #EPICRetreat #UCBerkeley.
Here’s a provocative example from JD Zamfirescu-Pereira on ways that humans and LLMs can get misaligned on expectations. Is the LLM lying? Is it just emitting tokens? How do people interpret this? #EPICRetreat #UCBerkeley.
The SF Systems Meetup is back! On 2/27, we're excited to have headline talks from the creator of FizzBee and a research collaborator with Signal. This is going to be a super fun night diving deep into making distributed protocols work, hope you'll join us! lu.ma/vqjf30k3
GPT4o shows that f(a,b) = (a+b)/2 is an example of a commutative function that is not associative.
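The claim is easy to check by hand or in a few lines. A minimal sketch, where the function name `avg` and the sample values 1, 2, 4 are illustrative choices rather than anything from the thread:

```python
from fractions import Fraction


def avg(a, b):
    """Arithmetic mean of two numbers: f(a, b) = (a + b) / 2."""
    return (a + b) / 2


# Exact rationals keep floating-point rounding out of the comparison.
a, b, c = Fraction(1), Fraction(2), Fraction(4)

commutative = avg(a, b) == avg(b, a)  # swapping arguments changes nothing

left = avg(avg(a, b), c)   # ((1 + 2)/2 + 4)/2 = 11/4
right = avg(a, avg(b, c))  # (1 + (2 + 4)/2)/2 = 2
associative = left == right
```

Here `commutative` is true while `left` and `right` differ, so regrouping changes the result: averaging pairwise weights the last-combined value more heavily.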
GPT4o did better:
GPT4 asserts that Min and Max functions are commutative but not associative, but then checks itself and backtracks.
The question: "what are examples of commutative functions that are not associative?"
GPT4 was funny, thinking aloud and then proving itself wrong:
In some kind of sad watershed, today was the day as a professor when I live-ChatGPT'ed the answer to a question in a Zoom with my PhD student and his undergrad mentees.
But hey, let's paint it in a positive light: this was a demonstration of using the right tool at the right time.
Operationalizing Machine Learning: An Interview Study by @joehellerstein.bsky.social, @adityagp.bsky.social, et al. Particularly love the part on "Retrofitting Explanations".
#MachineLearning #MLOps #Datascience.
arxiv.org/pdf/2209.09125
I think “getting all of your coordination under one roof” (or behind a unified api or something) is the message I’m hearing from you. Don’t know if that helps?
A muddled post at best. A sequential log *is* a point of coordination. It doesn't avoid coordination as claimed; it just centralizes it in one service (and arguably encourages overuse). Coordination avoidance is orthogonal: discover when global ordering is not needed. I.e., avoidance avoids the log!
Sunset in #Berkeley these days is a perfect field goal over the Golden Gate Bridge. Shifts quite a ways north during the summer.
"What's new in Excel" dialog box. The text says "Data Aggregation Functions: We've added two incredibly powerful new data aggregation functions: GROUPBY and PIVOTBY"
2025. What a time to be alive!
Fickle faculty followup follies
It’s incredibly beautiful that President Carter is our emissary on a Voyager probe. His words live on across our galaxy!
The culture in my community in CS has long been to share course materials openly. My lecture videos+notes are all posted public online, as are those of many of my peers. If anything there's some competition for attention.
No judgement implied, just an interesting difference in community norms.
Thrilled to share that our paper “Flo: A Semantic Foundation for Progressive Stream Processing” (with @mpmilano.bsky.social, Alvin Cheung, and @joehellerstein.bsky.social) will appear at POPL 2025! Check out the preprint at arxiv.org/abs/2411.08274, and read on for more!
An egret walking in the San Francisco Bay with the sunset behind the Golden Gate Bridge
Silhouettes of people by the San Francisco Bay at sunset with the Golden Gate Bridge in the background
San Francisco Bay at sunset with the Golden Gate Bridge
Sunset over SF looked promising again today so we went down to the bay to take it in.