YouTube: www.youtube.com/watch?v=-zAh...
GitHub (slides & code): github.com/colimit-ai/t...
Dafny: dafny.org
Posts by larrytheliquid
I hope this can act both as a quick intro to Dafny, as well as reference material to understand what's going on when you move beyond the basics.
This talk is on a language called Dafny, which IMO does the best job of making formal verification accessible to programmers without previous experience in the area.
It's been over a decade since I've given a public tech talk, so I was super happy when @eatonphil.bsky.social said there was an opening at NYC Systems.
I'll be available in SF this Tuesday evening and any time Thursday. Would be great to meet up with formal verification minded folks there :) DMs open
LLMs automate the translation between these languages, and formal verification checks their work.
The magic is formal proof of semantics preservation during this translation, mediated by Boogie as the intermediate representation language (amenable to formal verification via SMT solvers).
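The "LLM translates, verification checks" idea can be sketched in a few lines of Python. This is a toy illustration with hypothetical names, and it uses a property-based equivalence check on sample inputs as a lightweight stand-in for the real formal proof of semantics preservation:

```python
# Toy sketch: an LLM proposes a translation; a checker rejects it unless
# it agrees with the reference. All names here are hypothetical.

def reference_sum_of_squares(xs):
    # Unambiguous pseudocode-style statement of intent.
    total = 0
    for x in xs:
        total += x * x
    return total

def translated_sum_of_squares(xs):
    # Candidate optimized translation (e.g. produced by an LLM).
    return sum(x * x for x in xs)

def check_translation(candidate, reference, inputs):
    # Verification step: accept the translation only if it matches the
    # reference on every checked input (a real system would prove this
    # for *all* inputs via an SMT solver, not just sample them).
    return all(candidate(xs) == reference(xs) for xs in inputs)

inputs = [[], [1], [2, 3], list(range(10)), [-4, 5, -6]]
assert check_translation(translated_sum_of_squares, reference_sum_of_squares, inputs)
```

The key design point is that the translator is untrusted: only the checker needs to be right, which is exactly the division of labor the post describes.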
This research from MSR et al. is a great step in this direction. It shows how you can start with Python/Pandas as your unambiguous pseudocode intent language, and compile to inlined SparkSQL for performance.
Paper: dl.acm.org/doi/10.1145/...
1. Ambiguous but flexible natural language descriptions of intent
2. Unambiguous pseudocode to review and iterate on
3. Synthesis/compilation to your implementation language, formally verified to "refine" your pseudocode
In the age of LLMs, program synthesis is now mainstream. Currently everyone is cowboy-coding straight from natural language directly to implementations. But to get consistency and fewer bugs, what you really want is:
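The pipeline above hinges on the implementation "refining" the pseudocode. A toy Python sketch of what checking refinement means (hypothetical example, not any particular tool's API): the spec may allow several valid answers, and the implementation must always produce one of them:

```python
# Toy illustration of refinement: the pseudocode-level spec permits a set
# of acceptable answers; a correct implementation picks one of them.

def spec_allowed_answers(xs):
    # Spec: any index of a maximal element is acceptable.
    m = max(xs)
    return {i for i, x in enumerate(xs) if x == m}

def impl_argmax(xs):
    # Implementation: deterministically returns the first maximal index.
    best = 0
    for i in range(1, len(xs)):
        if xs[i] > xs[best]:
            best = i
    return best

def refines(impl, spec, inputs):
    # impl refines spec if every output it produces is allowed by the spec.
    return all(impl(xs) in spec(xs) for xs in inputs)

assert refines(impl_argmax, spec_allowed_answers, [[3, 1, 3], [5], [1, 2, 2]])
```

This is why refinement is the right relation for step 3: the implementation is free to resolve any nondeterminism in the pseudocode, as long as it never steps outside what the pseudocode permits.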
that's cool! one other benefit (if you're into that kinda thing) is you get static artifacts in CI that are easy to feed into LLMs for debugging assistance too
nice, are there going to be any mendler-style encodings?
Now when Colimit autofixes a failed GitHub action, but you want the fix done slightly differently, you can give it feedback and regenerate :)
I recently got colimit.io to support GitHub personal/user accounts with the same level of feature parity as organization accounts, so feel free to give it a spin on personal projects :)
There's also a discord channel for feedback or casual talk about auto-formalization: discord.gg/2jkP3m3Rhv
Interesting! Yeah I'm hoping that long term, LLM-powered auto-modeling/auto-formalization can finally make the fruits of modeling get absorbed by the world without needing to do the dirty work of planting the seeds etc :)
Thanks! Originally it hurt to see people get excited about a part of the demo that I thought was just added value, compared to the parts I was personally more invested in and that were more difficult to make.
But once it happened enough times, it was too big to ignore from a business standpoint.
As our auto-modeling takes shape, we will expand from shallow bug fixing to deep bug fixing, using formal models as our technical leverage. Please join us by checking out the site, demo, or joining our discord :)
Site: colimit.io
Demo: www.loom.com/share/23aad3...
Discord: discord.gg/2jkP3m3Rhv 9/9
Our current autofixer integrates with GitHub Actions and I'm excited to announce that we just launched it for Early Access!
Right now it's mostly useful for tedious but frequently occurring bugs, like linting failures or mismatched expectations in basic unit or integration tests. 8/9
The mission is the same: increasing software reliability by popularizing [semi]formal methods. But this prioritization allows it to grow as a business. 7/9
So we made a hard choice and reversed our roadmap:
1. First, auto-fixing as a service
2. Auto-modeling. Models act as "context compression" of a codebase, and can be checked for logical consistency w/o builds & running tests.
3. Reintroduce bug-finding, now palatable via auto-generated models. 6/9
Luckily, we noticed during our demos that one part of our product lit up people's eyes: the auto-generated "debugging notes", and the associated auto-fixer for the bugs that Colimit's bug-finder identified. With LLMs these days, this part is a lot easier to get off the ground. 5/9
For business uptake, any amount of extra work quickly kills adoption/onboarding, even if the results are great. The answer was clear: use LLMs to auto-model/auto-formalize. But that gives Colimit as a company even more work and less runway. 4/9
We found real bugs for our design partners, giving them room to refactor with confidence. At first we wrote the models for them, but when trying to hand off the modeling we learned a hard lesson. Even if modeling in our DSL is approx the same effort as writing tests, it's still something new. 3/9
The initial goal was to make model-checking and model-based testing (and even formalization) more widespread by creating an accessible TypeScript-like modeling language, paired with a cloud that executed verification and tests at scale. 2/9
Hi everyone, I'm making a low-key version of this post here on bsky first: my startup @colimit.bsky.social recently pivoted from being a bug-finding service to a bug-fixing service for failed CI builds (initially GitHub Actions): colimit.io.
🧵 on why and sharing appreciated! 1/9