The layers of deadlock safety, in my mind:
1. Compile-time ordering.
2. Pre-declaration (e.g. LockSet): acquisition guaranteed to succeed at runtime.
3. Runtime ordering.
My thought is that fundamentally (2) should be built on (3), a generic ordering mechanism, so users have an escape hatch down to (3) when necessary.
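A minimal sketch of that layering, under assumptions: `KeyedMutex`, `acquire`, and `lock_set` are hypothetical names (not the crate's real API), keys are unique and nonzero, and out-of-order acquisition panics. Layer 2 (pre-declaration) is just a sort on top of the layer-3 check, which is why it can be guaranteed to succeed:

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical: a mutex tagged with a runtime ordering key.
struct KeyedMutex<T> {
    key: u64,
    inner: Mutex<T>,
}

impl<T> KeyedMutex<T> {
    fn new(key: u64, value: T) -> Self {
        Self { key, inner: Mutex::new(value) }
    }
}

// Layer 3, the escape hatch: acquire one lock at a time. The caller passes
// the key of the last lock it acquired (0 if none); violations panic.
fn acquire<T>(m: &KeyedMutex<T>, last_key: u64) -> (MutexGuard<'_, T>, u64) {
    assert!(
        m.key > last_key,
        "lock order violation: {} after {}",
        m.key,
        last_key
    );
    (m.inner.lock().unwrap(), m.key)
}

// Layer 2 built on layer 3: pre-declare all the locks, sort by key, then
// acquire in order. The runtime check in `acquire` can never fire here
// (assuming unique, nonzero keys), so this is panic-free by construction.
fn lock_set<'a, T>(locks: &mut Vec<&'a KeyedMutex<T>>) -> Vec<MutexGuard<'a, T>> {
    locks.sort_by_key(|m| m.key);
    let mut last = 0;
    locks
        .iter()
        .copied()
        .map(|m| {
            let (g, k) = acquire(m, last);
            last = k;
            g
        })
        .collect()
}
```

Because `lock_set` sorts before acquiring, every caller converges on the same global order, which is the whole trick behind the pre-declaration layer.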
Posts by Jordan Isaacs
Yes to user orderings. Since I didn’t think of the LockSet/pre-declaration API, there’s no auto assignment (yet, at least). Auto assignment isn’t useful without the pre-declaration API, since the app doesn’t know what order to acquire in itself.
Under the hood this works by overloading the ZST lock guards: the ordered lock guard stores the ZST plus the last key. The guards thus represent a "stack" of ordering keys, allowing ordered values to be released (popped) from and acquired (pushed) onto the runtime lock stack. codeberg.org/snowytrees/l...
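A rough sketch of the push/pop idea, with hypothetical names (`DbLevel`, `OrderedGuard`, `push` are stand-ins, not the crate's real types) and the simplifying assumption that guards are released LIFO, as stacked guards naturally are:

```rust
use std::cell::Cell;
use std::marker::PhantomData;

thread_local! {
    // Top of this thread's stack of ordering keys.
    static TOP_KEY: Cell<u64> = Cell::new(0);
}

// Hypothetical ZST standing in for a compile-time lock-level guard.
struct DbLevel;

// The ordered guard stores the level ZST plus the key that was on top
// before this acquisition, so guards form a stack: acquiring pushes a key,
// dropping pops back to the previous one.
struct OrderedGuard<L> {
    _level: PhantomData<L>,
    prev_key: u64,
}

// Push: acquiring asserts the new key is above the current top of stack.
fn push<L>(key: u64) -> OrderedGuard<L> {
    let prev = TOP_KEY.with(|c| c.get());
    assert!(key > prev, "ordered acquisition violated: {key} after {prev}");
    TOP_KEY.with(|c| c.set(key));
    OrderedGuard { _level: PhantomData, prev_key: prev }
}

impl<L> Drop for OrderedGuard<L> {
    fn drop(&mut self) {
        // Pop: restore the previous top-of-stack key (assumes LIFO release).
        TOP_KEY.with(|c| c.set(self.prev_key));
    }
}
```

Note that popping a guard lets the same key range be reacquired afterwards, which is what makes release-then-acquire patterns work within a level.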
To demonstrate this I open sourced my version (it implements the lock guard suggestion from the original lock-ordering GH issue, plus an ordered map). See codeberg.org/snowytrees/l... for the ordered map use case.
I think you can hide this from non-advanced users by doing a default impl with auto-ordering.
Not exactly. For _runtime_ ordering within a level, I am suggesting a slight adjustment for a more flexible API: let the application provide custom ordering. Then (with less safety) one can acquire locks one at a time, in an order the app knows, with panic safety. `LockSet` can be a safer API on top.
It’s more that I saw a lot of parallels between the design and this issue from that project (filed by one of the Fuchsia engineers). github.com/akonradi/loc...
Not knocking the post or the crate, it’s awesome!
The LockSet idea is really cool, but it misses out on some flexibility. My understanding is you pre-declare the locks you need (LockSet does the ordering internally). But it’s common not to know which locks you need upfront; for that, the application needs to be in control of ordering.
Also, this doesn’t work for runtime locks, e.g. `Map<String, Lock<Value>>`. I came up with the “OrderedLock”: you specify it logically at compile time to fit into the DAG, and at runtime each lock has an ordered key. It then asserts ordered acquisition, which fits within the compile-time-enforced ordering.
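A sketch of the runtime-lock use case, under assumptions: `OrderedLock` and `OrderedMap` here are hypothetical stand-ins for the real types, each lock gets a monotonically increasing key at insert time, and acquisition asserts the keys only go up:

```rust
use std::collections::HashMap;
use std::sync::{Mutex, MutexGuard};

// Hypothetical: a runtime lock carrying its ordering key.
struct OrderedLock<T> {
    key: u64,
    inner: Mutex<T>,
}

// A map of runtime locks: each inserted lock is assigned the next key,
// so iteration/acquisition in key order is always deadlock-consistent.
struct OrderedMap<V> {
    next_key: u64,
    map: HashMap<String, OrderedLock<V>>,
}

impl<V> OrderedMap<V> {
    fn new() -> Self {
        Self { next_key: 1, map: HashMap::new() }
    }

    fn insert(&mut self, name: &str, value: V) {
        let lock = OrderedLock { key: self.next_key, inner: Mutex::new(value) };
        self.next_key += 1;
        self.map.insert(name.to_string(), lock);
    }

    // Acquire by name; `last_key` is the key of the most recently acquired
    // lock (0 if none held). Panics on out-of-order acquisition, which is
    // the runtime assertion that fits inside the compile-time DAG level.
    fn lock(&self, name: &str, last_key: u64) -> (MutexGuard<'_, V>, u64) {
        let l = &self.map[name];
        assert!(
            l.key > last_key,
            "lock order violation: {} after {}",
            l.key,
            last_key
        );
        (l.inner.lock().unwrap(), l.key)
    }
}
```

The point is that the DAG can only order the map as a whole at compile time; ordering *between entries* has to be a runtime property of each lock.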
Unfortunately, github.com/akonradi/loc... wasn’t included as prior art as well. I have vibe-coded something similar based on it, with a proc macro for specifying the DAG.
You're probably right to pick a modern extension to support vector similarity search in Postgres. But did you know Postgres already has one built in?
I took a look at the cube extension in Postgres, pgvector, and model2vec for some impressively fast embedding generation.
Just felt the earthquake in sf, got a short little ride on my couch 🎢
Finished the overhaul of kernel-development-flake ✨ inspired by virtme-ng, which didn’t support NixOS. kdf is also a Rust-based /init using virtiofs. You can now do fast live dev of kernel modules/in-tree, no more waiting for nix rebuilds github.com/jordanisaacs...
TIL Rust's spawn implementation uses a Unix socket for communication between child & parent (github.com/rust-lang/ru...). This only happens when it can't use posix_spawn (e.g. when using a pre_exec callback). Hit this when running a kernel without Unix sockets in QEMU and got "function not implemented" 🫠
PostgreSQL 14 internals book
Entering my Postgres era 🐘
Well, looks like they didn’t read up on the history of using io_uring safely in Rust. They have the main io_uring issue of sharing buffers with the kernel. github.com/Azure/kimoji...
The classic from withoutboats (2020): without.boats/blog/io-uring/
Omg they are using my rustix-uring library, looks like I’ve got to get back on the maintenance grind 😅 github.com/jordanisaacs...
And the page servers are sharded (with one copy stored on object storage), allowing easy scaling of page servers up and down. It’s the standard `GetPage @ LSN` API.
They have automatic failover of the primary to read replicas (and if the old primary comes back online, it rejoins as a read replica). Since compute nodes are stateless, this is easy for them to do. There’s a mention of ringfencing but no details on it. The video has a nice animation of this.
HorizonDB write path diagram.
WAL writes are majority-quorum from the primary. They are using Rust and thread-per-core io_uring on NVMe. A new open source runtime, kimojio! github.com/Azure/kimoji... Excited to dig into that.
HorizonDB high level architecture diagram.
Some interesting stuff in the talk. It’s a Neon-style durable WAL log with async WAL replay to create pages, all backed by object storage. There is an SSD cache on all nodes.
Today's Future Data Systems Seminar Speaker: Joyo Victor will present @singlestore.com's "Bottle Service" meta-data system that supports database branching, change-data-capture, and Apache Iceberg. Zoom talk open to public at 4:30pm ET. YouTube video available after: db.cs.cmu.edu/events/futur...
So after a day of work I had my 3-line fix :) Add the 'SHF_GNU_RETAIN' flag to the inline asm for the main flag list, and a dummy variable in the macro that uses the view sections to force a GC reference:
`__attribute__((retain,used,section("view-section")))
static const char dummy[0];`
Of course this is where I ended up: at MaskRay’s blog, doing deep reads of linker GC (maskray.me/blog/2021-02...) and metadata sections (maskray.me/blog/2021-01...). I realized the main flag list should always exist and the view should only be kept if in use.
There has actually been a ton of consideration given to handling these sorts of “meta” sections, which have no direct accesses and thus do not participate in normal GC reference tracking. But upon further looking, the dynamic flag code didn’t use any of the special flags. So I found my bug: the linker was GCing it.
Luckily I had two build flavors, one passing and the other failing. I manually went through every linker flag difference to guess which one was causing issues, and '--gc-sections' looked suspicious. The linker has some crazy complexity in how it decides to GC sections.
So dynamic flag automatically generates sections using inline assembly. One section has the list of all flags; then there are arbitrarily many other sections that are a “view” into that list of flags, for more efficient lookups. No section is accessed other than through the linker-provided '__start' and '__stop' symbols.
But the sections existed in the object files, leaving me flummoxed: why wasn’t the linker using them? So I went down one of the most interesting systems rabbit holes I’ve had recently.