
Posts by v

I have the benefit of having an actress in my life, so I get to watch the process of auditioning.

Auditioning is a crap-shoot. You never really know what the director is looking for. They may have someone in mind already and the whole audition is a courtesy / box-check on some grant program they're operating the theater under. You do your best, you take rejection, you suit up and do it again. She does some additional work: she researches the theaters, keeps her ear to the ground, talks to other actors in the area about their experience (that's easier with actors, where the gigs are one-offs; software engineers aren't moving as fast through industry so you have to build a larger contact web to get a richer picture of who's hiring in your area).

But the biggest thing that controls whether you get a role is if you keep showing up. The role finds you; your control over getting the role (unless you're a big enough name that people recognize it) is minimal.

I know it's not the most comfor


redditor explains why job hunting in tech is exactly like auditioning for acting roles

2 months ago 6 0 0 0

In the 90s, Linus Torvalds had a much superior language to write the Linux kernel. But, since he is Finnish, he couldn't Smalltalk

2 months ago 2 0 0 0

I explained SQLite's neat little trick for managing reserved space per page here (5/5)

bsky.app/profile/did:...

5 months ago 0 0 0 0

For example, if we use AEGIS-256 with a nonce size of 32 bytes and a 16-byte tag, you'd need extra space for 48 bytes per page.

During decryption, you read the tag and nonce from the reserved space and provide them to the decryption algorithm. (4/5)
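The 48-byte arithmetic above can be sketched as a page layout. This is a minimal illustration, not how any particular SQLite encryption extension does it: the 4096-byte page size, the nonce-then-tag ordering inside the reserved area, and the helper names `pack_page` / `unpack_page` are all assumptions, and random bytes stand in for real AEGIS-256 output.

```python
import os

PAGE_SIZE = 4096                  # assumed page size
NONCE_LEN = 32                    # AEGIS-256 nonce size, per the post
TAG_LEN = 16                      # 16-byte tag, per the post
RESERVED = NONCE_LEN + TAG_LEN    # 48 bytes of reserved space per page
USABLE = PAGE_SIZE - RESERVED     # bytes left for the encrypted payload


def pack_page(ciphertext, nonce, tag):
    """Lay out one encrypted page as: ciphertext | nonce | tag (assumed order)."""
    if len(ciphertext) != USABLE:
        raise ValueError("ciphertext must fill the usable area exactly")
    if len(nonce) != NONCE_LEN or len(tag) != TAG_LEN:
        raise ValueError("unexpected nonce or tag length")
    return ciphertext + nonce + tag


def unpack_page(page):
    """Split a stored page back into (ciphertext, nonce, tag) for decryption."""
    if len(page) != PAGE_SIZE:
        raise ValueError("not a full page")
    ciphertext = page[:USABLE]
    nonce = page[USABLE:USABLE + NONCE_LEN]
    tag = page[USABLE + NONCE_LEN:]
    return ciphertext, nonce, tag


# Placeholder bytes only: a real implementation would get the ciphertext and
# tag from the AEAD, and would generate a fresh random nonce on every write.
ciphertext = os.urandom(USABLE)
nonce = os.urandom(NONCE_LEN)
tag = os.urandom(TAG_LEN)
page = pack_page(ciphertext, nonce, tag)
```

The point of the layout is that the stored page is still exactly one page long, so nothing outside the encryption layer needs to know the metadata exists.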

5 months ago 0 0 1 0

The size of this tag and nonce varies by algorithm.

To make space, SQLite uses reserved space per page. Once the page is encrypted, this portion can carry the metadata. (3/5)

5 months ago 0 0 1 0

Since we encrypt each page separately, we should use a different nonce for different pages. Even the same page, when encrypted again, should use a different nonce for better security™️

So during encryption, we generate a secure random nonce every time. (2/5)

5 months ago 0 0 1 0

AEADs produce an authentication tag during encryption. For each page, we need a nonce too. Both the nonce & the tag become metadata for an encrypted page

So where do you store them? We could store them separately, but it's much better & neater to store them in the page itself (1/5)

5 months ago 3 0 1 0

Btw, SQLite also has a `secure_delete` pragma, which overwrites deleted content with zeros. (9/9)
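A small sketch of turning that pragma on from Python's built-in `sqlite3` module. The temp-file path is just for the demo, and the default value of the pragma depends on how your SQLite was compiled, so the code sets it explicitly rather than relying on it.

```python
import os
import sqlite3
import tempfile

# Throwaway database file for the demo.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

# Ask SQLite to overwrite deleted content with zeros from now on.
conn.execute("PRAGMA secure_delete = ON")

# Reading the pragma back reports the current mode as an integer.
mode = conn.execute("PRAGMA secure_delete").fetchone()[0]
print("secure_delete mode:", mode)

conn.close()
```

The setting applies to the connection that issues it, so an application that wants this behaviour would run the pragma each time it opens the database.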

5 months ago 0 0 0 0

SQLite uses this neat trick for its "reserved space" feature. Extensions can store any data they want in each page without interfering with B Tree operations.

Extensions like encryption and checksums need to store extra metadata per page, and they use this reserved space. (8/9)
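You can see how much reserved space a given database uses by reading its file header. A rough sketch, based on the documented SQLite file format (page size is a big-endian 16-bit value at byte offset 16, reserved bytes per page is a single byte at offset 20); the function name `page_geometry` and the temp-file setup are made up for the demo.

```python
import os
import sqlite3
import struct
import tempfile


def page_geometry(db_path):
    """Return (page_size, reserved_bytes, usable_bytes) from the file header."""
    with open(db_path, "rb") as f:
        header = f.read(100)  # the SQLite database header is 100 bytes
    (raw_page_size,) = struct.unpack(">H", header[16:18])
    # The format stores a 65536-byte page size as the value 1.
    page_size = 65536 if raw_page_size == 1 else raw_page_size
    reserved = header[20]     # bytes left unused at the end of every page
    return page_size, reserved, page_size - reserved


# Create a small database so there is a header to read.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (x)")
conn.commit()
conn.close()

page_size, reserved, usable = page_geometry(path)
print(page_size, reserved, usable)
```

On a stock build you would normally expect the reserved value to be zero. It becomes non-zero when an extension such as encryption or checksums claims space at the end of each page.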

5 months ago 1 0 1 0
page with reserved space


Now, remember how I said cells start from the rightmost end? If you intentionally leave some space at that end before starting the cells, the B Tree would work exactly the same. That space can contain anything, and the B Tree would never touch it since there's no pointer to it. (7/9)
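Here is a toy slotted page that makes this concrete. It is a sketch under simplifying assumptions, not SQLite's actual page format: the class name `SlottedPage`, the 4-byte header area, and the 2-byte big-endian pointers are all invented for illustration, and the header fields live in Python attributes instead of being serialized.

```python
import struct


class SlottedPage:
    HEADER = 4  # bytes set aside for a header (kept in attributes here)

    def __init__(self, size=4096, reserved=0):
        self.buf = bytearray(size)
        self.usable_end = size - reserved  # cells begin here and grow left
        self.cell_start = self.usable_end
        self.count = 0                     # number of live cell pointers

    def _ptr_off(self, i):
        return self.HEADER + 2 * i         # pointers grow left to right

    def free_space(self):
        return self.cell_start - self._ptr_off(self.count)

    def pointer(self, i):
        return struct.unpack_from(">H", self.buf, self._ptr_off(i))[0]

    def insert(self, cell):
        if len(cell) + 2 > self.free_space():  # cell bytes plus one pointer
            raise ValueError("page full")
        self.cell_start -= len(cell)
        self.buf[self.cell_start:self.cell_start + len(cell)] = cell
        struct.pack_into(">H", self.buf, self._ptr_off(self.count), self.cell_start)
        self.count += 1
        return self.count - 1

    def delete(self, i):
        # Only the pointer array changes: later pointers shift one slot left.
        # The cell bytes are left exactly where they were.
        src_start, src_end = self._ptr_off(i + 1), self._ptr_off(self.count)
        self.buf[self._ptr_off(i):src_end - 2] = self.buf[src_start:src_end]
        self.count -= 1


page = SlottedPage(size=4096, reserved=48)
page.buf[page.usable_end:] = b"\xAA" * 48   # bytes owned by some extension
first = page.insert(b"alice")
second = page.insert(b"secret")
page.delete(second)

print(page.count)                           # live cells after the delete
print(b"secret" in bytes(page.buf))         # stale bytes are still in the page
print(bytes(page.buf[-48:]) == b"\xAA" * 48)
```

Nothing in `insert` or `delete` ever computes an offset at or beyond `usable_end`, which is why the reserved tail survives untouched. The same reasoning explains why the deleted cell's bytes are still sitting in the buffer.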

5 months ago 1 0 1 0

This means garbage data from previous uses can actually get written back to disk. I don't know any other data structure that works like this. (6/9)

5 months ago 0 0 1 0

Zeroing a reused page with memset (the C function typically used to clear a buffer) adds latency, so as an optimization, databases might skip it entirely. Similarly, deleting a cell just removes its pointer, while the data itself might remain as it is. (5/9)

5 months ago 0 0 1 0

It could have garbage data sitting around, but the page would never access it since no cell pointer references it.

Databases also maintain a buffer pool, think of it as a cache of pages loaded from disk. These pages often get reused. (4/9)

5 months ago 0 0 1 0

while the cells themselves grow from right to left, meeting somewhere in the middle.

That middle section is logically free space, but here's the interesting part: it can contain any data. Neither the page nor the B Tree cares about what's in there. (3/9)

5 months ago 0 0 1 0

One common way to organize data within a page is the slotted page structure. It starts with a header, followed by a bunch of cell pointers. These pointers reference cells at the end of the page. As you add more data, the pointers grow from left to right (2/9)

5 months ago 0 0 1 0
The page structure

The B Tree data structure fascinates me. Databases use B Trees to store data on disk, organizing everything into pages that typically range from 4 KB to 8 KB. All I/O operations happen in units of these pages.

The page looks like this... (1/9)

5 months ago 7 0 1 2
Query Limits and Timeouts | MariaDB Documentation

seems it works on Maria too

mariadb.com/docs/server/...

5 months ago 1 0 0 0
MySQL :: MySQL 8.4 Reference Manual :: 6.5.1.6 mysql Client Tips

docs here: dev.mysql.com/doc/refman/8...

5 months ago 1 0 0 0

BTW, starting the mysql client with --safe-updates (which turns on SQL_SAFE_UPDATES) also enables two other protections:

- Limits SELECT to 1,000 rows (no more accidental SELECT * on giant tables)

- Caps joins at 1M row combinations

You can override these with --select-limit and --max-join-size if needed.

5 months ago 1 0 1 0
example usage of mysql safe updates


Pro database tip: enable `SQL_SAFE_UPDATES` in MySQL to avoid accidental UPDATE/DELETE queries without a WHERE clause.

It forces you to use a key in the WHERE clause or a LIMIT, instead of wiping a whole table by mistake at 2:19am.

5 months ago 17 2 2 0

PS: I'm taking some liberties calling these "shards" — they're really just isolated databases. But I'd still call it a single logical db since they share a schema and present a unified interface, even if you can't query across them.

5 months ago 0 0 0 0

You'd have to handle schema management manually or do it in the application layer. You also can't do mass migrations easily.

I call this Limitless Sharding.

5 months ago 0 0 1 0

Even on a tiny machine, I can run millions of SQLite databases.

The downsides: You'd only need this at "scale". The entire architecture assumes you don't need cross-shard queries. So this only works for applications where each document is truly an individual entity.

5 months ago 1 0 1 0

Since some of these embedded databases can run as WASM in the browser, each document could sync its own file directly with a backend db with some CRDT.

I'd bet this model can handle way more writes per second than a sharded Postgres/MySQL setup.

5 months ago 1 0 1 0

A Figma document could just be stored as a SQLite file in the backend. You'd also need libSQL or SQLite with Litestream for backup/replication.

5 months ago 1 0 1 0

If each document is its own database, it can easily handle all requests without breaking a sweat.

Running a million Postgres or MySQL instances sounds crazy and is total overkill. But embedded databases like SQLite, LMDB, and RocksDB fit this perfectly. They're just files!
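A minimal sketch of the one-file-per-document idea using Python's built-in `sqlite3`. The `DocumentStore` class, the `blocks` table, and the `<doc_id>.db` naming are all hypothetical choices for illustration. A real deployment would also need connection caching, backups, and some plan for schema changes.

```python
import os
import sqlite3
import tempfile

# Every document file gets the same schema, applied when it is opened.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS blocks "
    "(id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
)


class DocumentStore:
    """Maps a document id to its own SQLite file under one directory."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def open(self, doc_id):
        if not doc_id.isalnum():  # keep ids safe to use as file names
            raise ValueError("doc_id must be alphanumeric")
        conn = sqlite3.connect(os.path.join(self.root, doc_id + ".db"))
        conn.execute(SCHEMA)
        return conn


store = DocumentStore(tempfile.mkdtemp())

# Writes to different documents touch different files and never contend.
for doc_id, text in [("doc1", "hello"), ("doc2", "world")]:
    conn = store.open(doc_id)
    with conn:  # commits on success
        conn.execute("INSERT INTO blocks (body) VALUES (?)", (text,))
    conn.close()

conn = store.open("doc1")
rows = conn.execute("SELECT body FROM blocks").fetchall()
conn.close()
print(rows)
```

The "router" here is nothing more than a file path lookup, which is the appeal of the design. The cost is that no single query can see more than one document.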

5 months ago 0 0 1 0

What if we ran disconnected databases that all acted as a single logical db?

Think of applications like Notion, Figma, or Google Docs — each document is disconnected from the others. The key insight: each document gets very few write requests since users are making changes manually.

5 months ago 0 0 1 0

Usually there's a proxy that routes queries to the right shards and does aggregation for cross-shard queries. Even though the db is split across multiple machines, it's still logically a single database. They're all "connected".

But...what if we flipped this around?

5 months ago 0 0 1 0

Sharding. Database sharding is one of the common techniques to scale a database horizontally. You split the db into small parts called shards and distribute them across machines.

Shard counts typically run in the few hundreds, or even thousands for extremely large databases.
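For a feel of how rows end up on a particular shard, here is a sketch of hash-based routing. It is one common scheme, chosen for illustration and not something the post specifies. The function names and the four in-memory dicts standing in for shard servers are assumptions.

```python
import hashlib


def shard_for(key, num_shards):
    """Pick a shard by hashing the key (stable across processes and runs)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Plain dicts stand in for separate database machines.
shards = [dict() for _ in range(4)]


def put(key, value):
    shards[shard_for(key, len(shards))][key] = value


def get(key):
    return shards[shard_for(key, len(shards))].get(key)


def count_all():
    # A query spanning shards has to ask every shard and combine the answers.
    return sum(len(s) for s in shards)


for i in range(100):
    put("user%d" % i, i)

print(get("user42"), count_all())
```

With plain modulo routing, changing the number of shards remaps most keys, which is one reason real systems often use consistent hashing or a lookup table instead.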

5 months ago 3 0 1 0

on orange site - news.ycombinator.com...

5 months ago 0 0 0 0