Posts by Garik

Love it. Software is so fun.

2 weeks ago 0 1 0 0
Listers is low-key loaded with good software advice 🦉🐦🦢🦜

4 months ago 1 0 0 0

Turns out it was a bad GIF. Whether it was the title or the file itself I'm not sure.

Regardless, better error descriptions and handling in general lead to a better world :)

4 months ago 0 0 0 0
How S3 works

🧵 19/19 To recap, S3 is the world's hard drive.

It's cheap, fast enough, and extremely reliable for most scenarios.

Smart system design helps AWS cut costs, save on compute and storage, while also maintaining high availability and acceptable latency.

Now you know!

/rant

4 months ago 0 0 0 0
S3 is alive!?

https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html

🧵 18/19 Crazier still is that AWS doesn't just add more and more features; they proactively edit their codebase.

In 2021 they rewrote a fundamental service, ShardStore, in Rust. It has 40k LOC and is frequently updated without interruptions to service.

4 months ago 0 0 1 0
S3 Metadata for better data access and usability

S3 Tables for better performance, access control, and structure

🧵 17/19 S3 Tables deals with Parquet files and can optimize how they're squished together and stashed away.

Meanwhile, S3 Metadata makes it much easier to search and organize all that data. It makes the data that is already present more useful.

4 months ago 0 0 1 0
Big Data has seen big growth over the last decade and is not slowing down

🧵 16/19 One of the newer, bigger user groups on S3 is data analytics. Think endless stuff in data lakes.

It's one thing to get data, but how do you use it?

AWS is making changes to their infra based on this shift, and they have now automated key workflows that users previously had to implement themselves.

4 months ago 0 0 1 0
One example of SSDs reducing overall costs and compute for a customer

🧵 15/19 The most recent storage class even uses SSDs!

This makes sense because using expensive SSDs for the right data means savings in both time and money, for AWS as well as customers.

It acts as S3's RAM--serving objects with very low latency so elsewhere compute is minimized.

4 months ago 0 0 1 0
S3 storage classes vary greatly in retrieval time and use cases

https://cloudiamo.com/2024/12/13/s3-lifecycle-or-intelligent-tiering-object-size-always-matters/

🧵 14/19 Similarly, S3 offers many different tiers of storage to optimize the spread of hot and cold data throughout the system.

You can make rules yourself with S3 Lifecycle or automate the process with Intelligent-Tiering.

Retrieval times vary more than 10^6x from single-digit ms to 12 hours 🕦
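
A quick sanity check on that ratio, as a sketch. The 9 ms figure is my assumed worst case for "single-digit ms", not a number published by AWS.

```python
# Rough check of the "more than 10^6x" spread in retrieval times.
# Assumption: 9 ms is the slowest "single-digit ms" case.
FAST_MS = 9
SLOW_MS = 12 * 60 * 60 * 1000  # 12 hours expressed in milliseconds


def retrieval_ratio(slow_ms: int = SLOW_MS, fast_ms: int = FAST_MS) -> float:
    """How many times slower the slowest tier is than the fastest."""
    return slow_ms / fast_ms


if __name__ == "__main__":
    print(f"{retrieval_ratio():,.0f}x")
```

Even with the slowest single-digit latency, the gap stays above a million to one.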

4 months ago 0 0 1 0
Balancing a growing operation by moving cold data on to new racks first for even distribution as hot data arrives

🧵 13/19 That's where balancing data comes in.

AWS engineers designed the system to preload colder data onto new storage racks to maintain an even distribution as newer, hotter data arrives.

4 months ago 0 0 1 0
Data access over time in S3 cools down

🧵 12/19 When data moves into S3 it starts hot and gradually cools, i.e. it's used more often when it's young and is accessed less frequently as it ages.

This fact of life could mess with operations if not dealt with properly.

4 months ago 0 0 1 0
🧵 11/19 Another cool feature of S3 is that it becomes more predictable and resilient as it scales.

Since data is accessed across hard drives, and you can't really forecast reads from customers, having a large operation smooths out aggregate demand.

As the service grows, it becomes less spiky.
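
One way to see the smoothing, as a back-of-the-envelope sketch. It assumes every workload is independent and identically distributed, which real customer traffic is not, and `aggregate_cv` is a name I made up for this example.

```python
import math


def aggregate_cv(single_cv: float, num_workloads: int) -> float:
    """Relative spikiness (std / mean) of the summed demand.

    Assumption: workloads are independent with the same mean and std.
    The total mean grows with n while the total std grows with sqrt(n),
    so the ratio shrinks by a factor of sqrt(n).
    """
    return single_cv / math.sqrt(num_workloads)


if __name__ == "__main__":
    for n in (1, 100, 10_000, 1_000_000):
        print(f"{n:>9} workloads -> relative spikiness {aggregate_cv(1.0, n):.4f}")
```

Under those assumptions, a hundred times more workloads means ten times less relative spikiness.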

4 months ago 1 0 1 0
Shuffle sharding and erasure coding can eliminate long tails when it comes to latency. Long requests are retried after exceeding p95 times

https://youtu.be/NXehLy7IiPM?si=sUv9AY6Xs7RCHgs_&t=1948

🧵 10/19 Erasure coding also helps with testing code in production. Since everything is super redundant, it's fine if things break in prod.

S3 even eliminates long tails by canceling requests that go over its p95. The request is resent to a different server, and thus a different shard. And it works!
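
A toy model of the cancel-and-retry idea, not S3's real client logic. The 50 ms cutoff stands in for a p95 value, and all the latencies are invented for illustration.

```python
def latency_with_retry(first_ms: float, retry_ms: float, cutoff_ms: float) -> float:
    """Latency the caller sees when a slow request is canceled at
    cutoff_ms and resent once to a different server."""
    if first_ms <= cutoff_ms:
        return first_ms          # fast enough, no retry needed
    return cutoff_ms + retry_ms  # time already waited plus the retry


if __name__ == "__main__":
    CUTOFF_MS = 50  # hypothetical p95
    print(latency_with_retry(first_ms=10, retry_ms=12, cutoff_ms=CUTOFF_MS))
    print(latency_with_retry(first_ms=900, retry_ms=12, cutoff_ms=CUTOFF_MS))
```

In this model a request stuck for 900 ms comes back in 62 ms instead, as long as the retry lands on a healthy server.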

4 months ago 0 0 1 0
Erasure coding is safe and efficient but compute heavy

🧵 9/19 S3 achieves such high durability through erasure coding, essentially splitting up objects into chunks. It also does some voodoo magic on that data (computing extra parity chunks) and stores that, too.

The advantage of this approach is that instead of needing 3x storage from straight replication, data can be safe at 1.8x.
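
Where a number like 1.8x can come from, as a sketch. The 5 data + 4 parity split is an assumption picked because it produces 1.8x; the thread itself doesn't state the scheme.

```python
def storage_overhead(data_shards: int, parity_shards: int) -> float:
    """Bytes stored per byte of user data."""
    return (data_shards + parity_shards) / data_shards


def tolerated_losses(parity_shards: int) -> int:
    """Shards that can be lost while the object stays recoverable.

    Assumption: an MDS code such as Reed-Solomon, where any
    data_shards of the stored shards are enough to rebuild the object.
    """
    return parity_shards


if __name__ == "__main__":
    # Straight 3x replication: 1 original plus 2 full copies.
    print("replication:", storage_overhead(1, 2), "x, survives", tolerated_losses(2), "losses")
    # Assumed erasure-coded layout: 5 data shards plus 4 parity shards.
    print("erasure:    ", storage_overhead(5, 4), "x, survives", tolerated_losses(4), "losses")
```

With these assumed numbers the erasure-coded layout uses less space than replication and still survives more lost shards.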

4 months ago 0 0 1 0
That's basically never

🧵 8/19 And not only is it big, it is reliable.

S3 is designed for 99.999999999% data durability. Famously that's 11 nines.

Say you had 10,000 objects in S3. Math dictates that you'd lose 1 object, on average, every 10,000,000 years.
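
The arithmetic behind that, as a small sketch. It assumes the 11 nines figure is a per-object, per-year survival probability and that losses are independent.

```python
from fractions import Fraction

# 11 nines of durability means a 1 in 10^11 chance of losing a given
# object in a given year (assumed reading of the figure).
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def years_per_lost_object(num_objects: int) -> Fraction:
    """Average number of years between single-object losses."""
    expected_losses_per_year = num_objects * ANNUAL_LOSS_PROBABILITY
    return 1 / expected_losses_per_year


if __name__ == "__main__":
    print(years_per_lost_object(10_000))      # 10,000 objects
    print(years_per_lost_object(10_000_000))  # 10,000,000 objects
```

Exact fractions are used so the result isn't blurred by floating-point rounding.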

4 months ago 0 0 1 0
S3 by the numbers

🧵 7/19 To say S3 is a large service is an understatement. Look at these numbers 👀

4 months ago 0 0 1 0
The evolution of HDDs is impressive

https://highscalability.com/behind-aws-s3s-massive-scale/

🧵 6/19 For the past 30 years HDDs have been stuck at 120 IOPS. And they might be forever.

Progress elsewhere, however, isn't slowing yet, and there are already solid sketches of 200TB drives within the next 10 years.

So, design around the constraint! Shard, and shard hard.
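
Why flat IOPS hurts more as drives grow, in sketch form. The 20 TB drive size is an assumed stand-in for a current drive; 120 IOPS and 200 TB are the figures from this post.

```python
def iops_per_tb(drive_iops: float, capacity_tb: float) -> float:
    """I/O operations per second available for each terabyte stored."""
    return drive_iops / capacity_tb


if __name__ == "__main__":
    DRIVE_IOPS = 120  # roughly constant per drive, per the post
    for capacity_tb in (20, 200):  # 20 TB is an assumed present-day size
        print(f"{capacity_tb:>3} TB drive -> {iops_per_tb(DRIVE_IOPS, capacity_tb)} IOPS per TB")
```

Each terabyte gets a thinner slice of I/O as capacity grows, so hot data has to be spread over many drives.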

4 months ago 0 0 1 0
HDDs are beyond accurate

https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html

🧵 5/19 Slight detour--hard drives are wonderful and illustrate the insane progress in hardware over the last 75 years.

The catch is that they are constrained for I/O 😭

4 months ago 0 0 1 0
The power of two random choices for better storage capacity across the fleet

https://youtu.be/NXehLy7IiPM?si=bbT8qAqM80-AEze_&t=2117

🧵 4/19 The way data is added to the system is called shuffle sharding. It's totally random, but not just regular random.

Before committing to a drive, S3 actually looks at 2 random drives, then picks the least used one.

This small change has outsized impact in organizing and spreading out data.
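
A minimal simulation of that two-choice placement, to show why it matters. The drive count, object count, and the `max_load` helper are made up for illustration; this is not S3's actual placement code.

```python
import random


def max_load(num_drives: int, num_objects: int, choices: int, seed: int = 0) -> int:
    """Place objects one at a time and return the fullest drive's count.

    For each object, sample `choices` drives at random and put the
    object on whichever sampled drive currently holds the fewest.
    """
    rng = random.Random(seed)
    load = [0] * num_drives
    for _ in range(num_objects):
        candidates = [rng.randrange(num_drives) for _ in range(choices)]
        target = min(candidates, key=lambda drive: load[drive])
        load[target] += 1
    return max(load)


if __name__ == "__main__":
    print("one random drive:  ", max_load(10_000, 10_000, choices=1))
    print("best of two drives:", max_load(10_000, 10_000, choices=2))
```

With a single random pick the fullest drive tends to end up noticeably fuller than with best-of-two, even though the extra lookup is cheap.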

4 months ago 0 0 1 0
🧵 3/19 Basically S3 operates by spreading out simple GET and PUT HTTP requests across many servers and storing sharded data on insanely cheap--and slow--hard disks.

Since S3 leverages massive parallelism, customers hardly notice any lag. Some customers have data stored on over a million hard drives!

4 months ago 0 0 1 0
Netflix <3 S3

https://www.cloudzero.com/blog/aws-biggest-customers/

🧵 2/19 Amazon's Simple Storage Service (S3) came onto the scene in 2006 as a backup utility and place to keep media.

It has grown and evolved a lot in the past two decades!

Its biggest customer today, Netflix, wasn't even streaming video in 2006!

Still, S3's core concepts remain unchanged.

4 months ago 0 0 1 0
The Professor says "Let's crack in" to learning about S3!

🧵 1/19 While prepping Brussels sprouts for Thanksgiving I took the time to dive deeper into S3's architecture, and it's pretty sweet and genius.

Thread incoming.

4 months ago 0 0 1 0

No prob! Think I identified the form issue here:

github.com/overcommitte...

4 months ago 0 0 0 0
Buffer says it's up

Bluesky says it's not

Uh oh, I guess trying to post a 19-part thread via Buffer was not a good idea. Who knew?

Well, at the very least I learned a bunch about S3 the other day. Here's hoping I can retrieve it 🤞

4 months ago 1 0 1 0

Just started digging into some re:Invent videos on YouTube and it's nice to be able to learn so much about AWS's infra + design philosophies!

Looking forward to new material 👍

4 months ago 2 0 0 0
GitHub - mainmatter/svelte-lynx-integration: A POC for the svelte-lynx integration

Might be out of date, but there's this:

github.com/mainmatter/s...

4 months ago 2 0 0 0
Kids coding Mission 12

LEGO robotics!

Yesterday we made some good progress as we enter the final push before our first comp.

The kids tweaked the robot design ever so slightly and increased the consistency of one of their combo moves.

I thought they should move on, but they were just so happy to see it repeat itself for 15 minutes.

4 months ago 1 0 0 0

I think the slop part is key--it's something that underdelivers no matter the context and is probably gratuitous or erroneous to an extent.

4 months ago 2 0 1 0
Bait and switch

https://www.x402.org/x402.pdf

Reading up on x402 and I feel lied to with this 'one-pager' being 2 pages 🙅‍♂️

Guess they meant front and back.

4 months ago 0 0 0 0
In Zen, Firefox-based

In Arc, Chrome-based

@overcommitted.dev Was trying to submit some feedback on links in show pages but it appears the contact form isn't working?

Tried on multiple browsers with ad blockers off and it always 404/405s.

4 months ago 2 0 1 0