Introducing Bounce: a new tool that uses Bridgy Fed to migrate your social graph between ActivityPub and ATProto.
Yup, we built a service that moves your social graph *across* protocols.
And the best part? You get to keep *all* of your followers, along with many of your follows.
Posts by Diana Enache (Darie)
It's time to reclaim social media.
Billionaires & venture capital shouldn't control our digital lives.
#FreeOurFeeds is raising $4M to build a public-interest alternative.
Chip in today to make it happen freeourfeeds.com
2024 was a big year for atproto!
We've got an update looking back at the progress made, plus some protocol work in progress. We'll have a 2025 roadmap to share soon.
Came to accept this is just natural, but never not thinking about Hayao Miyazaki's creative process.
What does it mean for something to be Turing complete?
I answer this question, and more, through a series of fully interactive Turing machine simulations! Play, pause, step forwards and backwards, and even write your own Turing machine programs in my latest blog post.
samwho.dev/turing-machi...
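The linked post has interactive simulations; as a rough illustration of the underlying idea, here's a minimal Turing machine simulator of my own (a sketch, not code from the post). The transition table maps (state, symbol) to (new symbol, head move, new state):

```python
# Minimal Turing machine simulator (illustrative sketch).

def run_tm(transitions, tape, state="start", blank="_", max_steps=1000):
    cells = dict(enumerate(tape))  # sparse tape, indexed by head position
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        new_symbol, move, state = transitions[(state, symbol)]
        cells[head] = new_symbol
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Example program: invert every bit of a binary string, then halt on blank.
invert = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}
```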
(10/10) Next step for us: try the lambda approach instead, spend more time going from poc to mvp, and prove the solution is feasible under prod load.
(9/10) So you made the wrong decision. What next? Do you live with the consequences and a not-so-perfect solution, or do you start from scratch and take another approach?
(8/10) According to the docs, the shard iterator should be sequential. In practice, we never managed to prove this, as we never got back an iterator we'd seen before. Testing with production-scale data could reveal architectural flaws that aren't visible in small-scale pocs.
(7/10) In practice, it seems open DynamoDB Streams can spawn thousands of empty shards for no apparent reason, making what should be simple stream processing surprisingly complex. We ran this on a prod table with a fairly moderate load and realised we ended up processing thousands of empty shards.
(6/10) There are multiple approaches to reading from streams. The preferred ones are via #lambdas or the #kinesis adapter. For various reasons these wouldn't work for us, which left us with the manual approach of polling the streams ourselves. According to the docs this should have been pretty easy to do.
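For illustration, the manual polling flow mentioned in (6/10) looks roughly like this. It's a simplified sketch against the DynamoDB Streams API shape (DescribeStream → GetShardIterator → GetRecords); `client` is assumed to be any boto3-compatible Streams client, and real code would also have to handle parent/child shard ordering, pagination, and the empty-shard problem from (7/10):

```python
# Simplified sketch of manually draining a DynamoDB stream.
# `client` is assumed to expose the DynamoDB Streams API shape
# (describe_stream, get_shard_iterator, get_records).

def drain_stream(client, stream_arn):
    records = []
    desc = client.describe_stream(StreamArn=stream_arn)
    for shard in desc["StreamDescription"]["Shards"]:
        it = client.get_shard_iterator(
            StreamArn=stream_arn,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",  # start from the oldest record
        )["ShardIterator"]
        while it:
            resp = client.get_records(ShardIterator=it, Limit=100)
            records.extend(resp["Records"])
            it = resp.get("NextShardIterator")  # None once the shard is closed
    return records
```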
(5/10) Any #dynamodb table with streams enabled can have at most one active DynamoDB stream at any point in time. A #stream can have multiple shards. A shard can contain zero or more records. A record is the change that we want captured.
(4/10) #DynamoDB Streams captures a time-ordered sequence of item-level modifications in any DynamoDB table and stores this information in a log for up to 24 hours. Applications can access this log and view the data items as they appeared before and after they were modified, in near-real time.
(3/10) The above is not ideal, too many extra resources and extra complexity. Intriguing alternative would be a CDC approach. Which is what we chose and relied on #DynamoDBStreams for this.
(2/10) So this is more on lessons learnt when pocs and balancing trade offs aren't enough. Our only requirement: almost real time audit logs on all dynamodb tables. Possible approach - record any db change via events and the transactional outbox pattern.
(1/10) Some lessons learnt from a project where I tried to build an audit logging solution. #SystemDesign often forces trade-offs between reinventing the wheel (risky but adaptable) and using existing solutions (reliable but potentially inflexible).
blog.theengineeringcompass.com/p/building-a...
After sending countless throwaway accounts to die on my PDS, this is finally good enough to ship: www.npmjs.com/package/blue...
It's a CLI wizard for migrating your Bluesky account from one PDS to another. If you feed it the right credentials and have a stable connection, it should Just Work™.
I need more convos like this.
The good news and the key difference is in the architecture. The infrastructure is built to allow users to run their own services, migrate their data, and participate in the network without relying on Bluesky's systems. The foundation is built for users and developers to break free when ready. 6/6
While Bluesky's design allows for decentralisation, today it runs more like Twitter: most users are on Bluesky's servers. Running a full node requires 16TB of fast storage - showing the practical challenges of true decentralisation. 5/6
Your Bluesky identity combines an easy-to-read username with a permanent cryptographic ID. Think of it like having both a memorable email address and a secure passport number - you can change the first while keeping the second. 4/6
The platform allows developers to create their own interfaces and algorithms since data is separated from presentation. This means you could see the same content through different apps, each with its own unique take. 3/6
atproto uses a shared data pool instead of server-to-server messaging. This is different from how email or Mastodon work, but allows for better data portability and custom interfaces while maintaining network-wide consistency. 2/6
Bluesky uses a protocol called the AT Protocol (Authenticated Transfer Protocol) aka atproto, which works similar to email in some ways. Just like how email allows you to use Gmail but still message someone with a Yahoo account, Bluesky lets you choose different providers to host your account. 1/6
Bluesky's journey toward decentralised social media faces real-world scaling challenges. While their innovative protocol promises user freedom and data portability, most of their 20M+ users still rely on Bluesky's infrastructure.
Some takeaways: blog.theengineeringcompass.com/p/decentrali...
Hui and Thng proposed an optimisation to CQ which they called FELT (Far Future Event Leaf Tree). And Kah Leong Tan and Li-Jin Thng proposed the Snoopy CQ. Both data structures are worth diving into. More info and links to the specific papers can be found in the article linked above.
The two-level algorithm achieves O(√n) worst-case complexity for insert and search. But this can be improved.
In 1988, Randy Brown introduced Calendar Queues (CQ), which is organized like a calendar, divided into "buckets" (like days or time slots). Each bucket represents a fixed-width time interval.
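A toy sketch of the calendar-queue idea, my own simplification for illustration: events hash into fixed-width "day" buckets, and dequeue scans forward from the current day like flipping calendar pages. (A real CQ also resizes buckets and handles events more than one "year" ahead; this sketch assumes all pending events fit within one pass of the buckets.)

```python
# Toy calendar queue: no resizing, single "year" of buckets.

class CalendarQueue:
    def __init__(self, bucket_width=1.0, num_buckets=8):
        self.width = bucket_width
        self.buckets = [[] for _ in range(num_buckets)]
        self.current = 0  # bucket index of the current "day"

    def insert(self, time, event):
        # Map the event time to its day bucket, wrapping around the year.
        idx = int(time / self.width) % len(self.buckets)
        self.buckets[idx].append((time, event))

    def pop_min(self):
        # Flip forward through the calendar until a day has events.
        for step in range(len(self.buckets)):
            idx = (self.current + step) % len(self.buckets)
            if self.buckets[idx]:
                self.current = idx
                item = min(self.buckets[idx])
                self.buckets[idx].remove(item)
                return item
        return None
```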
The Two-Level (TL) algorithm proposed by Franta and Maly consists of 3 layers:
- the index list layer - divides the range of event times into equal-sized intervals.
- the secondary key layer - points to sublists of event notices
- the event notices layer - holds the event times
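The three layers above can be sketched roughly like this (my own simplified Python, not Franta and Maly's original): an index over equal-width time intervals, each pointing to a sorted sublist of event notices. With about √n intervals of about √n notices each, you get the O(√n) behaviour mentioned earlier in the thread.

```python
import bisect

# Rough sketch of a Franta-Maly-style two-level event list.

class TwoLevelList:
    def __init__(self, t_max, intervals):
        self.width = t_max / intervals          # equal-sized intervals
        self.sublists = [[] for _ in range(intervals)]

    def insert(self, time, event):
        # Index layer: constant-time jump to the right interval,
        # then an ordered insert into that interval's sublist.
        idx = min(int(time / self.width), len(self.sublists) - 1)
        bisect.insort(self.sublists[idx], (time, event))

    def pop_min(self):
        # Scan intervals in order; the first non-empty one holds
        # the earliest event notice at its front.
        for sub in self.sublists:
            if sub:
                return sub.pop(0)
        return None
```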
In Unix System V, cron calculates the next execution time for each task and adds them to the Franta-Maly event list along with those times. It then examines the first task in the queue, sleeps until its execution time, then wakes and executes the task. [2/*]
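The sleep-until-due loop described in [2/*] can be sketched like this (an illustrative simplification, not the System V source; `now` and `sleep` are injected so the loop is deterministic to test):

```python
import heapq
import itertools

# Sketch of a cron-style main loop: pop the soonest task, sleep until
# it is due, run it, and reschedule it at its next interval.

def run_cron(jobs, now, sleep, until):
    """jobs: iterable of (next_time, interval_seconds, fn)."""
    counter = itertools.count()  # tie-breaker so fns are never compared
    queue = [(t, next(counter), iv, fn) for t, iv, fn in jobs]
    heapq.heapify(queue)
    while queue and queue[0][0] <= until:
        t, _, iv, fn = heapq.heappop(queue)
        if t > now():
            sleep(t - now())     # sleep until the task is due
        fn()                     # wake and execute the task
        heapq.heappush(queue, (t + iv, next(counter), iv, fn))
```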
The original cron was designed by Ken Thompson in Unix V7 (~1979) and later rewritten by Dennis Ritchie. It read the crontab files, used an in-memory table for scheduling, and had one main process that woke up every minute, checked whether any commands needed to run, and if so, ran them as root. [1/*]