I'll be at #ICWSM 2025 next week to present our paper about Bluesky Starter Packs.
For the occasion, I've created a Starter Pack with all of this year's organizers, speakers, and authors I could find on Bluesky!
Link: go.bsky.app/GDkQ3y7
Let me know if I missed anyone!
Posts by Michał Król
Are you interested in @bsky.app research? We gather and make available multiple datasets including posts, social graph snapshots, labellers, feed generators and more. More information is available on our website: bsky.leobalduf.com
Work done with: @ignactro.bsky.social @garethtyson.bsky.social @leobalduf.bsky.social @bibo7086.bsky.social @asonur.bsky.social @baronca.bsky.social
If you want more, our full paper is available here: arxiv.org/abs/2501.11605
Our data was also used (and complemented by surveys!) in this @nature.com article: nature.com/articles/d41...
🧵 11/11
However, users also spot some things to be wary of. People flag that SPs can be used as lists of accounts to block or even harass. Some users also dislike being added to starter packs without their consent, especially when this is used to legitimize malicious SPs.
🧵10/11
And what do users think about SPs? They are perceived positively: back in July, there were ~5x more positive than negative comments. The ratio grows over time, reaching ~10x by the end of 2024. Perception of SPs improves as users start to use them more.
🧵9/11
Popular accounts are more likely to be included in SPs, increasing their existing popularity even further. We observe a “rich get richer” effect, with the gap between more and less popular accounts widening over time.
🧵8/11
What macro-level impact do SPs have on the social graph? SPs provide tighter connections within existing communities rather than promoting connections across the entire system. This raises the possibility of SPs exacerbating potential echo chambers.
🧵7/11
But is it beneficial to be an SP member or creator? We assessed the impact on users' visibility. In the first week after their inclusion, members received ~39% more follows than similar users not in SPs. This figure reaches 85% after four weeks.
🧵6/11
However, the created SPs ended up being highly impactful! Over the last 6 months, they accounted for up to 40% of all daily follow operations and represent an impressive ~20% of all the follow relations in the network!
🧵 5/11
Starter packs (SPs) were introduced in June 2024. Since then, 335k SPs have been created. Is it a lot? By the end of 2024, 1% of users created at least one SP, 6% were members of at least one SP, and 4% had employed the follow-all operation on an SP.
🧵4/11
@bsky.app grew from ~2.5M users in Jan'24 to ~25M in Dec'24. An impressive 10x increase in just a year! This was driven by (1) opening to the public, (2) Twitter/X being banned in Brazil, (3) X’s change making content visible to blocked users, and (4) the 2024 US elections.
🧵3/11
But are the SPs used? Do they bring benefits to their users and the social network? And what do users think about the starter packs? We answer these questions in our recent study!
🧵2/11
@bsky.app grew tremendously over the last year. Recently, the platform introduced “starter packs” (SP) to help users bootstrap their social graph. Anybody can create these SPs. They’re lists of accounts (e.g., “people creating anime”) that others can then follow.
🧵1/11
More than 5x users and well beyond 300K starter packs since April 2024. How did Bluesky look before the stampede out of X?👇 [1/n]
📢 Networking people, I am creating a Bluesky starter pack (bsky.social/about/blog/0...) for people working in, researching, or interested in the networks area. Add your Bluesky handle if interested 👉 forms.office.com/e/utWc3P1PYr
The full paper "Guardians of the Galaxy: Content Moderation in the InterPlanetary File System" is available here: usenix.org/conference/u...
Written with @ignactro.bsky.social @asonur.bsky.social @bibo7086.bsky.social @leobalduf.bsky.social
Finally, we showed multiple ways of circumventing the moderation techniques implemented by HTTP gateways. This includes changing the CID scheme versions, using relative paths and alternative hash functions.
🧵 6/7
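To illustrate why changing the CID version can slip past a string-matching denylist, here is a minimal sketch (not the paper's method) that normalises a CIDv0 to its CIDv1 form before the lookup. It assumes the common case of a sha2-256 multihash with the dag-pb codec; the helper names (b58decode, b58encode, cidv0_to_cidv1, normalise) and the example digest are hypothetical and only serve the illustration.

```python
import base64
import hashlib

# Base58btc alphabet used by CIDv0 strings ("Qm...").
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a base58btc string into bytes."""
    number = 0
    for char in text:
        number = number * 58 + B58.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def b58encode(data: bytes) -> str:
    """Encode bytes as base58btc (only used to build the demo CID)."""
    number = int.from_bytes(data, "big")
    out = ""
    while number:
        number, rem = divmod(number, 58)
        out = B58[rem] + out
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + out


def cidv0_to_cidv1(cid0: str) -> str:
    """Re-encode a CIDv0 as a base32 CIDv1 pointing at the same content."""
    multihash = b58decode(cid0)
    # CIDv0 is always a bare sha2-256 multihash: 0x12 (sha2-256), 0x20 (32 bytes).
    if len(multihash) != 34 or multihash[:2] != b"\x12\x20":
        raise ValueError("not a CIDv0")
    # CIDv1 = version byte 0x01 + codec 0x70 (dag-pb) + multihash,
    # written as lowercase unpadded base32 with the multibase prefix "b".
    raw = b"\x01\x70" + multihash
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


def normalise(cid: str) -> str:
    """Map CIDv0 strings to CIDv1 so both spellings compare equal."""
    return cidv0_to_cidv1(cid) if cid.startswith("Qm") else cid


# Hypothetical demo digest; a real CID hashes the encoded block, not raw text.
digest = hashlib.sha256(b"example block").digest()
demo_cid0 = b58encode(b"\x12\x20" + digest)
demo_cid1 = cidv0_to_cidv1(demo_cid0)

# A denylist storing only the CIDv0 string misses the CIDv1 spelling
# unless both sides are normalised first.
denylist = {demo_cid0}
naive_hit = demo_cid1 in denylist
normalised_hit = demo_cid1 in {normalise(c) for c in denylist}
```

In this sketch, naive_hit is False while normalised_hit is True: the two strings name the same content, so comparing normalised forms closes this particular gap. It does not address the other circumvention routes mentioned above (relative paths, alternative hash functions).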
Phishing websites are delivered to victims via gateways that serve @ipfs.tech content via HTTP. We found that some of the most popular gateways are used almost exclusively to serve phishing websites.
🧵 5/7
The number of requests for problematic content is steadily increasing. The majority of the traffic is related to shadow libraries (Anna’s Archive) migrating to @ipfs.tech. However, we also observed a sharp increase in the number of phishing websites hosted on the platform.
🧵4/7
We used ML to create textual descriptions of every CID and represent them as word clouds.
For terrorist content, we observe collections of literature acting as motivation ("love"). We also spot guides on how to blend into society without raising suspicions ("home").
🧵3/7
We recovered 417,912 Content Identifiers (CIDs) that were moderated by different @ipfs.tech entities. We then downloaded, analyzed and classified 368,762 (≈ 86.60%) of them. Most of the CIDs are DMCA-related. However, we also discovered phishing, terrorist content and CSAM.
🧵 2/7
@ipfs.tech is the largest decentralized file storage system. In our recent USENIX Security paper, we answer questions about content uploaded by users:
1) What kinds of “problematic” content are uploaded to @ipfs.tech?
2) How is this content moderated?
3) How effective is the moderation?
🧵 1/7