Some excellent deep-dive blogs from Kakao on their use of Flink CDC and Iceberg:
tech.kakao.com/posts...
tech.kakao.com/posts...
tech.kakao.com/posts...
Posts by Elad Leev
I'm excited to be speaking at SREday London 2025!
Join me on March 27-28 to hear how we built a foundational, cloud-native, Streaming Platform using the operator pattern (and some magic ✨)
Register here:
sreday.com/2025-london-...
Don't forget to use the code 𝗟𝗗𝗡𝟭𝟬 for 10% off!
Curious to know which connectors are used the most, and how many of them are in-house/forks?
Other frameworks: Apache Flink is too heavy and it requires a cluster
Apache Flink: let's process 100MB/s of data in Kafka with only 100 MB of memory heap and only one single JVM instance 😎
Great (and useful) talk! www.youtube.com/watch?v=04pw...
Slightly different from my usual posts, but I wrote a thing!
This time, on how to transform Slack from a distraction into a productivity powerhouse for engineering teams with proven strategies for channel organisation, automation, and data-driven insights -
leevs.dev/effective-co...
A 5-10% performance improvement with every new Java stable release is a dopamine hit... for CFOs. If you work with the JVM ecosystem, you should upgrade often. It's easy money
What’s your take on this implementation? I have a love-hate relationship with it. Totally get why it was built that way (consumer offsets, Schema Registry topic, kStreams state, etc etc), but does it stand the test of time?
This is really useful: Awesome Data Catalogs and Observability Platforms: github.com/opendatadisc...
(h/t @adipolak.bsky.social for sharing this with me)
#dataBS
Unfortunately, not really. We have a very (very) raw version of it open sourced, but it’s missing 7 months of advancement.
I do know there is an implementation which is about to get merged:
github.com/apache/flink...
Worth giving it a try!
Also:
go.bsky.app/U1hZhum
BigTech is moving to object storage too, but not cuz it's cheaper. The idea is you no longer need every system to be distributed and durable. Everything can be stateless and simple, cuz your storage is distributed and durable.
definitely the coolest feature so far on 🦋
For those who are interested in keeping track of Apache Flink's disaggregated state storage work, this is the umbrella FLIP:
cwiki.apache.org/con...
Having said that, everything is a trade-off, and although it puts more load (in terms of eng time) on the Data Platform teams, our clients are extremely happy with it. Proto tooling (code gen, gRPC, protovalidate) is just superior to Avro
We’ve made the decision to go full Proto this time (I always used Avro as the main format in past companies). The number of forks, bugs, and code changes we had to make to support it is insane. Most notably, we had to build our own Flink format to pull from the registry, as Flink currently doesn’t support it
I made an infra engineer starter pack. Folks posting about databases, stream processing, durable execution, orchestrators, service meshes, and more.
go.bsky.app/SCZe42X