Knock knock!
Posts by Julien Le Dem
Thank you for letting me keynote day 2 of the Iceberg Summit. I hope you all enjoyed it!
I don’t really remember but it’s quite possible :)
And that’s a wrap! Thank you for a great Iceberg Summit!
@thedanicafine.bsky.social, Matt and everyone involved in the organization.
I have LLM jokes, they write themselves.
Paging @dickc
If you want to learn more about the Variant type in Parquet, Aihua Xu and @andrewlamb1111.bsky.social wrote a great blog post on the project blog.
parquet.apache.org/blog/2026/02...
Great inaugural post about the geospatial types on the Parquet blog.
Thank you Jia Yu, Dewey Dunnington, Kristin Cowalcijk, Feng Zhang.
More posts coming!
parquet.apache.org/blog/2026/02...
Apache Arrow is ten years old!
I can't believe the Apache board meeting where the project was established was 10 years ago!
This went fast!
arrow.apache.org/blog/2026/02...
Congratulations to @andrewlamb1111.bsky.social on becoming the latest Parquet PMC member!
Thank you, Andrew, for your leadership in the community. I look forward to our continued collaboration!
File formats!
Great references 😎
Happy new year
Bonne année!
And that applies to sparkling wine as well ;)
I always dip the fries in the tartar sauce with my fish and chips.
😒 Is it a kissing book?
I also gave a talk on this topic.
Slides: docs.google.com/presentation...
Recording here: www.youtube.com/watch?v=S_ao...
In the past few years, we’ve seen a Cambrian explosion of new columnar formats, challenging the hegemony of Parquet. Presumably, the design of yore is not going to cut it moving forward. I spent some time trying to better understand how things have actually changed.
sympathetic.ink/2025/12/11/C...
If you missed my talk "The advent of the open data lake" at AI By the Bay, the recording is now available.
ai.bythebay.io/talks/the-ad...
www.youtube.com/watch?v=xHGV...
Earthquake!
It turns out that Friday’s NYT Mini Crossword was written by my 13-year-old.
Parquet praise in the wild :) Nice chatting with you at Data Council, Hannes!
www.youtube.com/watch?v=zeon...
As compute and storage can be efficiently decoupled, a common storage layer enables a vibrant ecosystem of on-demand tools specialized to specific use cases that avoids vendor lock-in.
ai.bythebay.io/talks/the-ad...
In this talk I’ll discuss the impact of the cloud and the advent of the Open Data Lake breaking silos to form the foundation of this ecosystem.
It’s been incredible to see the adoption of key components like Parquet, Arrow, Iceberg, and DataFusion. They provide an interoperability layer that enables using data without creating silos and duplication.
The components of databases, distributed or not, have been commoditized as individual parts that anyone can compose into use-case specific engines. Define your constraints and build a query engine that solves your problem.
Over the past decade, the big data ecosystem has matured and evolved from a melting pot of competing projects into a composable ecosystem organized around a few open source standards.
Come say hi, Wednesday at 10am. I'll be speaking at the AI By the Bay Conference about "The Advent of The Open Data Lake".
ai.bythebay.io/talks/the-ad...
Circle limit. Yellow and green
Step 2
3 more to go
Circle limit green layer
Take 2 step 1