Exploring Apache Iceberg and SlateDB formats - with a repo link for additional exploration.
datapapers.substack.com/p/exploring-...
Posts by Vignesh Chandramohan
Tests are more comprehensive than spec-level validation.
Isn't a spec still more deterministic, more token-efficient, and in theory faster for coding agents to converge on? Would specs being more approachable for human reviewers, and generated by coding agents themselves, change the equation?
Related items:
arxiv.org/abs/2509.00997 - talks about agents' query patterns and how data systems should adapt.
www.malloydata.dev - another promising query language; the claim is that complex queries can be expressed in simpler form than SQL, making LLMs make fewer mistakes.
@jayaprabhakar.bsky.social Interesting read on formal verification.
Short talk on Iceberg use cases in last week's Seattle Iceberg meetup.
youtu.be/F7qpOVVnxek?...
It has three increasingly verbose levels of description. They are probably trying to optimize the initial set of searches. And with markdown-based skills, custom workflows are approachable for a broader audience compared to MCP, and probably safer too.
It is only available in Claude Desktop.
If you find yourself in SF next week, @almog.xyz is talking about SlateDB at the SF Systems Meetup on Wednesday!
Tools, prompts, sampling - all of these seem to be a result of generalizing how Claude Code / the research feature was built over time, and extracting the ask out of those patterns. Uniformity is the biggest, and probably only, benefit.
And maybe the goal is not solutions that spend fewer tokens?
Intend to follow this along. Done with chapter 1, looking forward to the next one!
2/ This mindset improves productivity and outcomes significantly over time.
1/ Nice read:
medium.com/@xiafan/time...
Software engineer growth in the AI era:
* Build composable tools.
* Assume non-deterministic outcomes from a group of AI agents.
* Understand how LLMs work at the next level of depth, the way you would learn to read a query plan.
Just added sqlync.com to SlateDB's adopters list! They're building a streaming system that speaks MQTT or PostgreSQL across millions of connected users and devices. 🤯
3/ And as an extension, how it handles maintenance operations such as vacuum on Iceberg tables, done out of band.
2/ I assume the Iceberg writes use the Iceberg open source libraries. That would ensure the write path continues to evolve with Iceberg advancements.
I don't yet know if this handles compacted topics (which would introduce deletes on Iceberg).
1/ Leveraging the Remote Storage Manager and storing Kafka segments as Parquet files + Iceberg metadata is really good. It avoids having to consume, serialize, and manage a separate process.
I wonder if Confluent's TableFlow, launched about a year back, has a similar design. www.confluent.io/blog/introdu...
Love the idea. Could some of these eventually become subprojects hosted in the SlateDB organization as separate repos? Starting projects with that potential as GitHub issues with a specific tag would make them easy to track.
Insane amount of SlateDB work going on:
- snapshot reads
- split/merge DBs (zero copy)
- deterministic simulation testing
And someone just pushed Python bindings in a PR! 🤯
My Data council talk on SlateDB.
youtu.be/gcTRXZeKbNg?...
Got it. So, if I wanted a view to update incrementally, say once an hour, would I create an "hourly view" that uses now() and join against it?
Clock tick as an input is indeed a way to model it! Would the clock tick table be joined in all views that need this property?
Finally got to read this.
One additional aspect of IVM is reasoning about the data in the computed view. For a lot of use cases, it is easy to think of a view/table as moving in predictable increments (day, hour, 15 minutes, etc.). This notion is not modeled as a first-class concept in many systems.
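A minimal Python sketch of the clock-tick idea discussed above (all names here are hypothetical, not any specific IVM engine's API): the tick stream is the only "changing" input, so the view advances in predictable hourly increments and only the windows whose tick fired since the last run are recomputed.

```python
import datetime as dt

def hourly_ticks(start, end):
    """Yield hourly tick timestamps in [start, end)."""
    t = start
    while t < end:
        yield t
        t += dt.timedelta(hours=1)

def incremental_view(events, last_tick, now):
    """Recompute only the hourly windows whose tick fired since last_tick.

    events: iterable of event timestamps.
    Returns {window_start: event_count} for each newly closed window.
    """
    results = {}
    for tick in hourly_ticks(last_tick, now):
        window_end = tick + dt.timedelta(hours=1)
        # Aggregate events falling in [tick, window_end).
        results[tick] = sum(1 for e in events if tick <= e < window_end)
    return results
```

The design point: because the tick table is explicit, every view that needs "advance once an hour" semantics joins (here: iterates) against the same tick stream, instead of each view calling now() ad hoc.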
SlateDB 0.6.0 is out!
github.com/slatedb/slat...
Highlights include a hybrid cache (using Foyer), a lot of internal cleanup, and more groundwork for transactions.
Oh, and put performance jumped ~80% for write-heavy workloads :)
slatedb.io/performance/...
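For readers unfamiliar with the term, here is a toy sketch of what a hybrid (memory + disk) cache does in general; this is not Foyer's or SlateDB's actual implementation or API, and all names are made up. Hot entries stay in memory; evicted entries spill to a disk tier and get promoted back on access.

```python
import os
import tempfile

class HybridCache:
    """Toy two-tier cache: a bounded in-memory dict backed by a disk directory."""

    def __init__(self, mem_capacity, disk_dir):
        self.mem_capacity = mem_capacity
        self.mem = {}  # hot tier; dict preserves insertion order
        self.disk_dir = disk_dir

    def _disk_path(self, key):
        return os.path.join(self.disk_dir, key)

    def put(self, key, value):
        self.mem[key] = value
        if len(self.mem) > self.mem_capacity:
            # Evict the oldest in-memory entry to the disk tier
            # instead of dropping it outright.
            old_key = next(iter(self.mem))
            old_val = self.mem.pop(old_key)
            with open(self._disk_path(old_key), "wb") as f:
                f.write(old_val)

    def get(self, key):
        if key in self.mem:
            return self.mem[key]
        path = self._disk_path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                value = f.read()
            self.put(key, value)  # promote back to the hot tier
            return value
        return None
```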
Today marks SlateDB’s one year anniversary! It’s been a lot of fun. Thanks to @rohanpd.bsky.social @flaneur2024.bsky.social @almog.ai @vigneshc.bsky.social @paulbutler.org Jason Gustafson, David Moravek, and many others for joining the project. 😀
Commonhaus is 1! 🎂
14 projects, solid foundations, and more on the way.
If you believe in light governance, shared care, and thoughtful support for open source, come see what we’re building.
www.commonhaus.org/activity/253...
Yo SF Bay Area #databs crew, want to talk lakehouses at a real Lake House? :)
Next week after Data Council, join the founders of @clickhouse.com, @motherduck.com, @startreedata.bsky.social, and @tobikodata.com to talk real-time databases and next-generation ETL.
www.rilldata.com/events/data-...
SlateDB 0.5.0 is out!
Features:
- Checkpoints
- Clones
- Read-only client
- Split/merge database foundation
- TTL filtering on reads
- Last version with breaking byte format changes
By the numbers:
- 62 commits
- 2 new contributors
- 10 total contributors
github.com/slatedb/slat...
The DEBS conference hosts a grand challenge every year. This year's challenge is detecting outliers in a stream of images from laser powder bed fusion.
The challenge involves submitting a Kubernetes app (constraint: 2 cores, 8 GB). Interesting to try if you have the time!
2025.debs.org/call-for-gra...
Great episode!
Towards the end @vanlightly.bsky.social mentions alloytools.org finding a data model bug.
Never thought of the intersection between data modeling and formal verification. Do you have more details on this?
Python Folks - which data/workflow engine has the best developer experience for packaging code? We have looked into - Modal, Beam, Airflow, Flyte, AWS Lambda, Prefect, Dagster and Spark. Haven’t seen any approach which is fast, reliable and intuitive.