It takes 10 times as much energy to kill good ideas as it does to create them.
Posts by Craig
For the past five years we've focused on production-ready Postgres, which sets us apart from most other database providers. Nothing about that focus changes today; the entire team and I are as laser-focused as ever, but now with an expanded mission of a unified data platform.
Increasingly a first class Postgres experience isn't complete without a seamless analytics experience.
Five years ago I joined @crunchydata.com, shortly after I wrote about having unfinished business with Postgres. Today, as part of Snowflake, that journey continues. We've built some amazing things, but we're just getting started.
www.crunchydata.com/blog/crunchy...
We've got food duty at the first kids' travel tournament of the year, and I was thinking about trying to pull off good-quality pour over coffee. Anyone have good approaches when it's unclear whether a power source for the kettle will exist?
The design of everything from reception to HR to engineering mattered, appreciating and prioritizing design at that level definitely set it apart at a time where that wasn't common.
Way back at Heroku, when we were having a company onsite... we took the entire company to the SF MOMA for a Dieter Rams exhibit.
It wasn't just the designers focused on design it was everyone.
Production ready Postgres.
That's it.
👋 if you want to come over to near Berkeley
Respect.
The details matter. In this case, completely revamping our newsletter signup screen ahead of an upcoming conference.
I mean why wouldn't you execute SQL to sign up for a database newsletter?
New release of pg_parquet including:
* Amazon S3
* Azure Blob Storage
* Google Cloud Storage
* http(s) stores
* local files
Still the easiest way to simplify some of your ETL by letting Postgres and Parquet play well together.
www.crunchydata.com/blog/announc...
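For anyone who hasn't tried it, the core workflow is just COPY. A minimal sketch, assuming pg_parquet is installed and your S3 credentials are already configured (the table and bucket names here are made up):

```sql
-- enable the extension (one time)
CREATE EXTENSION IF NOT EXISTS pg_parquet;

-- export a table to Parquet on S3
COPY orders TO 's3://my-bucket/orders.parquet' WITH (format 'parquet');

-- pull a Parquet file back into a table
COPY orders_restored FROM 's3://my-bucket/orders.parquet' WITH (format 'parquet');
```

Swap the `s3://` URL for an `az://`, `gs://`, `https://`, or local path for the other stores.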
Iceberg has gone from being the thing Netflix (and then Apple) built for their own enormous data lakes to a collaborative open standard where even the competition is learning to co-operate and align: I dug into the Iceberg summit and asked @craigkerstiens.com and others why it's so useful
Happy revenge of the sith day to all who celebrate.
Updating my nulls today with:
\pset null 🤖
Can't decide if it's for R2-D2, C-3PO, or BB-8, but closest I can get for May the 4th. May keep it for a few days...
A few weeks in, but still every time I demo this it feels absolutely magical. Finally convergence of transactional and analytical data. www.crunchydata.com/blog/logical...
Now live on the Crunchy Bridge dashboard: the ability to seamlessly sync data from your operational database over to your data warehouse for analytics - docs.crunchybridge.com/changelog#da...
If I had known we didn't have to fully manage the Iceberg for folks we might have shipped so much sooner 😂
It sort of came up in an office hours session yesterday: you can "sort of" do it with PyIceberg, but only sort of; it's not really an "easy" button for it. Still, I don't think that's the primary reason people aren't.
My guess would be people are just using Iceberg connectors, see things seem to work, and compaction is an afterthought. That's why IMO good Iceberg support is not a connector feature but an engine feature. Like what Crunchy is doing with their DWH, or, for Kafka, Confluent with Tableflow.
Nope, didn't really get into it in that much detail; lots of quick hallway conversations.
A shocking takeaway for me from Iceberg Summit a few weeks ago, and similarly at Data Council today: so many are using Iceberg, yet so few are doing compaction on their data lake files.
To me it seemed a requirement for any production Iceberg usage; otherwise you wake up shocked in a few months.
Yesterday I met someone at Data Council who was very familiar with the @crunchydata.com team and described us as basically being team Avengers for Postgres/databases.
That's a new one, but will totally take it.
With yesterday's launch of logical replication for our Data Warehouse, you can still use CDC for your other data pipelines, but getting your primary Postgres data from operational -> analytical is solved for you, without buying yet another tool.
In talking with customers that were using CDC tools to get data from Postgres into a data warehouse, 80% of their spend on ETL was the Postgres data movement.
Further, many of these tools often caused production outages.
We knew as soon as we launched Crunchy Data Warehouse we had to solve that.
Sitting in Ryan Blue’s talk at Data Council and about to leave from Q&A to head to office hours and question comes up about CDC from databases to Iceberg… @marcoslot.com makes a hard U-turn to tune in
@andypavlo.bsky.social on HN sums it up well
OLAP vs. OLTP isn't right vs. wrong, they're designed for different things.
Have a version of this on a slide when explaining Crunchy Data Warehouse and how it's for very different purposes than stock Postgres.
And there it is: Native logical replication from any Postgres server to Iceberg managed by Crunchy Data Warehouse.
Speed up Postgres analytical queries 100x with 2 commands.
Today we're announcing the availability of logical replication from Postgres to Iceberg with Crunchy Data Warehouse.
Now you can seamlessly move data and stream changes from your operational database into an analytical system.
www.crunchydata.com/blog/logical...
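Since it builds on standard Postgres logical replication primitives, the shape of the setup is familiar. A rough sketch with hypothetical names; the Crunchy Data Warehouse side manages the Iceberg details for you, so the exact subscriber-side commands may differ from plain Postgres:

```sql
-- on the operational Postgres (publisher side)
CREATE PUBLICATION analytics_pub FOR TABLE orders, events;

-- on the subscribing side, plain Postgres would look like this
CREATE SUBSCRIPTION analytics_sub
    CONNECTION 'host=op-db.example.com dbname=app user=replicator'
    PUBLICATION analytics_pub;
```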
One of the best parts: because it builds on native Postgres logical replication, you can also leverage it for larger data sets...
- Ingest in Postgres with partitioning
- Add/remove partitions to the replication set
- Retain all your data in Iceberg
- Smaller recent set in Postgres
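In standard Postgres terms, that pattern looks roughly like this (partition names are hypothetical; the Iceberg retention side is handled by the warehouse):

```sql
-- publish only the partitions you want replicated
CREATE PUBLICATION analytics_pub FOR TABLE events_2025_04, events_2025_05;

-- as new partitions are created, add them to the replication set
ALTER PUBLICATION analytics_pub ADD TABLE events_2025_06;

-- once an old partition is fully replicated out, drop it from the
-- publication and from Postgres to keep the recent set small;
-- the historical data is retained in Iceberg
ALTER PUBLICATION analytics_pub DROP TABLE events_2025_04;
DROP TABLE events_2025_04;
```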