Today we're announcing the availability of logical replication from Postgres to Iceberg with Crunchy Data Warehouse.
Now you can seamlessly move data and stream changes from your operational database into an analytical system.
www.crunchydata.com/blog/logical...
How long has it been since you checked your cache hit ratio?
Ideally most of your frequently queried data is in the buffer cache. We recommend a 98-99% cache hit ratio for transactional workloads; analytical workloads run lower.
Query the cache hit ratio with:
SELECT
  sum(heap_blks_read) as heap_read,
  sum(heap_blks_hit) as heap_hit,
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM
  pg_statio_user_tables;
For Bridge customers, cache hit data is in your cluster's Insights.
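A companion query checks the same ratio for indexes; this is a sketch following the same pattern as the heap query (the alias names are illustrative):

```sql
-- Index cache hit ratio, analogous to the heap query above.
SELECT
  sum(idx_blks_read) AS idx_read,
  sum(idx_blks_hit) AS idx_hit,
  sum(idx_blks_hit) / (sum(idx_blks_hit) + sum(idx_blks_read)) AS ratio
FROM
  pg_statio_user_indexes;
```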
Postgres + Iceberg side by side, running the same query against a Postgres heap table.
Half a millisecond for the optimized version 🔥. 10 seconds for the long method.
www.crunchydata.com/products/war...
psql tip: find your config parameters that are not default with \dconfig.
List of non-default configuration parameters:
Parameter | Value
---------------------------------+----------------------------------------------
TimeZone | America/Chicago
application_name | psql
client_encoding | UTF8
lock_timeout | 10s
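If you'd rather do this in plain SQL, a rough equivalent is to query pg_settings; this is a sketch (how defaults are tracked varies a bit by version):

```sql
-- Approximate SQL equivalent of \dconfig: list parameters whose
-- value comes from somewhere other than the built-in default.
SELECT name, current_setting(name) AS value
FROM pg_settings
WHERE source NOT IN ('default', 'override')
ORDER BY name;
```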
Postgres does a good job of keeping internal statistics about your data, which are used to plan how queries are executed. But Postgres doesn't always know how columns are related. In this deep dive we look at hacking the statistics for improved performance.
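As one illustration of the idea (with hypothetical table and column names, not taken from the post), extended statistics can teach the planner about correlated columns:

```sql
-- Hypothetical example: city and zip are strongly correlated,
-- but the planner assumes columns are independent by default.
CREATE STATISTICS addr_stats (dependencies) ON city, zip
  FROM addresses;

-- Re-analyze so the planner picks up the dependency information.
ANALYZE addresses;
```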
Miss our webinar on running Crunchy Data Warehouse on-premises? We cover an overview, a hands-on walkthrough with live querying of our data lake through full Iceberg table creation, and highlights of popular use cases. If you missed it, don't worry, we've got you covered.
www.youtube.com/watch?v=Vojg...
We all love pg_stat_statements but that data collects forever and can get a little stale. When should you reset it?
Here are some tips from our support team.
1) At the start of a new monitoring period
If you analyze performance trends daily, weekly, or monthly, resetting pg_stat_statements at the beginning of such a period helps ensure that each period starts with a clean dataset.
2) After deploying significant query changes
If you have made significant changes to an app, optimized queries, or modified indexes, resetting pg_stat_statements helps to measure the impact of those changes more accurately.
3) When benchmarking query performance
If you are running benchmarks and tests to evaluate query improvements, resetting pg_stat_statements makes sure that your results reflect only the queries executed during the benchmark periods.
It will also help accuracy of reports (for example, the outliers insight) by preventing old, outdated queries from skewing the results.
4) After major maintenance operations
Resetting pg_stat_statements after major operations (pg_repack, reindexes, altering table structures, etc) helps to measure how any changes affect performance.
The command to reset pg_stat_statements is:
SELECT pg_stat_statements_reset();
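To automate the monitoring-period reset, one option (assuming the pg_cron extension is installed, which is not mentioned in the post) is to schedule it, for example monthly:

```sql
-- Hypothetical job name; runs at midnight on the 1st of each month.
SELECT cron.schedule(
  'reset-pg-stat-statements',
  '0 0 1 * *',
  'SELECT pg_stat_statements_reset()'
);
```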
Excited to announce Crunchy Data Warehouse is now available for Kubernetes and on-premises deployments. Need faster analytics from Postgres? Want a native Postgres data lake experience? Learn more about how it works: www.crunchydata.com/blog/crunchy...
Need to move data between Postgres and Parquet? pg_parquet is an open source extension that makes this easy, with no need for complicated ETL processes -
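A minimal sketch of what that looks like, with a hypothetical table name and S3 path (pg_parquet extends COPY with a Parquet format):

```sql
-- Export a table to Parquet (local file or object storage).
COPY orders TO 's3://my-bucket/orders.parquet' WITH (format 'parquet');

-- Import it back into a table with a matching schema.
COPY orders_copy FROM 's3://my-bucket/orders.parquet' WITH (format 'parquet');
```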
We talk with users every week about whether Citus is a good fit for them. While Citus is a very powerful Postgres extension, it fits very specific use cases. Here we break down the cases where Citus is a fit as well as when it's not -
When it comes to developer tools, everyone has their favorite tips and tricks. Because we love Postgres, we worked to capture many of our favorites in this collection -
Excited to announce built-in maintenance for Iceberg via Postgres.
Now within Crunchy Data Warehouse we will automatically vacuum and continuously optimize your Iceberg data by compacting and cleaning up files.
Dig into the details of how this works: www.crunchydata.com/blog/automat...
Citus is a Postgres extension that turns it into a sharded, distributed, horizontally scalable database. With all these buzzwords, it attracts a lot of people thinking it can solve all their problems. We dig into when it is a good fit and when it isn't - www.crunchydata.com/blog/citus-t...
Happy pi day! 🥧
Postgres has a pi function: pi();
This can be used to calculate circular sizes and areas.
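For instance, with a hypothetical radius of 5:

```sql
-- Area of a circle with radius 5, using Postgres' built-in pi()
-- and the ^ exponentiation operator.
SELECT pi() * 5 ^ 2 AS circle_area;
```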
@pwramsey.bsky.social looks at pi in PostGIS today, with a blog post on circular forms and a proof for the CIRCLELINESTRING shape.
www.crunchydata.com/blog/postgis...
SQL output can be messy. psql has options for formatting output, and a handy one is
\pset border 2
This adds top and bottom borders with double lines (╔, ╚, ╤, ╧, etc.), a header row set off by a double-line separator, rows separated by single lines, and clearly separated columns.
Congratulations to the Postgres community on PostgreSQL once again being named the DBMS of the year in 2024, for the second year in a row. db-engines.com/en/blog_post...
Great to see this ability for "creating processing pipelines for append-only streams of data...We believe it is a foundational building block for building IoT applications on PostgreSQL that should be available to everyone, similar to pg_cron, pg_parquet, and pg_partman."
There are many incremental processing solutions, but they seem to never quite do what I need.
I decided to build an extension that just keeps running the same command in Postgres with different parameters to do fast, reliable incremental data processing.
That's pg_incremental.
1/n
Importing files: For cases where data is loaded into remote repositories such as S3, functions can look for new files and pg_incremental can load them.
Exporting files: For folks archiving or exporting data, such as individual partitions, pg_incremental can batch data into files and send them to a long-term archive.
Interval pipelines: Similar to the rollups and aggregates, lots of folks create interval-range data summaries. Whether by day, week, or month, pg_incremental can build your summary tables.
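An interval pipeline looks roughly like this; the table names are hypothetical and the exact function signature should be checked against the pg_incremental documentation:

```sql
-- Sketch of a daily rollup: pg_incremental repeatedly runs the command,
-- substituting $1/$2 with the start and end of each new time range.
SELECT incremental.create_time_interval_pipeline(
  'daily-event-rollup',
  '1 day',
  $$
    INSERT INTO events_daily (day, event_count)
    SELECT date_trunc('day', created_at), count(*)
    FROM events
    WHERE created_at >= $1 AND created_at < $2
    GROUP BY 1
  $$
);
```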