Data lakes and Icebergs in Postgres with pg_lake

Fletcher

50-Minute Talk

pg_lake is a set of extensions that adds comprehensive support for creating and querying Iceberg tables in Postgres, and extends existing commands to query, import, and export files in data lakes. Underneath, it accelerates analytical queries using DuckDB, which runs in a separate process that speaks the Postgres protocol and can yield large speedups.

The Iceberg specification defines a way to store analytics-optimized tables in object stores like Amazon S3. pg_lake implements Iceberg in a way that's deeply integrated into Postgres: you can create Iceberg tables as easily as Postgres tables, run transactions that span Postgres and Iceberg tables, and use almost all Postgres features as usual.

This talk will go through the architectural decisions we made in developing the pg_lake extension, and how we navigated our way around the various extension APIs.
