Apache Iceberg's Best Secret: A Guide to Metadata Tables

Presented by

Szehon Ho, Software Engineer, Apple

About this talk

Apache Iceberg’s rich metadata is its secret sauce, powering core features like time travel, query optimizations, and optimistic concurrency handling. But did you know that this metadata is accessible to all, via easy-to-use system tables? This talk will walk through real-life examples of using metadata tables to get even more out of Iceberg and address questions such as:

- What is the last partition updated and when?
- Why are there too many small files?
- What Iceberg maintenance procedures can give us better query performance?
- Can we start building more advanced systems like data audit and data quality?
- How many null values are being added per hour?
- What is the latency of data ingest over time?

We will also cover metadata table performance tips and tricks, and ongoing improvements in the community. Whether you are already using Iceberg metadata tables or interested in getting started, watch this talk to learn how this under-utilized feature can help manage data tables more effectively than ever before.