The Who, What and Why of Data Lakehouse Table Formats

Presented by

Alex Merced, Developer Advocate, Dremio

About this talk

In the rapidly evolving landscape of big data, the Data Lakehouse is heralding a new age of unified analytics, blending the best elements of data lakes and data warehouses. Central to this convergence is the need for advanced table formats that can meet the demands of scalability, performance, and data reliability. This webinar dives deep into the world of Data Lakehouse table formats, focusing on Apache Iceberg, Delta Lake, and Apache Hudi.

Who should watch this video?

Data engineers, data architects, data analysts, and other professionals interested in modernizing their data platform or seeking deeper insight into the technical details and advantages of these table formats.

Key Takeaways:

- Introduction to Data Lakehouse: Explore the genesis of the Data Lakehouse paradigm, its significance, and how it's reshaping the way organizations think about big data storage and analytics.
- Demystifying Apache Iceberg, Delta Lake, and Apache Hudi: Understand the intricacies of these popular table formats, their architectural nuances, and how they differ from traditional table structures.
- Features Spotlight: Delve into the feature sets that each format brings to the table, from ACID transactions and time-travel queries to efficient upserts and scalability features.
- The Relevance Quotient: Understand why these table formats matter in today's data-driven world. Learn about their roles in ensuring data consistency, improving query performance, and facilitating near real-time analytics on large datasets.
- Best Practices and Use Cases: Explore real-world scenarios where organizations have used these formats to transform their data analytics operations, and pick up best practices for successful implementation and optimization.

Equip yourself with the knowledge to harness the power of these table formats, ensuring a robust and efficient data infrastructure for your organization.

Dremio is the easy and open data lakehouse, providing self-service analytics with data warehouse functionality and data lake flexibility across all of your data. Dremio increases agility with a revolutionary data-as-code approach that enables Git-like data experimentation, version control, and governance.