2020-08-27 | Watch the video | This folder contains the notebooks used in this tutorial.
Delta Lake’s transaction log brings high reliability, performance, and ACID-compliant transactions to data lakes. But how exactly does it accomplish this? Working through concrete examples, we will take a close look at how Delta manages its transaction logs and uses them to supercharge data lakes.
This tutorial notebook was developed using open source Delta in an open source environment.
This tech talk covers:
- Enabling and configuring OSS Delta Lake
- Creating Delta Lake tables
- Using history() to view metadata and table versioning
- How Delta manages the log files
- What goes into the transaction logs for various DML operations
- How Delta constructs snapshots of data
- The small file problem and how to mitigate it
- How to construct time travel queries
- Configuring how long Delta tables retain deleted files and log entries
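
To make the snapshot and time travel topics above more concrete, here is a minimal sketch of the idea in plain Python. It is not Delta's implementation. It assumes a toy log in which each commit is a newline-delimited JSON file named by a zero-padded version number under `_delta_log`, and it looks only at the `add` and `remove` actions. Real commits carry more fields and more action types, and real readers also use checkpoint files, all of which this sketch ignores. The helper names `write_commit` and `snapshot_files` and the file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_commit(log_dir: Path, version: int, actions: list) -> None:
    """Write one toy commit: one JSON action per line, file named by version."""
    path = log_dir / f"{version:020d}.json"
    path.write_text("\n".join(json.dumps(a) for a in actions) + "\n")


def snapshot_files(log_dir: Path, as_of_version=None) -> set:
    """Replay commits in version order and return the live data file paths.

    Passing as_of_version stops the replay early, which is the essence of a
    time travel read in this simplified model.
    """
    live = set()
    # Zero-padded names sort lexicographically in version order.
    for commit in sorted(log_dir.glob("*.json")):
        version = int(commit.stem)
        if as_of_version is not None and version > as_of_version:
            break
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
            # Other actions (commitInfo, metaData, protocol) do not change
            # the set of live files, so this sketch skips them.
    return live


# Build a toy table log in a temporary directory (hypothetical file names).
table_dir = Path(tempfile.mkdtemp())
log_dir = table_dir / "_delta_log"
log_dir.mkdir()

# Version 0: initial write of two files.
write_commit(log_dir, 0, [
    {"commitInfo": {"operation": "WRITE"}},
    {"add": {"path": "part-0000.parquet"}},
    {"add": {"path": "part-0001.parquet"}},
])
# Version 1: a delete that rewrites one file.
write_commit(log_dir, 1, [
    {"commitInfo": {"operation": "DELETE"}},
    {"remove": {"path": "part-0001.parquet"}},
    {"add": {"path": "part-0002.parquet"}},
])
# Version 2: compaction replaces two small files with one larger file.
write_commit(log_dir, 2, [
    {"commitInfo": {"operation": "OPTIMIZE"}},
    {"remove": {"path": "part-0000.parquet"}},
    {"remove": {"path": "part-0002.parquet"}},
    {"add": {"path": "part-0003.parquet"}},
])

latest = snapshot_files(log_dir)
version_1 = snapshot_files(log_dir, as_of_version=1)
print(sorted(latest))     # ['part-0003.parquet']
print(sorted(version_1))  # ['part-0000.parquet', 'part-0002.parquet']
```

In this model the removed files stay on disk after version 2, which is why the version 1 read still works. It also suggests why the retention settings for deleted files and log entries limit how far back a time travel query can reach.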