In this post, we will learn about Delta Lake in Databricks and look at the benefits of using it.
Overview of Delta Lake
Delta Lake is an open-source storage layer that brings ACID transactions to data lakes. It was originally developed by Databricks and runs on top of the files already stored in your existing data lake.
Features
Let’s look at the features Delta Lake provides.
- Provides ACID transactions on Spark.
- Supports upsert and delete operations on the data, which makes it practical to implement Change Data Capture (CDC) and Slowly Changing Dimension (SCD) patterns.
- Provides Delta tables on top of Delta Lake for full, delta (incremental), and historical loads.
- Validates the data schema when inserting into a table.
- Data versioning: tracks historical versions of the data for auditing and enables rollback.
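To make the upsert, schema validation, and versioning ideas more concrete, here is a minimal conceptual sketch in plain Python. It is not Delta Lake code and does not use Spark: `ToyVersionedTable` and all of its methods are hypothetical names invented for this illustration, and the whole table lives in memory. It only mimics the behaviour described above, assuming every successful write produces a new numbered version.

```python
from copy import deepcopy


class ToyVersionedTable:
    """In-memory stand-in for a versioned table. This is NOT the Delta Lake API."""

    def __init__(self, schema, key):
        self.schema = dict(schema)  # column name -> expected Python type
        self.key = key              # column used to match rows on upsert
        self._versions = [{}]       # version 0 is the empty table

    def _validate(self, row):
        # Schema validation: reject rows with missing, extra, or mistyped columns.
        if set(row) != set(self.schema):
            raise ValueError(f"columns {sorted(row)} do not match the schema")
        for column, expected in self.schema.items():
            if not isinstance(row[column], expected):
                raise TypeError(f"column {column!r} expects {expected.__name__}")

    def _commit(self, snapshot):
        # Every successful write appends a new snapshot, i.e. a new version.
        self._versions.append(snapshot)
        return len(self._versions) - 1

    def upsert(self, rows):
        # Validate the whole batch first so a bad batch changes nothing.
        for row in rows:
            self._validate(row)
        snapshot = deepcopy(self._versions[-1])
        for row in rows:
            snapshot[row[self.key]] = dict(row)  # update if the key exists, else insert
        return self._commit(snapshot)

    def delete(self, key_value):
        snapshot = deepcopy(self._versions[-1])
        snapshot.pop(key_value, None)
        return self._commit(snapshot)

    def read(self, version=None):
        # Reading an older version shows the data as it was at that point.
        index = len(self._versions) - 1 if version is None else version
        snapshot = self._versions[index]
        return [dict(snapshot[k]) for k in sorted(snapshot)]

    def rollback(self, version):
        # Rollback is itself a new commit that restores an older snapshot.
        return self._commit(deepcopy(self._versions[version]))


# Hypothetical sample data, used only to exercise the toy table.
table = ToyVersionedTable({"id": int, "city": str}, key="id")
v1 = table.upsert([{"id": 1, "city": "Pune"}, {"id": 2, "city": "Delhi"}])
v2 = table.upsert([{"id": 2, "city": "Mumbai"}, {"id": 3, "city": "Goa"}])
v3 = table.delete(1)
v4 = table.rollback(v1)
```

After these calls, `table.read(v2)` still returns the data as it looked after the second upsert, while `table.read()` returns the rolled-back state. Older versions are never overwritten, which is the property that makes auditing and rollback possible. In Delta Lake itself these ideas correspond to merge operations, schema enforcement, and reading or restoring earlier table versions. The toy model leaves out everything that makes this hard at scale, such as concurrent writers and file storage.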
Wrapping Up
In this post, we covered what Delta Lake is and the features it offers. This understanding will help with the exercises in upcoming posts.