Requirement
In the last post, we learned how to create a Delta table in Databricks. In this post, we will learn how to create a Delta table from a path in Databricks.
Solution
Let’s first understand why you would create a Delta table with a path. A table created this way is an external (unmanaged) table: the metastore holds only the table definition and the location, while the actual data stays at the path you specify (for example, Amazon S3 or Azure Data Lake Storage Gen2).
The advantage of using a path is that if the table gets dropped, the data is not lost, because it is still available in the storage.
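As a minimal sketch of what this means in practice, the statements below drop an external table and then register it again over the same path. The placeholders <table_name> and <Path of the data> stand for your own table and storage location, and the sketch assumes the path already contains Delta data written earlier.

```sql
-- Dropping an external table removes only the metastore entry;
-- the files at the path are left in place.
DROP TABLE <table_name>;

-- The data can still be read directly from the path.
SELECT * FROM delta.`<Path of the data>`;

-- Register the table again over the same path. No column list is given here,
-- so the schema is taken from the Delta data already stored at the location.
CREATE TABLE <table_name>
USING DELTA
LOCATION '<Path of the data>';
```

With a managed table (one created without a location), dropping the table would remove the data as well, which is the main reason to prefer a path for data you want to keep.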
Create Table from Path
To create a Delta table from a path, use the template below:
CREATE TABLE <table_name> (
  <column name> <data type>,
  <column name> <data type>,
  ..
)
USING DELTA
LOCATION '<Path of the data>';
With the same template, let’s create a table for the sample data below:
Sample Data
empno | ename | designation | manager | hire_date | sal | deptno | location |
9369 | SMITH | CLERK | 7902 | 12/17/1980 | 800 | 20 | BANGALORE |
9499 | ALLEN | SALESMAN | 7698 | 2/20/1981 | 1600 | 30 | HYDERABAD |
9521 | WARD | SALESMAN | 7698 | 2/22/1981 | 1250 | 30 | PUNE |
9566 | TURNER | MANAGER | 7839 | 4/2/1981 | 2975 | 20 | MUMBAI |
9654 | MARTIN | SALESMAN | 7698 | 9/28/1981 | 1250 | 30 | CHENNAI |
9369 | SMITH | CLERK | 7902 | 12/17/1980 | 800 | 20 | KOLKATA |
CREATE TABLE employee_delta (
  empno INT,
  ename STRING,
  designation STRING,
  manager INT,
  hire_date DATE,
  sal BIGINT,
  deptno INT,
  location STRING
)
USING DELTA
LOCATION '/mnt/bdpdatalake/blob-storage/';
Here, the location holds the actual data as Parquet files, along with a _delta_log directory that contains the table's transaction log.
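The CREATE TABLE statement only defines the table; it does not load the sample rows. As a rough sketch, assuming the location was empty when the table was created, the sample data could be inserted as shown below. The 'M/d/yyyy' pattern passed to to_date is an assumption based on how the dates appear in the sample data.

```sql
-- Assumes employee_delta was created with the statement above
-- and that the location held no data beforehand.
INSERT INTO employee_delta VALUES
  (9369, 'SMITH',  'CLERK',    7902, to_date('12/17/1980', 'M/d/yyyy'),  800, 20, 'BANGALORE'),
  (9499, 'ALLEN',  'SALESMAN', 7698, to_date('2/20/1981',  'M/d/yyyy'), 1600, 30, 'HYDERABAD'),
  (9521, 'WARD',   'SALESMAN', 7698, to_date('2/22/1981',  'M/d/yyyy'), 1250, 30, 'PUNE'),
  (9566, 'TURNER', 'MANAGER',  7839, to_date('4/2/1981',   'M/d/yyyy'), 2975, 20, 'MUMBAI'),
  (9654, 'MARTIN', 'SALESMAN', 7698, to_date('9/28/1981',  'M/d/yyyy'), 1250, 30, 'CHENNAI'),
  (9369, 'SMITH',  'CLERK',    7902, to_date('12/17/1980', 'M/d/yyyy'),  800, 20, 'KOLKATA');

-- Check the rows that were written.
SELECT * FROM employee_delta ORDER BY empno;

-- Inspect the table metadata; the output should list the table type
-- and the storage location given in the CREATE TABLE statement.
DESCRIBE EXTENDED employee_delta;
```

After the insert, the Parquet files and the _delta_log directory should appear under the location given in the CREATE TABLE statement.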
Wrapping Up
In this post, we learned how to create a Delta table from a path. We also learned the advantage of using a path, that is, of creating an external Delta table. This approach to storing data and creating tables is used in many projects.