Create Delta Table with Partition in Databricks

Requirement

In the last post, we learned how to create a Delta table from a path in Databricks. In this post, we will learn how to create a Delta table with a partition in Databricks.

Solution

Partitioning splits the data by the values of one or more columns and stores each group separately. You can learn more about partitioning here: https://bigdataprogrammers.com/partition-in-hive/.

Create Table with Partition

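To create a partitioned Delta table, use the template below. The partition column is declared in the column list along with the other columns and then referenced by name in the PARTITIONED BY clause:

 CREATE TABLE <table_name> (
      <column name> <data type>,
      <column name> <data type>,
      <partition_column name> <data type>,
      ..)
USING DELTA
PARTITIONED BY (<partition_column name>)
LOCATION '<Path of the data>';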

With the same template, let’s create a table for the below sample data:

Sample Data

empno  ename   designation  manager  hire_date   sal   deptno  location
9369   SMITH   CLERK        7902     12/17/1980  800   20      BANGALORE
9499   ALLEN   SALESMAN     7698     2/20/1981   1600  30      HYDERABAD
9521   WARD    SALESMAN     7698     2/22/1981   1250  30      PUNE
9566   TURNER  MANAGER      7839     4/2/1981    2975  20      MUMBAI
9654   MARTIN  SALESMAN     7698     9/28/1981   1250  30      CHENNAI
9369   SMITH   CLERK        7902     12/17/1980  800   20      KOLKATA

 CREATE TABLE employee_delta (
      empno INT,
      ename STRING,
      designation STRING,  -- partition column, declared with the other columns
      manager INT,
      hire_date DATE,
      sal BIGINT,
      deptno INT,
      location STRING
)
USING DELTA
PARTITIONED BY (designation)
LOCATION '/mnt/bdpdatalake/blob-storage/';

Here, we have created the table partitioned by designation. Once data is written to the table, subfolders are created under the Location path, one per designation value, with names like designation=CLERK and designation=SALESMAN.
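To make the folder layout concrete, here is a minimal sketch in plain Python (not Spark) that groups a few columns of the sample rows by designation and maps each group to the column=value folder it would typically land in. The helper name partition_folders is hypothetical and exists only for this illustration. A real Delta table also stores Parquet data files inside each folder and keeps a _delta_log folder at the table root, which the sketch ignores.

```python
from collections import defaultdict

# A few columns from the sample data: (empno, ename, designation)
ROWS = [
    (9369, "SMITH", "CLERK"),
    (9499, "ALLEN", "SALESMAN"),
    (9521, "WARD", "SALESMAN"),
    (9566, "TURNER", "MANAGER"),
    (9654, "MARTIN", "SALESMAN"),
    (9369, "SMITH", "CLERK"),
]


def partition_folders(rows, base_path):
    """Group rows by designation and map each group to a column=value folder.

    Hypothetical helper for illustration only; it imitates the folder naming,
    not how Delta actually writes files.
    """
    folders = defaultdict(list)
    for empno, ename, designation in rows:
        folder = f"{base_path.rstrip('/')}/designation={designation}"
        folders[folder].append((empno, ename))
    return dict(folders)


if __name__ == "__main__":
    layout = partition_folders(ROWS, "/mnt/bdpdatalake/blob-storage/")
    for folder, members in sorted(layout.items()):
        # Print each folder with the number of rows that belong to it
        print(folder, len(members))
```

With the six sample rows, this yields three folders, one each for CLERK, MANAGER, and SALESMAN.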

Wrapping Up

In this post, we have learned how to create a Delta table with a partition. Partitioning is useful when the table holds a large amount of data for each value of the partition column: a query that filters on that column reads only the matching folders, so it runs faster. It is also important to understand when partitioning helps and when it does not.
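The speed-up comes from skipping data that cannot match the filter. The following is a rough sketch in plain Python, with a hypothetical in-memory dictionary standing in for the partition folders; it only counts how many rows are touched with and without skipping, and is not how Spark or Delta is implemented internally. The names PARTITIONS, full_scan, and pruned_scan are assumptions made for this example.

```python
# Hypothetical in-memory stand-in for partition folders: designation -> rows
PARTITIONS = {
    "CLERK": [{"ename": "SMITH", "sal": 800}, {"ename": "SMITH", "sal": 800}],
    "SALESMAN": [
        {"ename": "ALLEN", "sal": 1600},
        {"ename": "WARD", "sal": 1250},
        {"ename": "MARTIN", "sal": 1250},
    ],
    "MANAGER": [{"ename": "TURNER", "sal": 2975}],
}


def full_scan(partitions, designation):
    """Read every row in every partition, then keep the matching ones."""
    rows_read = 0
    matches = []
    for value, rows in partitions.items():
        for row in rows:
            rows_read += 1
            if value == designation:
                matches.append(row)
    return matches, rows_read


def pruned_scan(partitions, designation):
    """Read only the partition whose value matches the filter."""
    rows = partitions.get(designation, [])
    return list(rows), len(rows)


if __name__ == "__main__":
    # Both calls return the same matches; only the number of rows read differs
    print(full_scan(PARTITIONS, "MANAGER")[1])
    print(pruned_scan(PARTITIONS, "MANAGER")[1])
```

On the table itself, the equivalent situation is a query with a filter such as WHERE designation = 'MANAGER'. With only six rows the difference is negligible; the benefit shows up when each partition holds a large volume of data.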
