DELTA TABLE Utility Command

Requirement

The purpose of this post is to explore all of the commands available in the Delta table of Databricks. We used some of the commands in our previous post. If you checked it, you might have seen them. Our next step will be to list all the commands and how they can be used.

Solution

There are many useful commands in the Delta table. Therefore, all those commands should be kept in a single place so that they can be used whenever needed. There are many times when we know the approach but forget the exact command to execute.

You can check DBUtils Commands in Databricks

 

USE CASE

DESCRIPTION

COMMAND

CREATE

Use to create delta table managed or external.

For managed Table

CREATE  TABLE delta_table_name (
……

) USING delta

 

For External Table

CREATE  TABLE delta_table_name (
……

) USING delta

LOCATION ‘<path>’

SELECT FROM DELTA FILE

Use to query on delta files directly

SELECT * FROM delta.`<path>`

 

DESCRIBE HISTORY

Use to get the delta table history

DESCRIBE HISTORY delta.'<path>’
OR
DESCRIBE HISTORY <delta_table>

UPDATE

Use to update the data in the delta table

UPDATE <table_name> SET <column_name> = <value> [WHERE <CONDITION>]

MERGE

Merging data into Delta Table

MERGE INTO <target_delta_table>

USING <source_table>
ON <merge_condition>
WHEN MATCHED

THEN UPDATE *

DELETE

Delete the data from the Delta table

DELETE FROM delta.'<path>’
OR
DELETE FROM <delta_table_name>

CONVERT

Use to convert any parquet table to a Delta table

For Non-Partition Table

CONVERT TO DELTA parquet.'<parquet_table_path>’

For Partition Table

CONVERT TO DELTA parquet.'<parquet_table_path>’ PARTITIONED BY (<column name> <column datatype>)

VACUUM

Use to remove older version of Delta table

VACUUM delta.'<path>’
OR

VACUUM <delta_table_name>

OPTIMIZE

Use to compact the files

OPTIMIZE <delta_table_name>

CLONE

Use to clone the data of delta table to another table or path
deep clone – Copy the data to target table/path with metadata. There is no dependency on the Source Table/path after copy

Shallow clone – It points the reference to the source table/path, hence if any data is deleted at the source will affect the shallow clone target
 

For Deep Clone

CREATE [OR REPLACE TABLE IF NOT EXISTS] delta.'<target_path>’ CLONE delta.'<source_path>’

For Shallow Clone:

CREATE [OR REPLACE TABLE IF NOT EXISTS] delta delta.'<target_path>’ SHALLOW CLONE delta.'<source_path>’

RESTORE

Use to restore delta table to a specific state using version or timestamp

RESTORE TABLE delta.'<path>’ TO VERSION AS OF <version>

RESTORE TABLE delta.'<path>’ TO TIMESTAMP AS OF <timestamp>

Note: Restore will fail if the data files are deleted manually or vacuum command.

DESCRIBE DETAIL

Use to get the details of the Delta table

DESCRIBE DETAIL delta.'<path>’
OR
DESCRIBE DETAIL <delta_table_name>

SHOW CREATE TABLE

Use to check the table creation details – schema and location

SHOW CREATE TABLE <delta_table_name>

Wrapping Up

This post contains a list of all the useful Delta table utility commands. This list can be referred to and used as per your requirements. Please note that some commands require a specific Databricks Runtime Version. Thanks for reading this list. I hope you find it helpful.

Sharing is caring!

Subscribe to our newsletter
Loading

Leave a Reply