Create Delta Table From Dataframe Without Schema At External Location

Requirement

In this post, we are going to learn how to create a Delta table from a dataframe at an external path in Databricks. This scenario comes up whenever we consume data from a file, a source database table, and so on: in the end the data sits in a dataframe, and we can store it in a Delta table created at an external path.
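
For example, if the source were a CSV file on mounted storage, the dataframe could come from a simple read like the sketch below. The file path and the header option here are only illustrative assumptions, not part of this post's dataset:

%scala
// Illustrative only: read a CSV file from a hypothetical mounted path into a dataframe.
val sourceDf = spark.read
  .option("header", "true")
  .csv("/mnt/blob-storage/source/customers.csv")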

Solution

Let’s create a dataframe with some dummy data.

%scala
// Dummy data: some of the address and contact values are intentionally left null.
val df = spark.createDataFrame(Seq(
  ("1100", "Person1", "Street1#Location1#City1", null),
  ("1200", "Person2", "Street2#Location2#City2", "Contact2"),
  ("1300", "Person3", "Street3#Location3#City3", null),
  ("1400", "Person4", null, "Contact4"),
  ("1500", "Person5", "Street5#Location5#City5", null)
)).toDF("id", "name", "address", "contact")
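
Since we never define a schema anywhere, it is worth checking what Spark infers from these tuples; printing the schema should show four nullable string columns:

%scala
df.printSchema()
// Expected output:
// root
//  |-- id: string (nullable = true)
//  |-- name: string (nullable = true)
//  |-- address: string (nullable = true)
//  |-- contact: string (nullable = true)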

We have also created a database named testdb. At this point, it does not contain any tables.
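
If the database does not exist yet in your workspace, it can be created with a one-line SQL cell like this (a minimal sketch; the database name testdb matches the one used in the rest of this post):

%sql
CREATE DATABASE IF NOT EXISTS testdb;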

Create Delta Table from Dataframe using External Location

%scala
// Write the dataframe in Delta format to the external path on the mounted blob storage.
df.write
  .format("delta")
  .mode("overwrite")
  .save("/mnt/blob-storage/testDeltaTable2")

Here, we are writing the available dataframe named df to the external path /mnt/blob-storage/testDeltaTable2. We are creating the data in DELTA format using the format option in the command, and we never define a schema; it is inferred from the dataframe itself.

Create delta table on the above external location

With the data already written in Delta format at the external path, we only need to register a table named testDeltaTable2 under the database testdb that points to that location:

%sql
CREATE TABLE testdb.testDeltaTable2
USING DELTA
LOCATION '/mnt/blob-storage/testDeltaTable2'

Now, check the database either from a query or using the Data option to verify the delta table.
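
For instance, a quick query against the new table should return the five rows we put into the dataframe (assuming the write and CREATE TABLE cells above ran successfully):

%sql
SELECT * FROM testdb.testDeltaTable2;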

You can also verify whether the table is a Delta table using the SHOW CREATE TABLE command below:

%sql
SHOW CREATE TABLE testdb.testDeltaTable2;
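
The exact output depends on the Databricks runtime version, but it should look roughly like this, with the columns picked up from the dataframe:

CREATE TABLE testdb.testdeltatable2 (
  id STRING,
  name STRING,
  address STRING,
  contact STRING)
USING delta
LOCATION 'dbfs:/mnt/blob-storage/testDeltaTable2'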

You will see that the schema was created automatically from the dataframe and that the table uses the DELTA format at the external location.

Wrapping Up

In this post, we have learned to create a Delta table from a dataframe at an external location without defining any table schema manually; the schema comes from the dataframe itself.

