Requirement
In this post, we are going to learn how to create a Delta table from a dataframe at an external path in Databricks. This scenario typically arises when we consume data from a file, a source database table, or another system and end up with that data in a dataframe. We can then store this data in a Delta table created at an external path.
Solution
Let’s create a dataframe with some dummy data.
%scala
val df = spark.createDataFrame(Seq(
  ("1100", "Person1", "Street1#Location1#City1", null),
  ("1200", "Person2", "Street2#Location2#City2", "Contact2"),
  ("1300", "Person3", "Street3#Location3#City3", null),
  ("1400", "Person4", null, "Contact4"),
  ("1500", "Person5", "Street5#Location5#City5", null)
)).toDF("id", "name", "address", "contact")
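Before writing the dataframe out, it can be helpful to quickly confirm its schema and contents; this check is not part of the original steps, just a sanity check:

%scala
// Inspect the inferred schema and the dummy rows
df.printSchema()
df.show(false)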
We have also created a database named testdb. As the screenshot below shows, there is currently no table under this database.
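If the testdb database does not exist yet in your workspace, it can be created with a standard SQL statement (this step is assumed, not shown in the original post):

%sql
CREATE DATABASE IF NOT EXISTS testdb;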
Create Delta Table from Dataframe using External Location
%scala
// Write the dataframe in Delta format to the external location
df.write
  .format("delta")
  .mode("overwrite")
  .save("/mnt/blob-storage/testDeltaTable2")

%sql
CREATE TABLE testdb.testDeltaTable2
USING DELTA
LOCATION '/mnt/blob-storage/testDeltaTable2'
Here, we first write the available dataframe df in Delta format to the external location /mnt/blob-storage/testDeltaTable2 using the format("delta") option, and then create the table testdeltatable2 under the database testdb on top of that location.
Create the Delta table on the above external location
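As an alternative to the two-step approach above, the write and the table registration can be combined into a single step with saveAsTable and a path option. This is a sketch using the same assumed mount path, not the method used in the original post:

%scala
// Writing with an explicit path registers the table as external (unmanaged)
df.write
  .format("delta")
  .mode("overwrite")
  .option("path", "/mnt/blob-storage/testDeltaTable2")
  .saveAsTable("testdb.testDeltaTable2")

Because a path is supplied, Spark registers this as an external (unmanaged) table, so dropping the table later does not delete the underlying Delta files at the location.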
Now, check the database either with a query or from the Data tab in the Databricks UI to verify that the Delta table has been created.
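One way to check from a query, assuming the table was created as above:

%sql
SHOW TABLES IN testdb;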
You can also verify whether the table is a Delta table using the SHOW CREATE TABLE command below:
%sql show create table testdb.testdeltatable2;
You will see that the table has been created using the DELTA format at the external location, with the schema taken from the dataframe.
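To confirm that the data written from the dataframe is visible through the table, you can run a simple query; alternatively, you can read the external path directly with spark.read.format("delta").load("/mnt/blob-storage/testDeltaTable2"). A quick check:

%sql
SELECT * FROM testdb.testdeltatable2;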
Wrapping Up
In this post, we have learned how to create a Delta table from a dataframe at an external location. Note that we did not have to define any table schema up front; the schema of the Delta table comes from the dataframe itself.