Requirement
In this post, we will learn to create a Delta table from a dataframe in Databricks. This scenario is common: whether we consume data from a file, a source database table, or some other system, we usually end up with the data in a dataframe, and we can then store that data in a Delta table.
Solution
Let’s create a dataframe with some dummy data.
val df = spark.createDataFrame(Seq(
  ("1100", "Person1", "Street1#Location1#City1", null),
  ("1200", "Person2", "Street2#Location2#City2", "Contact2"),
  ("1300", "Person3", "Street3#Location3#City3", null),
  ("1400", "Person4", null, "Contact4"),
  ("1500", "Person5", "Street5#Location5#City5", null)
)).toDF("id", "name", "address", "contact")
We have also created a database named testdb. As the screenshot below shows, this database currently contains no tables.
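If the database does not already exist in your workspace, you can create it first; a minimal sketch (the name testdb matches the one used throughout this post):

```scala
// Create the database if it is missing (a no-op if it already exists)
spark.sql("CREATE DATABASE IF NOT EXISTS testdb")
```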
Create Delta Table from Dataframe
df.write.format("delta").saveAsTable("testdb.testdeltatable")
Here, we write the existing dataframe df to a Delta table named testdeltatable under the database testdb. The format("delta") option in the command tells Spark to store the table in DELTA format.
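Note that by default saveAsTable fails if the table already exists, so re-running the notebook will raise an error. If you expect to re-run the write, you can set a save mode explicitly; a sketch, assuming you want to replace the previous contents:

```scala
// Overwrite the existing table contents on re-runs;
// use mode("append") instead to add new rows to the existing table.
df.write
  .format("delta")
  .mode("overwrite")
  .saveAsTable("testdb.testdeltatable")
```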
Now, check the database either with a query or from the Data tab to verify that the Delta table was created.
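A quick way to verify from a notebook cell is to read the new table back and inspect it; for example:

```scala
// Read the managed table back by name and display its rows
spark.table("testdb.testdeltatable").show()
```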
You can also check the versions of the table from the history tab.
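The same version information shown in the history tab is also available programmatically through Delta Lake's DESCRIBE HISTORY command; a sketch:

```scala
// Each write to a Delta table produces a new version;
// the history lists the version number, timestamp, and operation.
spark.sql("DESCRIBE HISTORY testdb.testdeltatable").show()
```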
You can also verify whether the table is a Delta table using the show command below:
%sql show create table testdb.testdeltatable;
You will see that the schema has already been created and that the table uses the DELTA format.
Wrapping Up
In this post, we have learned to create a Delta table from a dataframe. Note that we did not define a table schema explicitly; it was inferred from the dataframe. The resulting table is a managed table. See the next post for creating a Delta table at an external path.