Requirement
In this post, we are going to read a JSON file using Spark and then load it into a Delta table in Databricks.
![]()
Solution
We can use the below sample data for the exercise. In our case, we have placed this file is located in FilteStore in the Databricks cluster:
[
{
"empno": "9369",
"ename": "SMITH",
"designation": "CLERK",
"manager": "7902",
"hire_date": "12/17/1980",
"sal": "800",
"deptno": "20",
"location": "BANGALORE"
},
{
"empno": "9499",
"ename": "ALLEN",
"designation": "SALESMAN",
"manager": "7698",
"hire_date": "2/20/1981",
"sal": "1600",
"deptno": "30",
"location": "HYDERABAD"
},
{
"empno": "9521",
"ename": "WARD",
"designation": "SALESMAN",
"manager": "7698",
"hire_date": "2/22/1981",
"sal": "1250",
"deptno": "30",
"location": "PUNE"
},
{
"empno": "9566",
"ename": "TURNER",
"designation": "MANAGER",
"manager": "7839",
"hire_date": "4/2/1981",
"sal": "2975",
"deptno": "20",
"location": "MUMBAI"
},
{
"empno": "9654",
"ename": "MARTIN",
"designation": "SALESMAN",
"manager": "7698",
"hire_date": "9/28/1981",
"sal": "1250",
"deptno": "30",
"location": "CHENNAI"
},
{
"empno": "9369",
"ename": "SMITH",
"designation": "CLERK",
"manager": "7902",
"hire_date": "12/17/1980",
"sal": "800",
"deptno": "20",
"location": "KOLKATA"
}
]We have sample data in a JSON file available at the storage account path
Step 1: Load JSON File in Dataframe
In this step, we will load the JSON file in a Spark Dataframe using the below code:
%scala
val jsonDf = spark.read.option("multiline", "true").json("/FileStore/tables/emp_data1.json")
display(jsonDf)Here, we have used Option to handle multiline JSON.
![]()
Step 2: Create Delta Table from Dataframe
Once we loaded the JSON data into Dataframe, we can create a delta table using the below command:
jsonDf.write.mode("overwrite").format("delta").saveAsTable("testdb.jsonDataTable")Data Validation
Let’s check the table whether data was loaded correctly in the delta table or not:
![]()
You can also check the table structure using below query in SQL:
%sql show create table testdb.jsondatatable;
![]()
Wrapping Up
In this post, we have learned how to read JSON data in a dataframe and then create a Delta table to store the data.