Requirement
In this post, we will learn how to store a processed dataframe to a Delta table in Databricks using append mode. Append mode is useful when we need to add new data to an existing table without impacting the data already in it.
Solution
For this exercise, we will use the following data:
empno | ename | designation | manager | hire_date | sal | deptno | location |
9369 | SMITH | CLERK | 7902 | 12/17/1980 | 800 | 20 | BANGALORE |
9499 | ALLEN | SALESMAN | 7698 | 2/20/1981 | 1600 | 30 | HYDERABAD |
9521 | WARD | SALESMAN | 7698 | 2/22/1981 | 1250 | 30 | PUNE |
9566 | TURNER | MANAGER | 7839 | 4/2/1981 | 2975 | 20 | MUMBAI |
9654 | MARTIN | SALESMAN | 7698 | 9/28/1981 | 1250 | 30 | CHENNAI |
9369 | SMITH | CLERK | 7902 | 12/17/1980 | 800 | 20 | KOLKATA |
First, load this data into a dataframe using the code below:
val file_location = "/FileStore/tables/emp_data1-3.csv"

val df = spark.read.format("csv")
  .option("inferSchema", "true")
  .option("header", "true")
  .option("sep", ",")
  .load(file_location)

display(df)
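Before writing, it is worth a quick sanity check that the load picked up all six rows and that the column types were inferred. Nothing here is assumed beyond the df created above:

println(df.count())   // expected: 6
df.printSchema()      // confirm inferSchema detected the column types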
Save in Delta in Append Mode
Next, write the dataframe to a Delta table in append mode. The table name below is only an example; substitute your own:

val permanent_table_name = "emp_data"   // example table name, replace with your own
df.write.mode("append").format("delta").saveAsTable(permanent_table_name)
If you run the same save in append mode a second time, the six rows are appended again, so when you check the data in the table it will contain 12 rows instead of 6.
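To verify the row counts, read the table back. This is a minimal sketch that assumes the permanent_table_name defined above:

val result = spark.table(permanent_table_name)
println(result.count())   // 6 after the first write, 12 after the second
display(result)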
Wrapping Up
In this post, we stored dataframe data in a Delta table using append mode, which means new rows are added while the existing data in the table is left untouched.