Requirement
In this post, we are going to learn how to check whether a DataFrame is empty in Spark. This is an important part of development, as this condition decides whether the transformation logic will run on the DataFrame at all.
Solution
Let’s first understand why it is important to check for an empty DataFrame and what skipping the check can cost you.
When you read/load data into a DataFrame and there is no data, it’s better not to process it further. Without this check, you end up running multiple transformations and actions on empty data, which is wasted work.
First, create an empty dataframe:
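A minimal sketch, assuming an active SparkSession bound to the name `spark` (as it is in spark-shell):

```scala
// emptyDataFrame produces a DataFrame with no rows and no columns,
// which is handy for exercising the empty-checks below.
val df = spark.emptyDataFrame
```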
There are multiple ways to check if a DataFrame is empty. Most of the time, people use the count action to check if the DataFrame has any records.
Approach 1: Using Count
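A sketch of the count-based check, assuming `df` is the DataFrame created above. Note that count triggers a full scan of every partition, so this is the most expensive of the four approaches:

```scala
// count() touches every row, so on large data this is slow --
// shown here because it is the check people reach for first.
if (df.count() > 0)
  println("DF is not Empty")
else
  println("DF is Empty")
```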
Approach 2: Using head and isEmpty
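A sketch of the head-based check on the same `df`. `head(1)` returns an `Array[Row]` with at most one element, so Spark only has to fetch a single record instead of counting them all:

```scala
// head(1) limits the job to one row; an empty array means no data.
if (df.head(1).isEmpty)
  println("DF is Empty")
else
  println("DF is not Empty")
```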
Approach 3: Using take and isEmpty
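A sketch of the take-based check on the same `df`. `take(1)` behaves like `head(1)`: it returns an array of at most one row, so only a single record needs to be fetched:

```scala
// take(1) also limits the job to one row, avoiding a full scan.
if (df.take(1).isEmpty)
  println("DF is Empty")
else
  println("DF is not Empty")
```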
Approach 4: Convert to RDD and isEmpty
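A sketch of the RDD-based check on the same `df`. `RDD.isEmpty` internally takes a single element, so it also avoids a full scan, though accessing `df.rdd` itself involves a conversion from the DataFrame:

```scala
// rdd.isEmpty takes one element under the hood; the df.rdd
// conversion adds some overhead compared to head(1)/take(1).
if (df.rdd.isEmpty)
  println("DF is Empty")
else
  println("DF is not Empty")
```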
Full Code Snippet
val df = spark.emptyDataFrame
// Approach 1: Using Count
if (df.count() > 0)
  println("DF is not Empty")
else
  println("DF is Empty")
// Approach 2: Using head and isEmpty
if (df.head(1).isEmpty)
  println("DF is Empty")
else
  println("DF is not Empty")
// Approach 3: Using take and isEmpty
if (df.take(1).isEmpty)
  println("DF is Empty")
else
  println("DF is not Empty")
// Approach 4: Convert to RDD and isEmpty
if (df.rdd.isEmpty)
  println("DF is Empty")
else
  println("DF is not Empty")
Wrapping Up
In this post, we have learned how to check if a DataFrame is empty. This can be done in many ways, but the approach should be chosen based on performance: count scans the entire DataFrame, while the other approaches only need to fetch a single row.