Requirement:
You have a sample DataFrame and you want to delete some columns from it.
Solution:
Step 1: Sample DataFrame
Start the Spark shell with the command below:
spark-shell
Note: I am using Spark version 2.3.
To create a sample DataFrame, please refer to Create-a-spark-dataframe-from-sample-data
After following the above post, you will have a DataFrame named students. You can use this DataFrame to perform operations.
Use the command below to see the contents of the DataFrame:
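If you do not have the students DataFrame from that post at hand, a small stand-in can be built directly in spark-shell. This is only a sketch: the rows and the id column are made up for illustration, and the only column names this tutorial actually relies on are name and percentage.

```scala
// Run inside spark-shell, where `spark` (a SparkSession) is already defined.
// The rows and the `id` column are hypothetical; only `name` and
// `percentage` are column names used later in this tutorial.
val students = spark.createDataFrame(Seq(
  (1, "Asha", 82.5),
  (2, "Ravi", 74.0),
  (3, "Meena", 91.2)
)).toDF("id", "name", "percentage")
```

Building the DataFrame from a Seq of tuples keeps the example free of external files, so it can be pasted into the shell as is.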
students.show()
Step 2: Deletion of columns
To delete some columns, refer to the code below.
Here we use the drop function, which takes the names of the columns we want to delete. It returns a new DataFrame and leaves the original students DataFrame unchanged.
val updated_df = students.drop("percentage", "name")
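The drop function can be called in a few other ways as well. The sketch below assumes the students DataFrame from Step 1 is in scope; the variable names and the column name no_such_column are hypothetical and used only for illustration.

```scala
// Assumes the `students` DataFrame from Step 1 is in scope (spark-shell).
import org.apache.spark.sql.functions.col

// Drop a single column by name.
val withoutName = students.drop("name")

// Drop a column by passing a Column object instead of a string.
val withoutPercentage = students.drop(col("percentage"))

// Drop every column listed in a collection by expanding it to varargs.
val colsToDrop = Seq("percentage", "name")
val trimmed = students.drop(colsToDrop: _*)

// Dropping a name that is not in the schema is a no-op, not an error.
// "no_such_column" is a hypothetical name used only for this check.
val unchanged = students.drop("no_such_column")
```

The collection form is handy when the list of unwanted columns is computed at runtime, for example from a configuration value.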
Step 3: Check the columns in the new DataFrame
You can list the remaining column names using the command below. The length of the returned array gives the number of columns.
updated_df.columns
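To go a little further than listing the names, the check can be sketched as follows. It assumes students and updated_df from the steps above are in scope; the variable names introduced here are hypothetical.

```scala
// Assumes `students` and `updated_df` from the steps above are in scope.

// Number of columns before and after the drop.
val numBefore = students.columns.length
val numAfter = updated_df.columns.length

// Any of the dropped names still present; expected to be empty.
val stillPresent = Seq("percentage", "name").filter(c => updated_df.columns.contains(c))

// Print the remaining columns together with their data types.
updated_df.printSchema()
```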
Wrapping up:
Sometimes, after joining and filtering, we no longer need some of the columns in a Spark DataFrame. To reduce memory use and save processing time, it is best to drop unwanted columns as early as possible.