Requirement: You have two tables named A and B, and you want to perform all types of joins in Spark using Scala. This will help you understand how joins work in Spark with Scala. Solution Step 1: Input Files. Download the input files from here and place them into a local ... Read More →
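As a rough illustration of the join types this post is about, here is a minimal sketch using the DataFrame `join` API. The two small in-memory tables, their column names (`id`, `a_val`, `b_val`), and the local session settings are assumptions made up for the example; they are not the post's input files.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object JoinSketch {
  // Local session for experimentation; app name and master are illustrative.
  val spark: SparkSession = SparkSession.builder()
    .appName("JoinSketch")
    .master("local[*]")
    .getOrCreate()

  import spark.implicits._

  // Hypothetical stand-ins for tables A and B, sharing an "id" column.
  val tableA: DataFrame = Seq((1, "a1"), (2, "a2"), (3, "a3")).toDF("id", "a_val")
  val tableB: DataFrame = Seq((2, "b2"), (3, "b3"), (4, "b4")).toDF("id", "b_val")

  // Join type names accepted by Dataset.join.
  val joinTypes: Seq[String] =
    Seq("inner", "left_outer", "right_outer", "full_outer", "left_semi", "left_anti")

  // Joins A and B on "id" using the given join type.
  def joined(joinType: String): DataFrame =
    tableA.join(tableB, Seq("id"), joinType)

  def main(args: Array[String]): Unit = {
    joinTypes.foreach { jt =>
      println(s"Join type: $jt")
      joined(jt).show()
    }
    // A cross join pairs every row of A with every row of B.
    println("Cross join")
    tableA.crossJoin(tableB).show()
    spark.stop()
  }
}
```

With this sample data, only ids 2 and 3 appear in both tables, so the inner and semi joins keep those two rows, while the anti join keeps only id 1.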

Requirement: Suppose we have a dataset in CSV format and want to read the file in Spark using Scala. The requirement is to create a Spark application that reads a CSV file into a Spark DataFrame using Scala. Components Involved: The following components are involved: Spark RDD/DataFrame, Scala ... Read More →
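A minimal sketch of reading a CSV file into a DataFrame might look like the following. The sample file contents, the column names, and the choice of the `header` and `inferSchema` options are assumptions for illustration; the post's actual dataset is not shown in this excerpt.

```scala
import java.nio.file.{Files, Path}
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadCsvSketch {
  // Local session for experimentation; app name and master are illustrative.
  val spark: SparkSession = SparkSession.builder()
    .appName("ReadCsvSketch")
    .master("local[*]")
    .getOrCreate()

  // Writes a tiny made-up CSV to a temp file so the sketch runs without external data.
  def writeSampleCsv(): Path = {
    val path = Files.createTempFile("employees", ".csv")
    Files.write(path, "id,name,salary\n1,Asha,5000\n2,Ben,6500\n".getBytes("UTF-8"))
    path
  }

  // Reads a CSV whose first line is a header, letting Spark infer column types.
  def readCsv(path: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(path)

  def main(args: Array[String]): Unit = {
    val df = readCsv(writeSampleCsv().toUri.toString)
    df.printSchema()
    df.show()
    spark.stop()
  }
}
```

Schema inference costs an extra pass over the data, so for large files an explicit schema is often preferred.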

Requirement: Suppose we have a source file that contains basic information about employees, such as employee number, employee name, designation, and salary. The requirement is to find the maximum value in a Spark RDD using Scala. With this requirement, we will find the maximum salary, the second maximum salary of an ... Read More →
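One possible way to find the maximum and second maximum salary in an RDD is sketched below. The employee records are invented for the example, and treating duplicate salaries as a single value when picking the second maximum is an assumption; the post may define it differently.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object MaxSalarySketch {
  // Local session for experimentation; app name and master are illustrative.
  val spark: SparkSession = SparkSession.builder()
    .appName("MaxSalarySketch")
    .master("local[*]")
    .getOrCreate()

  // Hypothetical records: (employee number, name, designation, salary).
  val employees: RDD[(Int, String, String, Int)] = spark.sparkContext.parallelize(Seq(
    (101, "Asha", "Engineer", 5000),
    (102, "Ben", "Manager", 6500),
    (103, "Chen", "Manager", 6500),
    (104, "Dia", "Analyst", 4000)
  ))

  // Keep only the salary field.
  val salaries: RDD[Int] = employees.map(_._4)

  // Largest salary in the RDD.
  def maxSalary: Int = salaries.max()

  // Second-largest distinct salary, or None if fewer than two distinct values exist.
  def secondMaxSalary: Option[Int] =
    salaries.distinct().top(2).drop(1).headOption

  def main(args: Array[String]): Unit = {
    println(s"Max salary: $maxSalary")
    println(s"Second max salary: $secondMaxSalary")
    spark.stop()
  }
}
```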

Requirement: Suppose we have a text-format data file that contains employees' basic details. When we load this file in Spark, it returns an RDD. Our requirement is to find the number of partitions created just after loading the data file and to see which records are stored ... Read More →
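The idea can be sketched with `getNumPartitions` and `mapPartitionsWithIndex`, as below. The sample file contents and the `minPartitions` value of 2 are assumptions for illustration. The actual number of partitions depends on how the input is split, so the sketch prints it rather than assuming a value.

```scala
import java.nio.file.{Files, Path}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object PartitionSketch {
  // Local session for experimentation; app name and master are illustrative.
  val spark: SparkSession = SparkSession.builder()
    .appName("PartitionSketch")
    .master("local[*]")
    .getOrCreate()

  // Made-up employee lines standing in for the post's data file.
  val sampleLines: Seq[String] =
    Seq("101,Asha,Engineer", "102,Ben,Manager", "103,Chen,Analyst", "104,Dia,Clerk")

  // Writes the sample lines to a temp file so the sketch runs without external data.
  def writeSampleFile(): Path = {
    val path = Files.createTempFile("employees", ".txt")
    Files.write(path, sampleLines.mkString("", "\n", "\n").getBytes("UTF-8"))
    path
  }

  // Loads the file as an RDD of lines, asking for at least two partitions.
  def load(path: String): RDD[String] =
    spark.sparkContext.textFile(path, minPartitions = 2)

  // Maps each partition index to the records it holds.
  def recordsByPartition(rdd: RDD[String]): Map[Int, List[String]] =
    rdd.mapPartitionsWithIndex((idx, records) => Iterator((idx, records.toList)))
      .collect()
      .toMap

  def main(args: Array[String]): Unit = {
    val rdd = load(writeSampleFile().toUri.toString)
    println(s"Number of partitions: ${rdd.getNumPartitions}")
    recordsByPartition(rdd).toSeq.sortBy(_._1).foreach { case (idx, recs) =>
      println(s"Partition $idx: ${recs.mkString(" | ")}")
    }
    spark.stop()
  }
}
```

Collecting every partition to the driver is only reasonable for small files like this one; on real data, sampling a few records per partition is safer.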

Requirement: The spark-shell creates an instance of the Spark context as sc, and we don't need to resolve dependencies while working in the Spark shell. Both become necessary when you move from the Spark shell to an IDE. So how do you create a Spark application in IntelliJ? In this post, we are going ... Read More →

Requirement: You have the marks of all the students of a class, with roll numbers, in a CSV file. You need to calculate the percentage of each student in Spark using Scala. Given: Download the sample CSV file, which has 7 columns; the 1st column is the roll number and the other 6 ... Read More →
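A hedged sketch of the percentage calculation follows. It assumes the six remaining columns are subject marks and that each subject is scored out of 100; neither detail is confirmed by this excerpt. The column names and sample rows are invented for the example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, round}

object PercentageSketch {
  // Local session for experimentation; app name and master are illustrative.
  val spark: SparkSession = SparkSession.builder()
    .appName("PercentageSketch")
    .master("local[*]")
    .getOrCreate()

  import spark.implicits._

  // Hypothetical column names for the six subjects.
  val markCols: Seq[String] = Seq("s1", "s2", "s3", "s4", "s5", "s6")

  // Assumption: every subject is marked out of 100.
  val maxTotal: Double = markCols.size * 100.0

  // Made-up sample rows: roll number followed by six marks.
  val marks: DataFrame = Seq(
    (1, 80, 70, 90, 60, 75, 85),
    (2, 50, 50, 50, 50, 50, 50)
  ).toDF("roll_no" +: markCols: _*)

  // Adds a "percentage" column: sum of marks over the maximum total, times 100.
  def withPercentage(df: DataFrame): DataFrame = {
    val total = markCols.map(col).reduce(_ + _)
    df.withColumn("percentage", round(total / lit(maxTotal) * 100, 2))
  }

  def main(args: Array[String]): Unit = {
    withPercentage(marks).show()
    spark.stop()
  }
}
```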

Requirement: Assume you have a Hive table named reports, and this dataset needs to be processed in Spark. Once the Hive table's data is in a Spark DataFrame, we can transform it further as per business needs. So let's try to load the Hive table into a Spark data ... Read More →