
Requirement

In this post, we will convert an RDD to a DataFrame in Spark with Scala.

Solution

Approach 1: Using Schema Struct Type

    // Create RDD:
    val dummyRDD = sc.parallelize(Seq(
      ("1001", "Ename1", "Designation1", "Manager1"),
      ("1003", "Ename2", "Designation2", "Manager2"),
      ("1001", "Ename3", "Designation3", "Manager3")
    ))
    val schema = StructType(
      StructField("empno", StringType, true) ::

Read More →
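Because the excerpt stops partway through the schema definition, here is a minimal sketch of how the StructType approach could be completed. Only the `empno` field appears in the excerpt; the other field names (`ename`, `designation`, `manager`), the object name `RddToDataFrameSketch`, and the local `SparkSession` setup are assumptions made for illustration and may differ from the full post.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical wrapper object, not part of the original post.
object RddToDataFrameSketch {

  def toDataFrame(spark: SparkSession): DataFrame = {
    val sc = spark.sparkContext

    // Same sample tuples as in the excerpt.
    val dummyRDD = sc.parallelize(Seq(
      ("1001", "Ename1", "Designation1", "Manager1"),
      ("1003", "Ename2", "Designation2", "Manager2"),
      ("1001", "Ename3", "Designation3", "Manager3")
    ))

    // Only "empno" is visible in the excerpt; the remaining
    // field names are assumed here.
    val schema = StructType(
      StructField("empno", StringType, true) ::
      StructField("ename", StringType, true) ::
      StructField("designation", StringType, true) ::
      StructField("manager", StringType, true) :: Nil
    )

    // createDataFrame with an explicit schema expects an RDD[Row],
    // so each tuple is mapped to a Row first.
    val rowRDD = dummyRDD.map {
      case (empno, ename, designation, manager) =>
        Row(empno, ename, designation, manager)
    }

    spark.createDataFrame(rowRDD, schema)
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder()
      .appName("RddToDataFrameSketch")
      .master("local[*]")
      .getOrCreate()

    val df = toDataFrame(spark)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

In `spark-shell`, where `spark` and `sc` already exist, the body of `toDataFrame` can be pasted directly without the wrapper object.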

Requirement

Let's say we are receiving data from multiple sources, but we need to ingest all of it into a single target table. The incoming datasets can have different schemas. We want to merge them and load/save the result into a table.

Sample Data

Emp_data1.csv

    empno,ename,designation,manager,hire_date,sal,deptno,location
    9369,SMITH,CLERK,7902,12/17/1980,800,20,BANGALORE
    9499,ALLEN,SALESMAN,7698,2/20/1981,1600,30,HYDERABAD
    9521,WARD,SALESMAN,7698,2/22/1981,1250,30,PUNE
    9566,TURNER,MANAGER,7839,4/2/1981,2975,20,MUMBAI
    9654,MARTIN,SALESMAN,7698,9/28/1981,1250,30,CHENNAI
    9369,SMITH,CLERK,7902,12/17/1980,800,20,KOLKATA

Read More →
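The excerpt ends before showing the second source or the merge itself, so the following is only one possible sketch of combining DataFrames with different schemas: collect the union of all column names, fill any missing column with nulls, and then union the aligned DataFrames. The second DataFrame, the helper names `alignTo` and `mergeAll`, and the choice to treat every column as a string are all assumptions for illustration, not the original post's method.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}
import org.apache.spark.sql.types.StringType

// Hypothetical helper object, not part of the original post.
object MergeSchemasSketch {

  // Select allCols in order; columns absent from df become null strings.
  def alignTo(df: DataFrame, allCols: Seq[String]): DataFrame = {
    val present = df.columns.toSet
    df.select(allCols.map { c =>
      if (present.contains(c)) col(c)
      else lit(null).cast(StringType).as(c)
    }: _*)
  }

  // Union two DataFrames after aligning both to the combined column list.
  def mergeAll(df1: DataFrame, df2: DataFrame): DataFrame = {
    val allCols = (df1.columns ++ df2.columns).distinct.toSeq
    alignTo(df1, allCols).union(alignTo(df2, allCols))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MergeSchemasSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Two rows taken from Emp_data1.csv in the excerpt, kept as strings.
    val df1 = Seq(
      ("9369", "SMITH", "CLERK", "7902", "12/17/1980", "800", "20", "BANGALORE"),
      ("9499", "ALLEN", "SALESMAN", "7698", "2/20/1981", "1600", "30", "HYDERABAD")
    ).toDF("empno", "ename", "designation", "manager",
           "hire_date", "sal", "deptno", "location")

    // Made-up second source with fewer columns (assumption).
    val df2 = Seq(("9999", "DOE", "30")).toDF("empno", "ename", "deptno")

    mergeAll(df1, df2).show()
    spark.stop()
  }
}
```

On Spark 3.1 and later, `df1.unionByName(df2, allowMissingColumns = true)` offers similar behavior out of the box; the manual alignment above is shown because it also works on older versions.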