parallelize

Requirement

In this post, we will convert an RDD to a DataFrame in Spark with Scala.

Solution

Approach 1: Using Schema StructType

//Create RDD:
val dummyRDD = sc.parallelize(Seq(
    ("1001", "Ename1", "Designation1", "Manager1"),
    ("1003", "Ename2", "Designation2", "Manager2"),
    ("1001", "Ename3", "Designation3", "Manager3")
))

val schema = StructType(
    StructField("empno", StringType, true) :: …
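The excerpt breaks off partway through the schema definition. Below is a minimal, self-contained sketch of how this approach is typically completed, assuming a local SparkSession named spark and column names beyond empno (ename, designation, manager) inferred from the sample data rather than taken from the full post.

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object RddToDataFrame {
  def main(args: Array[String]): Unit = {
    // Assumed setup: the post uses sc directly (e.g. in spark-shell)
    val spark = SparkSession.builder()
      .appName("RddToDataFrame")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Create the RDD from the sample employee tuples
    val dummyRDD = sc.parallelize(Seq(
      ("1001", "Ename1", "Designation1", "Manager1"),
      ("1003", "Ename2", "Designation2", "Manager2"),
      ("1001", "Ename3", "Designation3", "Manager3")
    ))

    // Define the schema; field names after "empno" are assumptions based on the sample data
    val schema = StructType(
      StructField("empno", StringType, true) ::
      StructField("ename", StringType, true) ::
      StructField("designation", StringType, true) ::
      StructField("manager", StringType, true) :: Nil
    )

    // Map each tuple to a Row and build the DataFrame with the explicit schema
    val rowRDD = dummyRDD.map { case (empno, ename, designation, manager) =>
      Row(empno, ename, designation, manager)
    }
    val df = spark.createDataFrame(rowRDD, schema)

    df.show()
    spark.stop()
  }
}

Defining the schema explicitly with StructType gives control over column names and nullability, instead of relying on the default names produced by toDF().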

Requirement

In this post, we are going to create an RDD and then read its content in Spark.

Solution

//Create RDD:
val dummyRDD = sc.parallelize(Seq(
    ("1001", "Ename1", "Designation1", "Manager1"),
    ("1003", "Ename2", "Designation2", "Manager2"),
    ("1001", "Ename3", "Designation3", "Manager3")
))

//Read RDD
dummyRDD.collect().foreach(println(_))

//Read specific column:
dummyRDD.collect().foreach(data => println(data._1, …
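The specific-column read is cut off at data._1. Here is a short, self-contained sketch under the same assumptions as above (a local SparkSession and the same sample dummyRDD); printing the first two tuple fields is an illustrative choice, not necessarily what the full post does.

import org.apache.spark.sql.SparkSession

object ReadRdd {
  def main(args: Array[String]): Unit = {
    // Assumed setup: the post uses sc directly (e.g. in spark-shell)
    val spark = SparkSession.builder()
      .appName("ReadRdd")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Create the RDD of employee tuples
    val dummyRDD = sc.parallelize(Seq(
      ("1001", "Ename1", "Designation1", "Manager1"),
      ("1003", "Ename2", "Designation2", "Manager2"),
      ("1001", "Ename3", "Designation3", "Manager3")
    ))

    // Read the whole RDD
    dummyRDD.collect().foreach(println(_))

    // Read specific columns; completing the truncated line with the first two fields is an assumption
    dummyRDD.collect().foreach(data => println(data._1, data._2))

    spark.stop()
  }
}

Note that collect() brings the entire RDD to the driver, so it is only appropriate for small data sets like this sample.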