Big Data Engineering Interview Questions

1. A CSV file with a header row is present at an HDFS location. Which property needs to be set while reading it into a Spark DataFrame?

Answer: While reading the file into a DataFrame, set the header option to true, as shown below:

 val df1 = spark.read.option("header", true).csv("path")
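
With the header option set, the first row of the file is used for column names instead of being read as data. You can confirm this by printing the schema (the column names shown in the comment are illustrative):

 df1.printSchema()
 // root
 //  |-- id: string (nullable = true)
 //  |-- name: string (nullable = true)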

 

2. You want to load a CSV file into a Spark DataFrame, but you do not want Spark to infer the schema; instead, you have a custom schema based on your requirement. How would you create the custom schema and apply it to the DataFrame?

Answer: To create a schema, define a StructType and add the fields one by one using .add:

 import org.apache.spark.sql.types.{StructType, IntegerType, DoubleType, StringType}

 val my_csv_schema = new StructType()
   .add("id", IntegerType, true)
   .add("sal", DoubleType, true)
   .add("name", StringType, true)

Once the schema is created, pass it to the reader using .schema:

   val emp_data = spark.read.format("csv")
     .option("header", "true")
     .schema(my_csv_schema)
     .load("path_to_csv")
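
You can verify that the custom schema was applied (and that nothing was inferred) by printing it; the output simply mirrors the fields defined above:

 emp_data.printSchema()
 // root
 //  |-- id: integer (nullable = true)
 //  |-- sal: double (nullable = true)
 //  |-- name: string (nullable = true)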

 

3. In HBase, how do you check whether a table exists or not?

Answer: Use the exists command in the HBase shell to check whether the table exists:

 exists 'Table_Name'
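
If the check needs to happen from code rather than the shell, a minimal sketch using the HBase Java client API from Scala could look like the following (the configuration and table name are assumed to come from your environment):

 import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
 import org.apache.hadoop.hbase.client.ConnectionFactory

 val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
 val admin = connection.getAdmin
 // tableExists returns true if the table is present in HBase
 val tableExists = admin.tableExists(TableName.valueOf("Table_Name"))
 admin.close()
 connection.close()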

 

4. What is pre-splitting of an HBase table? Explain.

Answer: Refer to this post: https://bigdataprogrammers.com/pre-splitting-of-hbase-table/
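
In short, pre-splitting means creating the table with region split keys defined up front, so that initial writes are spread across multiple regions (and region servers) instead of hotspotting the single region a new table starts with. A minimal HBase shell sketch, with a hypothetical table name, column family, and split keys:

 create 'Table_Name', 'cf', SPLITS => ['10', '20', '30']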

 

5. Given a DataFrame df1, how would you cast all of its columns to string without hardcoding any column names?

Answer: Take the list of column names, map each one to a cast expression, and use it in a select, as below:

 import org.apache.spark.sql.functions.col

 val all_cols = df1.columns
 val all_cols_cast = all_cols.map(x => col(x).cast("string"))
 val df1_new = df1.select(all_cols_cast: _*)

df1_new will have all the columns with data type string. This is a generic way of doing it, since no column names are hardcoded.
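
You can confirm the result with df1_new.printSchema(); every column should now show string as its data type. An equivalent one-line alternative, sketched here under the assumption that the column names are plain identifiers (df1_new2 is just an illustrative name), uses selectExpr:

 val df1_new2 = df1.selectExpr(df1.columns.map(c => s"cast($c as string) as $c"): _*)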

If you need any help while coding and learning Spark, connect with our experts here: https://bigdataprogrammers.com/get-help-from-big-data-expert/

 
