Requirement :
Load application properties from a text file.
In this tutorial we will learn how to load properties or configs for any Spark application.
Solution :
Step 1 : Preparation of required configs
Parameters for an application often vary from environment to environment, and to run multiple instances of an application you need a different properties file for each one. So we'll create a simple file with configuration like below:
url=bigdataprogrammers.com
path=/tmp/test
table_name=data1
db_name=mydb
Here every line has the name of the config first, then an = sign, and then the value. There are many methods available to read a properties file, but we'll discuss the easiest one.
Let's save these configs in a file, say app_prop.txt.
We are not relying on the extension of the file, because we are creating a generic way to load the properties. You can use a .properties extension if you prefer.
Below is the output of my properties file.
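Since the file above uses the key=value layout, the JDK's java.util.Properties class can read it as well. This is not the method used in the rest of this tutorial, only a minimal sketch of one of the other available methods. The object name PropsLoader and the method name load are our own, chosen for illustration.

```scala
import java.io.FileInputStream
import java.util.Properties

// Hypothetical helper, not part of the tutorial's own code.
object PropsLoader {
  // Reads a key=value file with java.util.Properties and returns a Scala Map.
  def load(path: String): Map[String, String] = {
    val props = new Properties()
    val in = new FileInputStream(path)
    try props.load(in) finally in.close()   // always release the file handle
    props.stringPropertyNames()
      .toArray(new Array[String](0))        // copy the key set into a Scala-friendly array
      .map(key => key -> props.getProperty(key))
      .toMap
  }
}
```

Keep in mind that Properties applies its own rules, for example it skips lines starting with # or ! and interprets backslash escapes, so it can behave differently from a plain split on = for unusual values.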
Step 2 : Reading the file in Spark – Scala
As we have named the file app_prop.txt, we are going to load it using the fromFile function of scala.io.Source.
We need to make the import below:
import scala.io.Source
Below is the code to load the file:
Source.fromFile("app_prop.txt").getLines()
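One thing to be aware of is that Source.fromFile opens a file handle, and getLines() reads lazily, so the handle stays open until the Source is closed. Below is a minimal sketch that reads all lines first and then closes the file. The names ReadLines and readAll are our own and only for illustration.

```scala
import scala.io.Source

// Hypothetical helper, not part of the tutorial's own code.
object ReadLines {
  // Reads every line into a List, then closes the underlying file.
  def readAll(path: String): List[String] = {
    val source = Source.fromFile(path)
    try source.getLines().toList   // toList forces the lazy iterator before we close
    finally source.close()
  }
}
```

For a small properties file, holding all lines in memory like this is usually fine.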
In a production environment, we can receive the name of the file as an argument to the Spark application. That way you can run multiple instances of the application with different files, passing one file to each run.
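Taking the file name from the application arguments could look like the sketch below. The object name ConfigApp, the helper configPath, and the fallback to app_prop.txt when no argument is given are our own assumptions for illustration.

```scala
import scala.io.Source

// Hypothetical entry point, not part of the tutorial's own code.
object ConfigApp {
  // Uses the first application argument as the file name.
  // Assumption: fall back to the tutorial's file name when no argument is passed.
  def configPath(args: Array[String]): String =
    args.headOption.getOrElse("app_prop.txt")

  def main(args: Array[String]): Unit = {
    val source = Source.fromFile(configPath(args))
    try source.getLines().foreach(println)   // print the raw config lines
    finally source.close()
  }
}
```

With spark-submit, anything placed after the application jar is passed to main as args, so each instance can be started with its own file name.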
So the idea here is to get the lines, split them on = and convert them into a Scala Map, so that any value can be used where needed just by passing its key name.
Here is the code:
val configMap = Source.fromFile("app_prop.txt").getLines().filter(line => line.contains("=")).map { line =>
  val tkns = line.split("=")
  if (tkns.size == 1) {
    tkns(0) -> ""        // key with no value
  } else {
    tkns(0) -> tkns(1)   // key and value
  }
}.toMap
In the above code we iterate over the lines and split each line into parts on the = sign. If a value is not present it is treated as blank (""), otherwise the value at index 1 of the split array is assigned.
Finally, we convert the pairs into a Map, i.e. configMap.
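The code above takes only index 1 of the split array, so a value that itself contains = would be cut at the second = sign. Below is a slightly more defensive sketch of the same idea, assuming we also want to trim whitespace and skip lines starting with #. Both of those are our own additions, and the names ConfigParser and parse are only for illustration.

```scala
// Hypothetical helper, not part of the tutorial's own code.
object ConfigParser {
  // Converts key=value lines into a Map.
  // Assumption: blank lines and lines starting with '#' are ignored.
  def parse(lines: Iterator[String]): Map[String, String] =
    lines
      .map(_.trim)
      .filter(line => line.nonEmpty && !line.startsWith("#") && line.contains("="))
      .map { line =>
        // Limit 2 splits on the first '=' only, so the value keeps any later '='.
        // It also yields an empty string for "key=" instead of a one-element array.
        val parts = line.split("=", 2)
        parts(0).trim -> parts(1).trim
      }
      .toMap
}
```

It can be fed from the file in the same way as before, for example ConfigParser.parse(Source.fromFile("app_prop.txt").getLines()).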
Step 3 : Accessing the Configuration /Properties
To use a config we just need to pass its key to the map. So use:
configMap("your key")
Please refer to the screenshot below.
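Calling configMap("your key") throws a NoSuchElementException when the key is missing from the file. A small sketch of the lookup options on a Scala Map is shown below. The map is filled by hand with the sample values from Step 1, and the key port with its default 8080 is hypothetical, used only to show a missing key.

```scala
// Hypothetical example object; the map mirrors the sample file from Step 1.
object ConfigAccess {
  val configMap: Map[String, String] = Map(
    "url"        -> "bigdataprogrammers.com",
    "path"       -> "/tmp/test",
    "table_name" -> "data1",
    "db_name"    -> "mydb"
  )

  // Direct lookup: fine when the key is guaranteed to exist.
  val tableName: String = configMap("table_name")

  // Lookup with a default: "port" is a made-up key that is not in the file.
  val port: String = configMap.getOrElse("port", "8080")

  // Option lookup: lets the caller decide what to do when the key is absent.
  val maybeUrl: Option[String] = configMap.get("url")
}
```

Using getOrElse for optional settings keeps the application running when a file leaves a key out.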