Pass variables from shell script to hive script

Pass variables from shell script to hive script

Requirement

You have one hive script which is expecting some variables which need to be passed from a shell script.Say name of hive scripts is daily_audit.hql .it is expecting three variables which are as follows
• schema
• tablename
• total_emp

Solution

Step 1:

Let’s see content of daily_audit.hql

daily_audit.hql
 
select * from ${hiveconf:schema}.${hiveconf:tablename} where total_emp>${hiveconf:no_of_employees};

I can say that three variable are required to be declared in a shell script. I recommended you to focus only on variables instead of logic in this HQL.

Step 2: Assignation of variables

Let’s declare these three variables in shell script

 
 
myschema=bdp
mytablename=infostore
noOftotal_emp=5000

You can change these variables when you need to get information from other schema or table. And that’s what the use of assigning variables in a shell script. In real time these values are assigned from the output of another process.

Step 3: Call HQL

Once assignation is complete now we can pass them while calling HQL.
So here is the command

 
 
hive -f "/root/local_bdp/posts/Pass-variables-from-shell-script-to-hive-script/daily_audit.hql" --hiveconf schema=$myschema --hiveconf tablename=$mytablename --hiveconf no_of_employees=$noOftotal_emp;

In above command you can see that using $ sign we are taking values of a particular variable and assigning it to the variable which is defined in HQL. For example
“Schema” is present in hive script but “myschema” is defined in a shell script. So you have to write like schema=$myschema where the first one is variable of hive script and the second one is variable of shell script. You must use -–hiveconf for each variable while calling hive script.

Another Way

Instead of passing variable side by side, we can use parameter file which has all the variables.
Let’s have one file hiveparam.txt

hiveparam.txt
 
  1. set schema=bdp;
  2. set tablename=infostore;
  3. set no_of_employees=5000;

Define all variables using set command.
And while calling HQL simultaneously you should call this file.
Below is the command.

 
 
hive -f "/root/local_bdp/posts/Pass-variables-from-shell-script-to-hive-script/daily_audit.hql" -i "/root/local_bdp/posts/Pass-variables-from-shell-script-to-hive-script/hiveparam.txt"

Use –i to call parameter file.

Wrapping Up

In real time projects we use hive partitioned table. It may be partitioned on a date. As we process data daily.so every day one variable needs to be changed and we can’t hard-code in HQL script .in that case we can assign parameter(s) in a shell script.

Keep learning 

3
0

Join in hive with example

Requirement You have two table named as A and B. and you want to perform all types of join in ...
Read More

Join in pyspark with example

Requirement You have two table named as A and B. and you want to perform all types of join in ...
Read More

Join in spark using scala with example

Requirement You have two table named as A and B. and you want to perform all types of join in ...
Read More

Java UDF to convert String to date in PIG

About Code Many times it happens like you have received data from many systems and each system operates on a ...
Read More
/ java udf, Pig, pig, pig udf, string to date, udf

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.