Pass variables from shell script to hive script

Requirement

You have one hive script which is expecting some variables which need to be passed from a shell script.Say name of hive scripts is daily_audit.hql .it is expecting three variables which are as follows
• schema
• tablename
• total_emp

Solution

Step 1:

Let’s see content of daily_audit.hql

daily_audit.hql
 
  1. SELECT * FROM ${hiveconf:schema}.${hiveconf:tablename} WHERE total_emp>${hiveconf:no_of_employees};

I can say that three variable are required to be declared in a shell script. I recommended you to focus only on variables instead of logic in this HQL.

Step 2: Assignation of variables

Let’s declare these three variables in shell script

 
 
  1. myschema=bdp
  2. mytablename=infostore
  3. noOftotal_emp=5000

You can change these variables when you need to get information from other schema or table. And that’s what the use of assigning variables in a shell script. In real time these values are assigned from the output of another process.

Step 3: Call HQL

Once assignation is complete now we can pass them while calling HQL.
So here is the command

 
 
  1. hive -f "/root/local_bdp/posts/Pass-variables-from-shell-script-to-hive-script/daily_audit.hql" --hiveconf schema=$myschema --hiveconf tablename=$mytablename --hiveconf no_of_employees=$noOftotal_emp;

In above command you can see that using $ sign we are taking values of a particular variable and assigning it to the variable which is defined in HQL. For example
“Schema” is present in hive script but “myschema” is defined in a shell script. So you have to write like schema=$myschema where the first one is variable of hive script and the second one is variable of shell script. You must use -–hiveconf for each variable while calling hive script.

Another Way

Instead of passing variable side by side, we can use parameter file which has all the variables.
Let’s have one file hiveparam.txt

hiveparam.txt
 
  1. set schema=bdp;
  2. set tablename=infostore;
  3. set no_of_employees=5000;

Define all variables using set command.
And while calling HQL simultaneously you should call this file.
Below is the command.

 
 
  1. hive -f "/root/local_bdp/posts/Pass-variables-from-shell-script-to-hive-script/daily_audit.hql" -i "/root/local_bdp/posts/Pass-variables-from-shell-script-to-hive-script/hiveparam.txt"

Use –i to call parameter file.

Wrapping Up

In real time projects we use hive partitioned table. It may be partitioned on a date. As we process data daily.so every day one variable needs to be changed and we can’t hard-code in HQL script .in that case we can assign parameter(s) in a shell script.

Keep learning 

Load CSV file into hive AVRO table

Requirement You have comma separated(CSV) file and you want to create Avro table in hive on top of it, then ...
Read More

Load CSV file into hive PARQUET table

Requirement You have comma separated(CSV) file and you want to create Parquet table in hive on top of it, then ...
Read More

Hive Most Asked Interview Questions With Answers – Part II

What is bucketing and what is the use of it? Answer: Bucket is an optimisation technique which is used to ...
Read More
/ hive, hive interview, interview-qa

Spark Interview Questions Part-1

Suppose you have a spark dataframe which contains millions of records. You need to perform multiple actions on it. How ...
Read More

Leave a Reply