Hive (Page 2)

Requirement: You have a table in Hive and need to process its data using Pig. To load data directly from a file we generally use PigStorage(), but loading data from a Hive table requires a different load function. Let’s go into detail step by …

Requirement: Suppose the source data is in a text file. The requirement is to load the text file into a Hive table using Spark, and then to read the data back from the Hive table using Spark. Let’s break the task into sub-tasks: …

Requirement: You have a Hive script that expects some variables, and the variables need to be passed from a shell script. Say the Hive script is named daily_audit.hql. It expects three variables: schema, tablename, and total_emp. Solution. Step 1: Hive Script. Let’s …
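
As a rough illustration of the idea in this teaser, the sketch below assembles a hive command line that passes the three variables with the --hivevar option. It is written in Python rather than as the shell script the post describes, and the values my_schema, my_table, and 100 are hypothetical placeholders, not taken from the post. Inside the script, such variables are normally referenced as ${hivevar:schema} and so on.

```python
import shlex


def build_hive_command(script, variables):
    """Build the argument list for running a Hive script with --hivevar values."""
    cmd = ["hive"]
    for name, value in variables.items():
        # Each variable becomes one "--hivevar name=value" pair.
        cmd += ["--hivevar", f"{name}={value}"]
    cmd += ["-f", script]
    return cmd


# Hypothetical values, for illustration only.
hive_cmd = build_hive_command(
    "daily_audit.hql",
    {"schema": "my_schema", "tablename": "my_table", "total_emp": 100},
)

# Print a shell-safe version of the command without running it.
print(shlex.join(hive_cmd))
```

The command is only printed here; running it would require a Hive installation and the script itself.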

Requirement: We generally receive data from different sources, and these usually use different date formats. When we create a Hive table on top of this data, it becomes necessary to convert the dates into a format that Hive supports. Hive supports the yyyy-MM-dd date format. So output format of …
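
To make the conversion concrete, here is a minimal Python sketch that normalizes a few source formats to yyyy-MM-dd before the data reaches Hive. It is not the Hive query the post goes on to describe; the list of source formats and the function name to_hive_date are assumptions chosen for illustration.

```python
from datetime import datetime

# Assumed source formats, for illustration: dd/MM/yyyy, MM-dd-yyyy, yyyyMMdd.
SOURCE_FORMATS = ["%d/%m/%Y", "%m-%d-%Y", "%Y%m%d"]


def to_hive_date(raw, formats=SOURCE_FORMATS):
    """Return the date as yyyy-MM-dd, or None if no known format matches."""
    text = raw.strip()
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            # This format did not match; try the next one.
            continue
    return None


for sample in ["25/12/2017", "12-25-2017", "20171225", "not a date"]:
    print(sample, "->", to_hive_date(sample))
```

Returning None for unparseable values mirrors how a failed conversion typically surfaces as NULL in a table, which makes bad rows easy to find afterwards.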

Requirement: You have a comma-separated file and want to create an ORC-formatted table in Hive on top of it. Follow the steps below. Solution. Step 1: Sample CSV File. Create a sample CSV file named sample_1.csv. Download from here (You can skip this step …

Requirement: You have a Hive table named infostore in the bdp schema. Another application is connected to your application, but for security reasons it is not allowed to read data from the Hive table. The data of the infostore table needs to be sent into …

Requirement: Suppose you have an XML-formatted data file that contains some empty tags. The requirement is to parse the XML data in Hive and assign a default value to the empty tags. Components Involved: Hive, Maven, Java. Solution. There are many solutions for parsing XML data into Hive …

Requirement: Suppose you have a Hive table with one column, and you want to split this column into multiple columns and store the results in another Hive table. Solution. Assume the Hive table is named “transact_tbl” and has one column named “connections”, and values in connections …
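
The teaser cuts off before saying how the values in connections are delimited, so the following Python sketch only illustrates the general idea of splitting one column into a fixed number of columns. The comma delimiter, the width of three columns, and the sample values are all assumptions, not details from the post.

```python
def split_connections(value, delimiter=",", width=3):
    """Split one column value into exactly `width` columns, padding with None."""
    parts = [part.strip() for part in value.split(delimiter)]
    parts = parts[:width]  # Drop any extra fields beyond the target width.
    return parts + [None] * (width - len(parts))


# Hypothetical sample values for the single "connections" column.
source_rows = ["a,b,c", "d,e"]

# Each source value becomes one row of the multi-column target table.
target_rows = [split_connections(value) for value in source_rows]
print(target_rows)
```

Padding short rows keeps every output row the same width, which is what a target table with a fixed set of columns expects.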

Requirement: If you have a comma-separated file and want to create a Hive table on top of it (that is, load a CSV file into Hive), follow the steps below. Solution. Step 1: Sample CSV File. Create a sample CSV file named sample_1.csv. You can download …