Requirement: In the last post, we demonstrated how to load JSON data into a Hive non-partitioned table. This time, using the same sample JSON data, the requirement is to load the JSON data into a Hive partitioned table using Spark. The Hive table will be partitioned by some column(s). The below task…

Requirement: You have a Hive script that expects some variables to be passed from a shell script. Say the name of the Hive script is daily_audit.hql. It expects three variables: schema, tablename, and total_emp. Solution: Step 1: Let's look at the content of daily_audit.hql…
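
As a rough illustration of the requirement above, the sketch below builds the command line that a wrapper could use to run daily_audit.hql with its three variables. It assumes the Hive CLI and its `--hivevar name=value` option; the full post may use a different mechanism, and the values shown (`hr_db`, `employee`, `100`) are hypothetical.

```python
import shlex


def build_hive_command(script, variables):
    """Return the argv list that runs a Hive script with --hivevar substitutions."""
    cmd = ["hive"]
    for name, value in variables.items():
        # Each variable becomes "--hivevar name=value" on the command line.
        cmd += ["--hivevar", f"{name}={value}"]
    cmd += ["-f", script]
    return cmd


# Hypothetical values for the three variables the script expects.
cmd = build_hive_command(
    "daily_audit.hql",
    {"schema": "hr_db", "tablename": "employee", "total_emp": 100},
)

# Print a shell-safe version of the command; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Inside the script, variables passed this way are typically referenced as `${hivevar:schema}`, `${hivevar:tablename}`, and `${hivevar:total_emp}`.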

Requirement: We generally receive data from different sources, which usually use different date formats. When we create a Hive table on top of this data, it becomes necessary to convert the dates into a format that Hive supports. Hive supports the yyyy-MM-dd date format. So output format of…
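
To make the target format concrete, here is a minimal sketch that normalizes date strings to yyyy-MM-dd before they are loaded. It is a Python preprocessing step, not the Hive-side conversion the post goes on to describe, and the list of source formats is an assumption chosen for illustration.

```python
from datetime import datetime

# Python's spelling of Hive's yyyy-MM-dd pattern.
HIVE_DATE_FORMAT = "%Y-%m-%d"

# Hypothetical source formats; replace these with the formats your feeds use.
SOURCE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y", "%Y%m%d"]


def to_hive_date(raw):
    """Return the date as yyyy-MM-dd, or None if no known format matches."""
    for fmt in SOURCE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime(HIVE_DATE_FORMAT)
        except ValueError:
            continue  # Try the next candidate format.
    return None


for value in ["25/12/2017", "12-25-2017", "20171225", "not a date"]:
    print(value, "->", to_hive_date(value))
```

Returning None for unparseable values mirrors how a failed conversion would usually surface as NULL in a Hive column, which makes bad records easy to filter.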

Requirement: You have a Hive table named infostore, which is present in bdp […] more application is connected to your application, but it is not allowed to take the data from the Hive table due to security reasons. It is required to send the data of the infostore table into…

Requirement: Suppose the source data is in JSON format. The requirement is to load the JSON data into a Hive non-partitioned table using Spark. Let's break the requirement into two tasks: load the JSON data into a Spark DataFrame and read it, then store it in a Hive non-partitioned table. Components…

Requirement: Suppose there is source data that needs to be stored in a Hive partitioned table. The requirement is to store the data in the Hive table using static and dynamic partitioning. With an understanding of partitioning in Hive, we will see where to use the static and…
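
The difference between the two partitioning styles can be sketched by generating the HiveQL for each. This is only an illustration under assumptions: the table names (`emp_part`, `emp_stage`), the columns, and the partition column `country` are hypothetical, and the post's own tables may differ.

```python
def static_partition_insert(target, source, columns, partition_col, partition_value):
    """HiveQL for a static partition insert: the partition value is fixed in the statement."""
    cols = ", ".join(columns)
    return (
        f"INSERT INTO TABLE {target} PARTITION ({partition_col}='{partition_value}')\n"
        f"SELECT {cols} FROM {source} WHERE {partition_col}='{partition_value}';"
    )


def dynamic_partition_insert(target, source, columns, partition_col):
    """HiveQL for a dynamic partition insert: Hive derives the partition from each row."""
    # The partition column must come last in the SELECT list.
    cols = ", ".join(list(columns) + [partition_col])
    return (
        "SET hive.exec.dynamic.partition=true;\n"
        "SET hive.exec.dynamic.partition.mode=nonstrict;\n"
        f"INSERT INTO TABLE {target} PARTITION ({partition_col})\n"
        f"SELECT {cols} FROM {source};"
    )


# Hypothetical table and column names.
print(static_partition_insert("emp_part", "emp_stage", ["id", "name"], "country", "US"))
print()
print(dynamic_partition_insert("emp_part", "emp_stage", ["id", "name"], "country"))
```

In the static form the statement names one partition explicitly, so it suits loads where the partition is known in advance. In the dynamic form one statement can populate many partitions at once.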

Requirement: Suppose you have an XML-formatted data file, and this source file contains some empty tags. The requirement is to parse the XML data in Hive and read it while handling the tags that are empty in the source data. Components involved: Hive, Maven, Java. Solution: There are many solutions…

Requirement: You have a Hive table with one column, and you want to split this column into multiple columns and store the results in another Hive table. Solution: Assume the name of the Hive table is “transact_tbl” and it has one column named “connections”, and the values in the connections column are…

Requirement: You have a CSV file at an HDFS location, and you want to create a Hive layer on top of this data. However, the CSV file has two header lines at the top, and you don't want them to appear in your Hive table, so let's solve…
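
One common way to meet this requirement is Hive's `skip.header.line.count` table property. The sketch below generates a table definition that uses it; this is an assumption about the approach, not necessarily the one the post takes, and the table name, columns, and HDFS location are hypothetical.

```python
def external_csv_table_ddl(table, columns, location, header_lines):
    """Build HiveQL for an external CSV table that skips leading header lines."""
    col_defs = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n  {col_defs}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'\n"
        # Hive ignores this many lines at the top of each file when reading.
        f"TBLPROPERTIES ('skip.header.line.count'='{header_lines}');"
    )


# Hypothetical table, columns, and location; 2 matches the two header lines described above.
print(external_csv_table_ddl(
    "sales_csv",
    [("id", "INT"), ("amount", "DOUBLE")],
    "/data/sales",
    2,
))
```

Because the property is applied at read time, the CSV file itself stays unchanged on HDFS.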