Hive

Requirement: You have a Hive table named infostore in the bdp schema, and its data needs to be exported to an Excel file.
Solution: Let's say the output file should be placed under /root/local_bdp/posts/export-hive-data-into-file.
Step 1: Create the output directory: mkdir /root/local_bdp/posts/export-hive-data-into-file/output
Step 2: Go to the Hive CLI …

#execute a query in silent (-S) mode
hive -S -e "your hive query"
#execute a query
hive -e "your hive query"
#show current database
hive> set hive.cli.print.current.db=true;
#show all column names for a table
hive> set hive.cli.print.header=true;
#to run all prerequisite commands in one go
1. Create a .hiverc file …

Requirement: In this post, we are going to explore windowing functions in Hive. The post covers LEAD, LAG, FIRST_VALUE, LAST_VALUE, and MIN/MAX/COUNT/AVG, along with the OVER clause.
Component Involved: Hive
Sample Data:
ID | FIRST_NAME | LAST_NAME | DESIGNATION | DEPARTMENT | SALARY
1001 | Jervis | Roll | Director of Sales | Sales | 30000
1002 | Gordon | Mattster | Marketing Manager | Sales …
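To make the behaviour of these functions concrete, here is a minimal sketch. It is not taken from the post: it runs the query with Python's built-in sqlite3 module (SQLite 3.25 or later) so it can be tried without a Hive cluster, and it assumes these functions and the OVER clause follow the same standard SQL semantics in Hive. The employee table, its three columns, and the rows are hypothetical, not the post's sample data.

```python
import sqlite3

# Hypothetical rows (id, department, salary); not the post's sample data.
ROWS = [
    (1, "Sales", 30000),
    (2, "Sales", 25000),
    (3, "Sales", 20000),
    (4, "IT", 40000),
    (5, "IT", 35000),
]

# Each function looks at the rows of the same department, ordered by salary (highest first).
WINDOW_QUERY = """
SELECT id, department, salary,
       LEAD(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS next_salary,
       LAG(salary)  OVER (PARTITION BY department ORDER BY salary DESC) AS prev_salary,
       FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS first_salary,
       -- LAST_VALUE needs an explicit frame; the default frame stops at the current row.
       LAST_VALUE(salary) OVER (
           PARTITION BY department ORDER BY salary DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS last_salary,
       MAX(salary) OVER (PARTITION BY department) AS dept_max
FROM employee
ORDER BY department, salary DESC
"""


def run_window_demo():
    """Load the hypothetical rows into an in-memory table and run the window query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE employee (id INTEGER, department TEXT, salary INTEGER)"
        )
        conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", ROWS)
        return conn.execute(WINDOW_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_window_demo():
        print(row)
```

LEAD and LAG return NULL (None in Python) at the edge of each partition, which is why the highest-paid row in each department has no prev_salary and the lowest-paid row has no next_salary.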

Requirement: In this post, we are going to explore analytics functions in Hive. The following analytics functions are available in Hive: ROW_NUMBER, RANK, DENSE_RANK, CUME_DIST, PERCENT_RANK, and NTILE.
Component Involved: Hive
Sample Data:
ID | FIRST_NAME | LAST_NAME | DESIGNATION | DEPARTMENT | SALARY
1001 | Jervis | Roll | Director of Sales | Sales | 30000
1002 | Gordon …
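The differences between these functions show up most clearly when values tie. The sketch below is an illustration only, not the post's own example: it uses Python's built-in sqlite3 module (SQLite 3.25 or later) in place of Hive, on the assumption that the six functions follow the same standard SQL semantics there. The employee table and its four salary values are hypothetical.

```python
import sqlite3

# Hypothetical salaries with one tie (200 appears twice); not the post's sample data.
SALARIES = [(300,), (200,), (200,), (100,)]

# All six functions share one named window so they see the same row order.
ANALYTICS_QUERY = """
SELECT salary,
       ROW_NUMBER()   OVER w AS row_num,
       RANK()         OVER w AS rnk,
       DENSE_RANK()   OVER w AS dense_rnk,
       CUME_DIST()    OVER w AS cume,
       PERCENT_RANK() OVER w AS pct_rank,
       NTILE(2)       OVER w AS half
FROM employee
WINDOW w AS (ORDER BY salary DESC)
"""


def run_analytics_demo():
    """Run the analytics query on an in-memory table; rows come back sorted by row_num."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE employee (salary INTEGER)")
        conn.executemany("INSERT INTO employee VALUES (?)", SALARIES)
        rows = conn.execute(ANALYTICS_QUERY).fetchall()
    finally:
        conn.close()
    return sorted(rows, key=lambda r: r[1])


if __name__ == "__main__":
    for row in run_analytics_demo():
        print(row)
```

With the tie at 200, ROW_NUMBER keeps counting (2, 3), RANK repeats and then skips (2, 2, 4), and DENSE_RANK repeats without a gap (2, 2, 3). CUME_DIST and PERCENT_RANK express the same positions as fractions, and NTILE(2) splits the four rows into two groups of two.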

Requirement: In this post, we will go through the concept of bucketing in Hive. The post covers the following points: What is bucketing in Hive? How do you load data into a bucketed table? Why is it important?
Components Involved: Hive, HDFS
Sample Data: We will use the …

Requirement: The number of columns in a Hive table is not fixed. Sometimes a table has many columns and sometimes only a few. If we want the values of all the columns from the table, then there is no challenge, as …

Requirement: In this post, we are going to understand what the default partition (__HIVE_DEFAULT_PARTITION__) is in Hive and why it gets created.
Components Involved: HDFS, Hive
Sample Data: We will use the sample data below for the task. It contains employee details. Some employees are members of the company's sports …

Requirement: Suppose we have a partitioned Hive table, partitioned by the year of joining. Our requirement is to drop multiple partitions in Hive.
Components Involved: Hive, HDFS
Sample Data: Let's say we have the given sample data. Here, 1 record belongs to 1 partition, as we …

Requirement: There are two files that contain employees' basic information. One file stores the details of employees who joined in 2012, and the other is for employees who joined in 2013. Now, we want to load the files into a Hive partitioned table which is partitioned …

Requirement: Suppose we have some data in a Hive table. The table contains each company's quarterly profit. The requirement is to find the maximum profit of each company across all quarters.
Sample Data: Each record has 5 columns: company name, quarter 1 as Q1, quarter 2 …