Requirement: You have two tables, A and B, and you want to perform every type of join in Hive. This will help you understand how joins work in Hive. Solution: Step 1: Input Files. Download the two input files from here and place them in a local directory. File …
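The join types mentioned above can be sketched in HiveQL as follows. The excerpt names the tables A and B but not their columns, so the columns `id`, `name`, and `city` are hypothetical and only serve the illustration.

```sql
-- Hypothetical schemas: the excerpt does not list the columns of A and B.
CREATE TABLE IF NOT EXISTS A (id INT, name STRING);
CREATE TABLE IF NOT EXISTS B (id INT, city STRING);

-- Inner join: only ids present in both tables.
SELECT a.id, a.name, b.city
FROM A a JOIN B b ON a.id = b.id;

-- Left outer join: every row of A; B columns are NULL when there is no match.
SELECT a.id, a.name, b.city
FROM A a LEFT OUTER JOIN B b ON a.id = b.id;

-- Right outer join: every row of B; A columns are NULL when there is no match.
SELECT b.id, a.name, b.city
FROM A a RIGHT OUTER JOIN B b ON a.id = b.id;

-- Full outer join: every row of both tables, matched where possible.
SELECT a.id AS a_id, b.id AS b_id, a.name, b.city
FROM A a FULL OUTER JOIN B b ON a.id = b.id;

-- Left semi join: rows of A that have a match in B (only A columns may be selected).
SELECT a.id, a.name
FROM A a LEFT SEMI JOIN B b ON a.id = b.id;
```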

# Execute a query in silent (-S) mode
    hive -S -e "your hive query"
# Execute a query
    hive -e "your hive query"
# Show the current database in the prompt
    hive> set hive.cli.print.current.db=true;
# Show column names as a header in query output
    hive> set hive.cli.print.header=true;
# To run all prerequisite commands …

Requirement: In this post, we explore the windowing functions in Hive: LEAD, LAG, FIRST_VALUE, LAST_VALUE, MIN/MAX/COUNT/AVG, and the OVER clause. Component involved: Hive. Sample data:
ID | FIRST_NAME | LAST_NAME | DESIGNATION | DEPARTMENT | SALARY
1001 | Jervis | Roll | Director of Sales | Sales | 30000
1002 | Gordon | Mattster | Marketing Manager | Sales | …
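A minimal sketch of these windowing functions, using the column names from the sample data. The table name `employee` and the column types are assumptions, since the excerpt does not show them.

```sql
-- Hypothetical table name and types; column names follow the sample data.
CREATE TABLE IF NOT EXISTS employee (
  id INT, first_name STRING, last_name STRING,
  designation STRING, department STRING, salary INT
);

SELECT id, first_name, department, salary,
       -- Salary of the next and previous row within the same department.
       LEAD(salary, 1) OVER (PARTITION BY department ORDER BY salary) AS next_salary,
       LAG(salary, 1)  OVER (PARTITION BY department ORDER BY salary) AS prev_salary,
       -- Lowest salary in the department (first row of the ordered window).
       FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary) AS lowest_salary,
       -- Highest salary; the explicit frame makes LAST_VALUE see the whole partition.
       LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary
                                ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS highest_salary,
       -- Aggregates used with an OVER clause.
       AVG(salary)  OVER (PARTITION BY department) AS avg_salary,
       COUNT(*)     OVER (PARTITION BY department) AS dept_headcount
FROM employee;
```

Without the explicit frame, LAST_VALUE with an ORDER BY only looks up to the current row, so it would usually return the current row's own salary.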

Requirement: In this post, we explore the analytics functions available in Hive: ROW_NUMBER, RANK, DENSE_RANK, CUME_DIST, PERCENT_RANK, and NTILE. Component involved: Hive. Sample data:
ID | FIRST_NAME | LAST_NAME | DESIGNATION | DEPARTMENT | SALARY
1001 | Jervis | Roll | Director of Sales | Sales | 30000
1002 | Gordon | …
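The six analytics functions can be compared side by side in one query. This is a sketch under assumptions: the table name `employee` and the column types are hypothetical, while the column names follow the sample data.

```sql
-- Hypothetical table name and types; column names follow the sample data.
CREATE TABLE IF NOT EXISTS employee (
  id INT, first_name STRING, last_name STRING,
  designation STRING, department STRING, salary INT
);

SELECT id, first_name, department, salary,
       -- Unique sequence number per row within the department.
       ROW_NUMBER()   OVER (PARTITION BY department ORDER BY salary DESC) AS row_num,
       -- Ties share a rank; RANK leaves gaps after ties, DENSE_RANK does not.
       RANK()         OVER (PARTITION BY department ORDER BY salary DESC) AS rnk,
       DENSE_RANK()   OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rnk,
       -- Fraction of rows at or before the current row in the ordering.
       CUME_DIST()    OVER (PARTITION BY department ORDER BY salary DESC) AS cume_dist,
       -- (rank - 1) / (rows in partition - 1).
       PERCENT_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS pct_rank,
       -- Splits each department's rows into 4 roughly equal groups.
       NTILE(4)       OVER (PARTITION BY department ORDER BY salary DESC) AS quartile
FROM employee;
```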

Requirement: In this post, we go through the concept of bucketing in Hive, covering the following points: What is a bucket in Hive? How do you load data into a bucketed table? Why is bucketing important? Components involved: Hive, HDFS. Sample data: We will use the given …
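As a rough illustration of loading a bucketed table, the sketch below assumes a hypothetical staging table `emp_staging` and a bucketed table `emp_bucketed`; the excerpt's actual sample data is not shown, so the columns and the bucket count of 4 are placeholders.

```sql
-- Hypothetical staging table holding the raw rows.
CREATE TABLE IF NOT EXISTS emp_staging (id INT, name STRING, department STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Bucketed table: rows are distributed into 4 files by a hash of id.
CREATE TABLE IF NOT EXISTS emp_bucketed (id INT, name STRING, department STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC;

-- On Hive 1.x, bucketing must be enforced first:
--   SET hive.enforce.bucketing = true;
-- Hive 2.x and later always enforce it, and the property no longer exists there.

-- Populate through INSERT ... SELECT so Hive hashes each row into its bucket.
INSERT OVERWRITE TABLE emp_bucketed
SELECT id, name, department FROM emp_staging;

-- Read a single bucket, for example for sampling.
SELECT * FROM emp_bucketed TABLESAMPLE (BUCKET 1 OUT OF 4 ON id);
```

LOAD DATA only moves files into place and does not hash rows, which is why the INSERT ... SELECT route is used here.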

Requirement: The number of columns in a Hive table is not fixed: some tables have many columns and others only a few. If we want the values of all the columns from the table, there is no challenge, as …

Requirement: In this post, we look at what the __HIVE_DEFAULT_PARTITION__ partition is in Hive and why it gets created. Components involved: HDFS, Hive. Sample data: We will use the sample data below, which holds employee details. Some employees are members of the company’s sports …
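One common way this partition appears is a dynamic-partition insert in which the partition column is NULL or empty for some rows. The sketch below assumes hypothetical tables `emp_staging` and `emp_by_sport` and a hypothetical partition column `sport`; the post's real schema is not visible in the excerpt.

```sql
-- Allow fully dynamic partitioning for this session.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Hypothetical source table; sport is NULL for employees outside any team.
CREATE TABLE IF NOT EXISTS emp_staging (id INT, name STRING, sport STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Hypothetical target table, partitioned by sport.
CREATE TABLE IF NOT EXISTS emp_by_sport (id INT, name STRING)
PARTITIONED BY (sport STRING);

-- The partition value is taken from the last column of the SELECT.
INSERT OVERWRITE TABLE emp_by_sport PARTITION (sport)
SELECT id, name, sport FROM emp_staging;

-- Rows whose sport is NULL or empty end up under
-- sport=__HIVE_DEFAULT_PARTITION__ (the default of hive.exec.default.partition.name).
SHOW PARTITIONS emp_by_sport;
```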

Requirement: Suppose we have a partitioned Hive table, partitioned by year of joining. Our requirement is to drop multiple partitions in Hive. Components involved: Hive, HDFS. Sample data: Let’s say we have the given sample data. Here, one record belongs to one partition, as we …
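Dropping several partitions in one statement might look like the sketch below. The table name `emp_part` and the partition column `joining_year` are hypothetical stand-ins for the post's table, which the excerpt does not name.

```sql
-- Hypothetical table partitioned by year of joining.
CREATE TABLE IF NOT EXISTS emp_part (id INT, name STRING)
PARTITIONED BY (joining_year INT);

-- Drop an explicit list of partitions in a single statement.
ALTER TABLE emp_part DROP IF EXISTS
  PARTITION (joining_year = 2012),
  PARTITION (joining_year = 2013);

-- Drop every partition matching a comparison instead of listing each one.
ALTER TABLE emp_part DROP IF EXISTS PARTITION (joining_year < 2012);

-- Confirm which partitions remain.
SHOW PARTITIONS emp_part;
```

For a managed table this also deletes the partition data; for an external table only the metadata is removed.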

Requirement: There are two files that contain employees’ basic information. One file stores the details of employees who joined in 2012, and the other is for employees who joined in 2013. Now, we want to load the files into a partitioned Hive table, which is partitioned …
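A minimal sketch of loading one file per partition, assuming comma-delimited files. The table name, columns, partition column `joining_year`, and the local paths are all hypothetical, since the excerpt is cut off before it shows them.

```sql
-- Hypothetical partitioned table; the partition column is not part of the file data.
CREATE TABLE IF NOT EXISTS emp_part (id INT, name STRING, department STRING)
PARTITIONED BY (joining_year INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Static partitioning: each file is assigned to the partition named in the statement.
-- The paths below are placeholders for wherever the two files were saved.
LOAD DATA LOCAL INPATH '/tmp/emp_2012.csv'
INTO TABLE emp_part PARTITION (joining_year = 2012);

LOAD DATA LOCAL INPATH '/tmp/emp_2013.csv'
INTO TABLE emp_part PARTITION (joining_year = 2013);

-- Partition pruning: only the 2013 directory is scanned.
SELECT * FROM emp_part WHERE joining_year = 2013;
```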

Requirement: Suppose we have some data in a Hive table. The table contains each company’s quarterly profit. The requirement is to find each company’s maximum profit across all quarters. Sample data: Each record has 5 columns: company name, quarter 1 as Q1, quarter 2 …
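One way to get a row-wise maximum across the quarter columns is Hive's built-in GREATEST function; this is a sketch and not necessarily the approach the post takes. The table name `company_profit` and the columns `q3` and `q4` are assumptions, because the excerpt is truncated after the second quarter.

```sql
-- Hypothetical table: one row per company, one column per quarter.
CREATE TABLE IF NOT EXISTS company_profit (
  company STRING, q1 DOUBLE, q2 DOUBLE, q3 DOUBLE, q4 DOUBLE
);

-- GREATEST compares the values within a single row,
-- unlike MAX, which aggregates one column across many rows.
SELECT company,
       GREATEST(q1, q2, q3, q4) AS max_profit
FROM company_profit;
```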

Requirement: Suppose we have data in a Hive table and want the same data in an HBase table. So, our requirement is to migrate the data from Hive to HBase. Components involved: Hive (source table), HBase (target table). Solution: We cannot load data directly into an HBase table from the …
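A common pattern for this migration is an HBase-backed Hive table created with the HBase storage handler and then filled with INSERT ... SELECT. The sketch below assumes the Hive HBase integration jars are available; the table names, the column family `cf`, and the columns are hypothetical.

```sql
-- Hypothetical source table that already holds the data in Hive.
CREATE TABLE IF NOT EXISTS emp_hive (id INT, name STRING, department STRING);

-- Hive table backed by HBase. ":key" maps id to the HBase row key;
-- the other columns go to qualifiers in the assumed column family "cf".
CREATE TABLE IF NOT EXISTS emp_hbase (id INT, name STRING, department STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:name,cf:department")
TBLPROPERTIES ("hbase.table.name" = "emp");

-- Copy the rows; each one is written to HBase through the storage handler.
INSERT OVERWRITE TABLE emp_hbase
SELECT id, name, department FROM emp_hive;
```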

Requirement: You have the marks of all the students in a class, with roll numbers, in a CSV file, and you need to calculate each student’s percentage in Hive. Given: Download the sample CSV file, which has 7 columns; the 1st column is the roll number and the other 6 columns are subject1 …
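The percentage calculation can be sketched as below. It assumes each of the 6 subjects is marked out of 100, which the excerpt does not state, so adjust the divisor to the real maximum. The table name and the column names after `subject1` are hypothetical.

```sql
-- Hypothetical table over the 7-column CSV: roll number plus 6 subject marks.
CREATE TABLE IF NOT EXISTS student_marks (
  roll_no INT,
  subject1 INT, subject2 INT, subject3 INT,
  subject4 INT, subject5 INT, subject6 INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Assumption: each subject is out of 100, so the maximum total is 600.
-- Multiplying by 100.0 keeps the result as a decimal value.
SELECT roll_no,
       (subject1 + subject2 + subject3 + subject4 + subject5 + subject6) * 100.0 / 600
         AS percentage
FROM student_marks;
```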

Requirement: You have a file delimited by multiple characters (%$), and you want to create a Hive table on top of it. Solution: Step 1: Sample File. Create a sample file named sample_1.txt. Download sample_1 from here. (You can skip this step if you …
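One option for a multi-character delimiter is the MultiDelimitSerDe from the hive-contrib module; this is a sketch and may differ from the post's own solution. The columns and the local path are hypothetical. The class name below is the hive-contrib one, and newer Hive releases also provide it as `org.apache.hadoop.hive.serde2.MultiDelimitSerDe`.

```sql
-- Depending on the installation, the hive-contrib jar may need to be added
-- to the session with ADD JAR before this SerDe can be used.

-- Hypothetical columns; the delimiter "%$" comes from the requirement.
CREATE TABLE IF NOT EXISTS sample_1 (id STRING, name STRING, city STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim" = "%$")
STORED AS TEXTFILE;

-- Placeholder path for wherever sample_1.txt was saved.
LOAD DATA LOCAL INPATH '/tmp/sample_1.txt' INTO TABLE sample_1;

SELECT * FROM sample_1;
```

The plain `FIELDS TERMINATED BY` clause only honours a single character, which is why a SerDe is needed for `%$`.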

Requirement: You have a table in Hive and need to process its data using Pig. To load data directly from a file we generally use PigStorage(), but to load data from a Hive table we need a different load function. Let’s go through it step by step. Solution: Step …

Requirement: Suppose the source data is in a text file. The requirement is to load the text file into a Hive table using Spark and then read the data back from the Hive table using Spark. Let’s break the task into sub-tasks: Load …