Pig

Requirement: You have the marks of all the students of a class, keyed by roll number, in a CSV file. You need to calculate each student's percentage of marks using Pig. Given: Download the sample CSV file, which has 7 columns. The 1st column is the roll number and the other 6 columns... Read More →
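
As a rough illustration of this requirement, here is a minimal Pig Latin sketch. It rests on several assumptions that the excerpt does not confirm: the file is called `marks.csv`, it has no header row, the six non-roll-number columns are integer marks, and each subject is out of 100. The field names `m1` to `m6` are hypothetical.

```pig
-- Assumption: marks.csv has no header row and holds roll_no plus six integer marks.
marks = LOAD 'marks.csv' USING PigStorage(',')
        AS (roll_no:int, m1:int, m2:int, m3:int, m4:int, m5:int, m6:int);

-- Assumption: every subject is out of 100, so the maximum total is 600.
-- Multiplying by 100.0 forces floating-point arithmetic.
percentages = FOREACH marks GENERATE
        roll_no,
        (m1 + m2 + m3 + m4 + m5 + m6) * 100.0 / 600 AS percentage;

DUMP percentages;
```

If the real file has a header row, its fields will fail the `int` cast and come through as nulls, so that row would need to be filtered out first.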

Problem 1: Write a Pig script to calculate the sum of profits earned by selling a particular product. Below are the queries to create the data in a Hive table. CREATE SCHEMA IF NOT EXISTS bdp; CREATE TABLE bdp.profits (product_id INT, profit BIGINT); INSERT INTO TABLE bdp.profits VALUES ('123','1365'),('124','3253'),('125','91522'), ('123','51842'),('127','19616'),('128','2433'), ('127','182652'),('130','21632'),('122','21632'), ('127','21632'),('135','21632'),('123','21632'),('135','3282'); Solution... Read More →
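
One possible approach to this problem, sketched under assumptions: Pig is started with HCatalog support (`pig -useHCatalog`) and the Hive release ships the loader as `org.apache.hive.hcatalog.pig.HCatLoader` (older releases used the `org.apache.hcatalog.pig` package instead). The relation names are hypothetical, and this is not necessarily the solution the original post gives.

```pig
-- Start Pig with: pig -useHCatalog
-- Read the Hive table; the schema (product_id, profit) comes from the metastore.
profits = LOAD 'bdp.profits' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- Collect all rows that share a product_id.
by_product = GROUP profits BY product_id;

-- Sum the profit column within each group.
totals = FOREACH by_product GENERATE
        group AS product_id,
        SUM(profits.profit) AS total_profit;

DUMP totals;
```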

Requirement: Assume you have an XML file that was transferred to your local system by some other application. The file contains customer data, and you need to process this data using Pig. The challenge is that the file is not a simple text or CSV file; it is the... Read More →

Requirement: Assume that you want to load a file of pipe-separated (|) values in Pig and store the output delimited by a comma (','). Solution: Follow the steps below. Step 1: Sample file. Create a sample file named sample_1.txt. If you have any sample data with you,... Read More →
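
The core of this conversion can be sketched in two statements. The input name `sample_1.txt` comes from the excerpt; the output directory `output_comma` is a hypothetical name, and the sketch assumes no field value itself contains a comma.

```pig
-- Split each input line on the pipe character.
data = LOAD 'sample_1.txt' USING PigStorage('|');

-- Write the same fields back out, joined by commas.
-- Assumption: 'output_comma' does not exist yet; Pig refuses to overwrite it.
STORE data INTO 'output_comma' USING PigStorage(',');
```

The same pattern should cover other delimiter conversions: only the argument to `PigStorage` changes, for example `PigStorage('\t')` to read tab-separated input.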

Requirement: Assume that you want to load a TSV (tab-separated values) file in Pig and store the output delimited by a pipe ('|'). Solution: Follow the steps below. Step 1: Sample TSV file. Create a sample TSV file named sample_1.tsv. If you have any sample data with you, then... Read More →

Requirement: The source data contains each user's Id and mobile connection type. There are four possible connection types: "POSTP, PREP, CLS, PEND". You need the Id of only those users whose connection type is POSTP, PREP, blank, or null. If the blank is... Read More →
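
A minimal sketch of such a filter, assuming a comma-delimited file named `users.csv` with two columns; the file name and the field names `id` and `conn_type` are hypothetical, since the excerpt does not give them.

```pig
-- Assumption: two comma-separated columns, user Id first, connection type second.
users = LOAD 'users.csv' USING PigStorage(',')
        AS (id:chararray, conn_type:chararray);

-- Keep POSTP and PREP rows, plus rows whose type is null or only whitespace.
-- The IS NULL test comes first because comparing null with == yields null.
wanted = FILTER users BY conn_type IS NULL
        OR TRIM(conn_type) == ''
        OR conn_type == 'POSTP'
        OR conn_type == 'PREP';

ids = FOREACH wanted GENERATE id;

DUMP ids;
```

Note that `PigStorage` generally loads an empty field as null, so the explicit blank check mostly matters for values that contain only spaces.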

Requirement: You have two tables named A and B, and you want to perform all types of join in Pig. This will help you understand how joins work in Pig. Solution: Step 1: Input files. Download the two input files from here and place them in a local directory. File A... Read More →
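
The main join types can be sketched as below. The file names `A.txt` and `B.txt`, the comma delimiter, and the two-column schemas are all hypothetical, because the excerpt does not show the actual input files.

```pig
-- Assumption: both files are comma-delimited with an integer key in column 1.
A = LOAD 'A.txt' USING PigStorage(',') AS (id:int, a_val:chararray);
B = LOAD 'B.txt' USING PigStorage(',') AS (id:int, b_val:chararray);

-- Inner join: only ids present in both A and B.
inner_join = JOIN A BY id, B BY id;

-- Left outer join: every row of A, with nulls where B has no match.
left_join = JOIN A BY id LEFT OUTER, B BY id;

-- Right outer join: every row of B, with nulls where A has no match.
right_join = JOIN A BY id RIGHT OUTER, B BY id;

-- Full outer join: every row of both, with nulls on whichever side is missing.
full_join = JOIN A BY id FULL OUTER, B BY id;

DUMP inner_join;
DUMP left_join;
DUMP right_join;
DUMP full_join;
```

Outer joins need a declared schema on the side that may be padded with nulls, which is why both `LOAD` statements carry an `AS` clause.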

Requirement: Assume that you want to load a CSV file in Pig and store the output delimited by a pipe ('|'). Solution: Please follow the steps below. Step 1: Sample CSV file. Create a sample CSV file named sample_1.csv. If you have any sample data with you, then put the... Read More →