May 2017

Requirement: Assume you have an XML file that another application has transferred to your local system. The file contains customer data, which needs to be processed using Pig. The challenge is that the file is not a simple text or CSV file; it is the Read More →

Requirement: Assume that you want to load a file of pipe-separated (|) values in Pig and store the output delimited by a comma (‘,’). Solution: Follow the steps below. Step 1: Sample file. Create a sample file named sample_1.txt. If you have any sample data with you, Read More →

Requirement: Assume that you want to load a TSV (tab-separated values) file in Pig and store the output delimited by a pipe (‘|’). Solution: Follow the steps below. Step 1: Sample TSV file. Create a sample TSV file named sample_1.tsv. If you have any sample data with you, then Read More →

Requirement: Suppose the source data is in a text file. The requirement is to load the text file into a Hive table using Spark and then read the data back from the Hive table using Spark. Let’s break the task into sub-tasks: Read More →