Requirement: In this post, we import data from an RDBMS into Hadoop. MySQL is the source database, and Sqoop performs the import. Components Involved: MySQL – for source data; HDFS – to store the source data in Hadoop; Sqoop – Read More →

Requirement: In real-world scenarios, data files contain many records, and there may be many such files. In that case, it helps to choose a suitable approach for computing the result. Here, we want the total number of records across the data files. So the requirement is to how Read More →
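
The excerpt above is cut off before it names a method, so the following is only a minimal sketch of one way to count records across several files, written as a map step and a reduce step in plain Python. The file names, the sample rows, and the helper names (`map_record`, `reduce_counts`, `count_records`) are hypothetical, and treating every non-blank line as one record is an assumption.

```python
from collections import defaultdict


def map_record(line):
    """Map step: emit ("records", 1) for each non-blank line (assumed: one line = one record)."""
    if line.strip():
        yield ("records", 1)


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_records(files):
    """Run the map step over every line of every file, then reduce to a single total."""
    pairs = (
        pair
        for lines in files.values()
        for line in lines
        for pair in map_record(line)
    )
    return reduce_counts(pairs).get("records", 0)


# Hypothetical in-memory stand-ins for data files.
sample_files = {
    "part-0.csv": ["1,John,Smith", "2,Asha,Rao"],
    "part-1.csv": ["3,Li,Wei", "", "4,Omar,Khan"],
}

total = count_records(sample_files)
print(total)
```

On a real cluster the same shape would be spread over many mappers with a single reducer summing the partial counts; this sketch only illustrates the logic on in-memory lists.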

Requirement: Suppose you receive data files containing users' basic information, such as first name, last name, designation, and city. These fields are separated by a ',' delimiter. The requirement is to find all duplicate values of any given field. So, here the requirement is Read More →
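
As a rough illustration of the requirement described above, and not necessarily the approach the full post takes, the sketch below counts how often each value of one field occurs and keeps the values seen more than once. The sample rows, the field order (first name, last name, designation, city), and the helper name `find_duplicates` are assumptions made for the example.

```python
from collections import Counter


def find_duplicates(lines, field_index, delimiter=","):
    """Return {value: count} for values of the chosen field that occur more than once."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.strip().split(delimiter)
        if field_index < len(fields):
            counts[fields[field_index].strip()] += 1
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical records: first name, last name, designation, city.
sample = [
    "John,Smith,Engineer,Pune",
    "Asha,Rao,Analyst,Delhi",
    "John,Khan,Manager,Pune",
    "Li,Wei,Engineer,Mumbai",
]

dup_first_names = find_duplicates(sample, 0)
dup_cities = find_duplicates(sample, 3)
print(dup_first_names, dup_cities)
```

Passing the field index as a parameter keeps the same function usable for any column, which matches the "any field" wording of the requirement.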

Requirement: Suppose you have a file full of content in which many words are repeated. The requirement is to get the distinct words from the file using MapReduce. Compared with SQL, we have to write a MapReduce program which is similar Read More →
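
The distinct-words idea can be sketched without a cluster: the map step emits each word as a key, and the reduce step keeps each key once. This is a plain-Python illustration of that pattern, not the post's actual MapReduce program; the helper names and sample lines are hypothetical, and lower-casing words and splitting on whitespace are assumptions.

```python
def map_words(line):
    """Map step: emit (word, None) for every whitespace-separated word, lower-cased."""
    for word in line.split():
        yield (word.lower(), None)


def reduce_distinct(pairs):
    """Reduce step: each key survives once, whatever number of times it was emitted."""
    return sorted({key for key, _ in pairs})


def distinct_words(lines):
    """Chain the map and reduce steps over all input lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    return reduce_distinct(pairs)


# Hypothetical file content.
sample_lines = [
    "hadoop stores data",
    "Hadoop processes data",
]

words = distinct_words(sample_lines)
print(words)
```

The value attached to each key is unused here, because grouping by key is what removes the repeats.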