Import data from MySQL to HBase using Sqoop

Requirement

Suppose, we have data in a table EMPLOYEE in MySQL database. We want to ingest same data in NoSQL HBase table. So, the requirement is to import data from MySQL to HBase using Sqoop.

Components Involved

Following components are involved in this requirement

  • MySQL – Source for the Ingestion.
  • HBase – Target for the Ingestion.
  • Sqoop – Using to import data from MySQL to HBase.

Sample Data

Below are some sample records for this requirement:

sample-data
 
  1. id,first_name,last_name,gender,designation,city,country
  2. 1,Thayne,Mocher,Male,Administrative Officer,Port Colborne,Canada
  3. 2,Shelly,Bulfoot,Female,Analyst Programmer,Bacong,Philippines
  4. 3,Hercule,Chorlton,Male,Operator,Al Mazzūnah,Tunisia
  5. 4,Thorstein,Epton,Male,Administrative Officer,Tayirove,Ukraine
  6. 5,Madelena,Savin,Female,Help Desk Technician,Jinjiang,China
  7. 6,Adeline,Imesson,Female,Legal Assistant,Fort Beaufort,South Africa
  8. 7,Ambros,Richards,Male,Chemical Engineer,Dubiecko,Poland
  9. 8,Lilas,Harrowing,Female,Assistant Media Planner,Guayatá,Colombia
  10. 9,Freida,Leivers,Female,Legal Assistant,Bangus Kulon,Indonesia
  11. 10,Celie,Dolligon,Female,Data Coordiator,Paraty,Brazil
  12. 11,Berkley,Orteaux,Male,Assistant Professor,Zinder,Niger
  13. 12,Gilburt,Minot,Male,Marketing Assistant,Hanyuan,China
  14. 13,Blaine,Treverton,Male,Research Associate,Yuankeng,China
  15. 14,Benjamen,Dodd,Male,Assistant Professor,Beberon,Philippines
  16. 15,Nikos,Worpole,Male,Human Resources Assistant II,Esmeralda,Cuba

Solutions

We have already done data preparation in MySQL table in this post. You can look into it for MySQL table data. Let’s start with creating the Sqoop job.

Step 1: Create Sqoop Job

We have to create a Sqoop job which will include all the MySQL configuration, Source table name,  target table name etc. Find below sqoop job for ingesting data from MySQL to HBase.

sqoop-job
 
  1. sqoop import \
  2.       --connect jdbc:mysql://localhost:3306/db_bdp \
  3.       --driver com.mysql.jdbc.Driver \
  4.       --username root \
  5.       --table employee \
  6.       --hbase-create-table \
  7.       --hbase-table employee_details \
  8.       --column-family employees \
  9.       --hbase-row-key id -m 1

hbase-create-table – it will create HBase table if not exist

hbase-table – take table name in which data will get store

column-family – it takes column family value. We have provided value employees.

hbase-row-key – Use MySQL column name for HBase row key

Step 2: Job Execution

In this step, we will execute the sqoop job which we have created in the previous step. We have mentioned the number of mapper to 1.

Step 3: Data Validation

In this step, we can validate the records between MySQL and HBase. To make simple validation, checking an only number of records. In MySQL,  the table EMPLOYEE has 1000 records and same has been ingested into HBase table.

Wrapping Up

In this post, we have gone through a requirement to ingest data from MySQL to HBase table. We created the Sqoop job which took multiple arguments. We understood all the arguments individually.

Load CSV file into hive AVRO table

Requirement You have comma separated(CSV) file and you want to create Avro table in hive on top of it, then ...
Read More

Load CSV file into hive PARQUET table

Requirement You have comma separated(CSV) file and you want to create Parquet table in hive on top of it, then ...
Read More

Hive Most Asked Interview Questions With Answers – Part II

What is bucketing and what is the use of it? Answer: Bucket is an optimisation technique which is used to ...
Read More
/ hive, hive interview, interview-qa

Spark Interview Questions Part-1

Suppose you have a spark dataframe which contains millions of records. You need to perform multiple actions on it. How ...
Read More

Leave a Reply