If you have a comma-separated file and you want to create a Hive table on top of it, you need to load the CSV file into Hive. The steps below walk through the process.
Step 1: Sample CSV File
Create a sample CSV file named sample_1.csv.
(You can skip this step if you already have a CSV file; just place it in a local directory.)
Put comma-delimited (,) content in the file. If you created the file on Windows, transfer it to your Linux machine via WinSCP.
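For illustration, assume sample_1.csv holds two comma-separated columns matching the (id, Code) schema used in Step 3; the values below are hypothetical:

```csv
1,A
2,B
3,C
4,D
```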
I have a local directory named input_files, so I have placed the sample_1.csv file in that directory. You can view the file's content using the command below:
- cat /root/bigdataprogrammers/input_files/sample_1.csv
It will print the file's contents.
Step 2: Copy CSV to HDFS
Run Below commands in the shell for initial setup.
First, create an HDFS directory ld_csv_hv with an ip directory inside it. The -p flag creates parent directories as needed:
- hadoop fs -mkdir -p bdp/ld_csv_hv/ip
Put the file into HDFS using the command below:
- hadoop fs -put /root/bigdataprogrammers/input_files/sample_1.csv bdp/ld_csv_hv/ip/
Check whether the file is available in HDFS using the command below:
- hadoop fs -ls bdp/ld_csv_hv/ip/
NOTE: For me, the default HDFS directory is /user/root/.
Step 3: Create Hive Table and Load data
Now that the file is in HDFS, you just need to create an external table on top of it. Use the Hive script below to create an external table hv_csv_table in schema bdp. Run it in the Hive CLI.
CREATE SCHEMA IF NOT EXISTS bdp;

CREATE EXTERNAL TABLE IF NOT EXISTS bdp.hv_csv_table
(id STRING, Code STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'hdfs://sandbox.hortonworks.com:8020/user/root/bdp/ld_csv_hv/ip';
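If the table creates successfully but the query in the next step returns no rows, it is worth confirming that the table's location and field delimiter are what you expect. Hive's DESCRIBE FORMATTED statement shows both:

```sql
DESCRIBE FORMATTED bdp.hv_csv_table;
```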
Step 4: Verify data
Check whether the CSV data shows up in the table using the command below:
- select * from bdp.hv_csv_table;
CSV is one of the most widely used file formats. It stores data as comma-separated values, which is why we specified a comma in the FIELDS TERMINATED BY option when creating the Hive table.
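The comma splitting that Hive performs with FIELDS TERMINATED BY ',' can be reproduced with any CSV parser. Here is a minimal Python sketch using hypothetical rows shaped like the sample file:

```python
import csv
import io

# Hypothetical content matching the two-column (id, Code) layout used above.
data = "1,A\n2,B\n3,C\n"

# csv.reader splits each line on the comma delimiter, just as Hive does
# when reading the external table's text files.
rows = list(csv.reader(io.StringIO(data)))
print(rows)  # [['1', 'A'], ['2', 'B'], ['3', 'C']]
```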