Requirement: In this post, we will look at what __HIVE_DEFAULT_PARTITION__ is in Hive and why it gets created. Components Involved: HDFS, Hive. Sample Data: We will use the sample data below for this task. It contains employee details. Some employees are members of the company’s sports ...

Requirement: Suppose we have a partitioned Hive table that is partitioned by year of joining. Our requirement is to drop multiple partitions from this table. Components Involved: Hive, HDFS. Sample Data: Let’s say we have the following sample data. Here, one record belongs to one partition, as we ...

Requirement: Suppose we have a text-format data file that contains employees’ basic details. When we load this file in Spark, it returns an RDD. Our requirement is to find the number of partitions created just after loading the data file and to see which records are stored ...