Requirement: When we ingest data from a source into the Hadoop data lake, we usually add some additional columns to the existing source data. These columns help us validate and analyze the data. So, in this post, we will walk through how we can add some additional columns with the… Read More →

Requirement: Suppose there is source data in JSON format. The requirement is to load this JSON data into a Hive non-partitioned table using Spark. Let's break the requirement into two tasks: load the JSON data into a Spark DataFrame and read it; store it in a Hive non-partitioned table. Components… Read More →