Requirement
In our previous post, we learned about the temporary view in Databricks. In this post, we are going to learn about the global view (global temporary view) in Databricks, or more generally in Spark. We will also see how to create a global view and how to access it.
Solution
For this exercise, we are going to use the below sample dataset:
empno | ename | designation | manager | hire_date | sal | deptno | location |
9369 | SMITH | CLERK | 7902 | 12/17/1980 | 800 | 20 | BANGALORE |
9499 | ALLEN | SALESMAN | 7698 | 2/20/1981 | 1600 | 30 | HYDERABAD |
9521 | WARD | SALESMAN | 7698 | 2/22/1981 | 1250 | 30 | PUNE |
9566 | TURNER | MANAGER | 7839 | 4/2/1981 | 2975 | 20 | MUMBAI |
9654 | MARTIN | SALESMAN | 7698 | 9/28/1981 | 1250 | 30 | CHENNAI |
9369 | SMITH | CLERK | 7902 | 12/17/1980 | 800 | 20 | KOLKATA |
Step 1: Load Data into DataFrame
First of all, we have to read the data from the CSV file. Here is the code for the same:
%scala
val file_location = "/FileStore/tables/emp_data1-3.csv"
val df = spark.read.format("csv")
  .option("inferSchema", "true")
  .option("header", "true")
  .option("sep", ",")
  .load(file_location)
display(df)
Here, we have loaded the data into the dataframe. Now, we can create a global view to refer to this data.
Step 2: Create Global View in Databricks
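Whenever we create a global view, it is registered in the system database global_temp and is tied to the Spark application rather than to a single session. Hence it is accessible within the notebook that created it as well as from other notebooks attached to the same cluster. It is not persisted: the view disappears when the Spark application (the cluster) terminates. You can create a global view using the below command: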
df.createOrReplaceGlobalTempView("df_globalview")
The function createOrReplaceGlobalTempView is used to create the global view; if a global view with the same name already exists, it is replaced. Here, we have created the global view named df_globalview.
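To confirm that the view was registered, one option is to list the contents of the global_temp database through the catalog API. This is a minimal sketch, assuming the notebook's predefined spark session and the df_globalview view created above:

```scala
// Assumes `spark` is the SparkSession provided by the notebook and that
// df.createOrReplaceGlobalTempView("df_globalview") has already been run.

// List what is registered under the global_temp database and keep only our view.
val registered = spark.catalog
  .listTables("global_temp")
  .filter("name = 'df_globalview'")

// Columns include name, database, tableType and isTemporary.
registered.show(truncate = false)
```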
Let’s try to access this global view in SQL using the below query:
%sql select * from df_globalview
This query throws an exception saying that the table or view was not found. The reason for this error is that we have not qualified the view name with the global view database. Now, access the global view using the global_temp database:
%sql select * from global_temp.df_globalview
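The same qualified name also works from Scala. The sketch below assumes the notebook's predefined spark session and the df_globalview view from Step 2. The filter on deptno and the selected columns are only illustrative, and deptno is assumed to have been inferred as a numeric column:

```scala
import org.apache.spark.sql.functions.col

// Read the global view back as a DataFrame; the global_temp prefix is required.
val emp = spark.table("global_temp.df_globalview")

// Illustrative query: employees of department 30 with their salary.
val dept30 = emp
  .filter(col("deptno") === 30)
  .select("ename", "designation", "sal")

dept30.show()
```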
If you run the same query in any other notebook attached to the same cluster, the global view will be accessible there as well.
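If a second notebook is not at hand, cross-session visibility can be approximated by opening a new session on the same Spark application. This is only a sketch under that assumption; otherSession is a name introduced here for illustration, and the view is assumed to exist already:

```scala
// A new session shares the underlying Spark application with `spark`
// but keeps its own set of session-scoped temporary views.
val otherSession = spark.newSession()

// The global view is still resolvable because it belongs to the application,
// not to the session that created it.
otherSession.sql("select * from global_temp.df_globalview").show()
```

A plain temporary view created in the original session would not be resolvable from otherSession, which is the practical difference between the two kinds of view.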
Wrapping Up
The global view is useful when the same data has to be shared across notebooks. We can use it to access the data from the source without copying the actual data, and use it in multiple notebooks attached to the same cluster.
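Since a global view lives until the Spark application ends, it can be worth dropping it explicitly once it is no longer needed. A minimal sketch, assuming the predefined spark session and the view name used in this post:

```scala
// Pass the bare view name (without the global_temp prefix).
// The call returns true if a view was dropped and false if none existed.
val dropped: Boolean = spark.catalog.dropGlobalTempView("df_globalview")

println(s"df_globalview dropped: $dropped")
```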