In this post, we will walk through the Databricks Notebook. This is where you write and run code in Databricks.
Overview
The Databricks Notebook is a document that keeps commands and visualizations in cells. You can put your entire code in a single cell, or split it so that each cell holds just a few lines, or even a single line.
Create Notebook
When you create a notebook, you need to select:
Name: Any name for the notebook
Default Language: Notebooks support Python, Scala, SQL, and R. You can create a notebook in any of them, and later switch the language for the entire notebook or for a particular cell.
Cluster: Select an existing cluster
To switch the language for the entire notebook, click the language shown beside the notebook name and choose another one. To run a single cell in a different language, put one of these magic commands on the first line of the cell:
%python – Runs the cell as Python
%scala – Runs the cell as Scala
%sql – Runs the cell as SQL
%r – Runs the cell as R
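The rule above can be sketched in plain Python. This is only an illustration of how the first line of a cell decides its language, not Databricks' own implementation. The function name `cell_language` and the `MAGIC_LANGUAGES` table are hypothetical names made up for this sketch, and the `%r` entry reflects the R support mentioned earlier.

```python
# Hypothetical helper, for illustration only: it mimics the rule that a
# language magic on the first line of a cell overrides the notebook default.
MAGIC_LANGUAGES = {
    "%python": "python",
    "%scala": "scala",
    "%sql": "sql",
    "%r": "r",
}


def cell_language(cell_source, default_language):
    """Return the language a cell would run in."""
    lines = cell_source.splitlines()
    # Only the first line of the cell is checked for a magic command.
    first_words = lines[0].split() if lines else []
    if first_words and first_words[0] in MAGIC_LANGUAGES:
        return MAGIC_LANGUAGES[first_words[0]]
    # No magic command: the cell uses the notebook's default language.
    return default_language


print(cell_language("%sql\nSELECT 1", "python"))   # sql
print(cell_language("print('hello')", "python"))   # python
```

So in a notebook whose default language is Python, a cell starting with %sql runs as SQL, while a cell without a magic command runs as Python.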
Tools in Notebook
Once you create a notebook, you will see the following tools:
Cluster: The cluster the notebook is attached to.
File: File-level options (New Notebook, Clone, Rename, Move, Move to Trash, Upload Data, Export, Clear Revision History, Change Default Language).
Edit: Cell-level options. The best part is the SQL Formatter.
View: Choose what you want to see in the notebook. It also includes a setting to switch between the Light and Dark themes.
Permission: Set the permissions on the notebook.
Run All: Runs all the cells in the notebook.
Clear: Clears the results and the state. This is very useful when you run into issues with the current notebook session.
Schedule: Create a job that runs the notebook on a schedule.
Comment: Add comments at the cell level. This can be used for peer review.
Experiment: Track ML model training runs.
Revision History: Keeps versions of the notebook. You can open any older version to see what changed.
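The Export option under File can save a notebook as a plain source file. As a rough sketch, assuming a Python notebook exported in that source format (cells separated by `# COMMAND ----------` lines, and cells in another language prefixed with `# MAGIC`), the cells can be split back out with a few lines of Python. The sample notebook text and the `split_cells` helper are made up for illustration.

```python
# Sample text in the style of a Python notebook exported as a source file.
# The content is invented for this example.
EXPORTED = """# Databricks notebook source
print("hello from Python")

# COMMAND ----------

# MAGIC %sql
# MAGIC SELECT 1 AS one
"""

CELL_SEPARATOR = "# COMMAND ----------"
MAGIC_PREFIX = "# MAGIC "


def split_cells(source):
    """Split exported notebook text into a list of cell bodies."""
    lines = source.splitlines()
    # Drop the header line that marks the file as a notebook.
    if lines and lines[0].startswith("# Databricks notebook source"):
        lines = lines[1:]
    cells, current = [], []
    for line in lines:
        if line.strip() == CELL_SEPARATOR:
            # A separator line closes the current cell.
            cells.append("\n".join(current).strip())
            current = []
        else:
            # Remove the prefix used for cells in a non-default language.
            if line.startswith(MAGIC_PREFIX):
                line = line[len(MAGIC_PREFIX):]
            current.append(line)
    cells.append("\n".join(current).strip())
    return cells


for number, cell in enumerate(split_cells(EXPORTED), start=1):
    print(f"--- cell {number} ---")
    print(cell)
```

Here the sample splits into two cells: a Python cell and a cell that starts with the %sql magic command.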
Wrapping Up
In this post, we have learned about the Databricks Notebook. If you are new to Databricks, it is good to understand the notebook before jumping straight into development.