Deployment Steps
Step 1: From the Azure portal menu, select Create a resource. Then select Networking > Virtual network.
Step 2: Under Create virtual network, apply the following settings:
| Setting | Suggested value | Description |
| --- | --- | --- |
| Subscription | <Your subscription> | Select the Azure subscription that you want to use. |
| Resource group | databricks-project | Select Create New and enter a new resource group name for your account. |
| Name | databricks-vnet | Enter a name for your virtual network. |
| Region | <Select the region that is closest to your users> | Select a geographic location where you can host your virtual network. Use the location that's closest to your users. |
Select Next: IP Addresses > and apply the following settings. Then select Review + create.
| Setting | Suggested value | Description |
| --- | --- | --- |
| IPv4 address space | 10.26.1.0/24 | The virtual network's address range in CIDR notation. The CIDR range must be between /16 and /24. |
| Subnet name | default | Enter a name for the default subnet in your virtual network. |
| Subnet address range | 10.26.1.0/24 | The subnet's address range in CIDR notation. It must be contained by the address space of the virtual network. The address range of a subnet that is in use can't be edited. |
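The two constraints described above (an address space between /16 and /24, and a subnet contained in that address space) can be checked before you fill in the portal form. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `check_vnet_plan` is hypothetical and is not part of any Azure tooling.

```python
import ipaddress


def check_vnet_plan(vnet_cidr, subnet_cidr):
    """Return a list of problems found; an empty list means the plan passes."""
    problems = []
    vnet = ipaddress.ip_network(vnet_cidr, strict=True)
    subnet = ipaddress.ip_network(subnet_cidr, strict=True)

    # The walkthrough states the address space must be between /16 and /24.
    if not 16 <= vnet.prefixlen <= 24:
        problems.append(f"{vnet} is outside the /16 to /24 range")

    # The subnet must be contained by the virtual network's address space.
    if not subnet.subnet_of(vnet):
        problems.append(f"{subnet} is not contained in {vnet}")

    return problems


# Values taken from the settings in this walkthrough.
problems = check_vnet_plan("10.26.1.0/24", "10.26.1.0/24")
print(problems or "plan OK")
```

A subnet equal to the whole address space, as in this walkthrough, counts as contained.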
From the Azure portal menu, select Create a resource. Then select Analytics > Databricks.
Under Azure Databricks Service, apply the following settings:
| Setting | Suggested value | Description |
| --- | --- | --- |
| Workspace name | databricks-workspace | Enter a name for your Azure Databricks workspace. |
| Subscription | <Your subscription> | Select the Azure subscription that you want to use. |
| Resource group | databricks-project | Select the same resource group you used for the virtual network. |
| Location | <Select the region that is closest to your users> | Choose the same location as your virtual network. |
| Pricing Tier | | Choose between Standard or Premium. |
Once you've finished entering settings on the Basics page, select Next: Networking > and apply the following settings:
| Setting | Suggested value | Description |
| --- | --- | --- |
| Deploy Azure Databricks workspace in your Virtual Network (VNet) | Yes | This setting allows you to deploy an Azure Databricks workspace in your virtual network. |
| Virtual Network | databricks-vnet | Select the virtual network you created in the previous section. |
| Public Subnet Name | | Use the default public subnet name. |
| Public Subnet CIDR Range | 10.26.1.0/25 | Use a CIDR range up to and including /25. |
| Private Subnet Name | | Use the default private subnet name. |
| Private Subnet CIDR Range | 10.26.1.128/25 | Use a CIDR range up to and including /25. |
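The public and private ranges above are the two halves of the 10.26.1.0/24 address space. As an illustration only, this short sketch derives them with Python's standard `ipaddress` module and confirms that they do not overlap; the variable names are arbitrary.

```python
import ipaddress

# Address space of the virtual network used in this walkthrough.
vnet = ipaddress.ip_network("10.26.1.0/24")

# Splitting a /24 into /25 blocks yields exactly two subnets.
public_subnet, private_subnet = vnet.subnets(new_prefix=25)

print(public_subnet)   # 10.26.1.0/25
print(private_subnet)  # 10.26.1.128/25

# The two ranges must not overlap.
assert not public_subnet.overlaps(private_subnet)
```

If your virtual network uses a different address space, substitute it for `10.26.1.0/24` and pick a prefix that leaves room for both subnets.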
Once the deployment is complete, navigate to the Azure Databricks resource. Notice that virtual network peering is disabled. Also notice the resource group and managed resource group in the overview page.
The managed resource group is not modifiable, and it is not used to create virtual machines. You can only create virtual machines in the resource group you manage.
When a workspace deployment fails, the workspace is still created in a failed state. Delete the failed workspace and create a new workspace that resolves the deployment errors. When you delete the failed workspace, the managed resource group and any successfully deployed resources are also deleted.
Return to your Azure Databricks service and select Launch Workspace on the Overview page.
Select Clusters > + Create Cluster. Then enter a cluster name, such as databricks-project-cluster, and accept the remaining default settings. Select Create Cluster.
Once the cluster is running, return to the managed resource group in the Azure portal. Notice the new virtual machines, disks, IP addresses, and network interfaces. A network interface with an IP address is created in each of the public and private subnets.
Return to your Azure Databricks workspace and select the cluster you created. Then navigate to the Executors tab on the Spark UI page. Notice that the addresses for the driver and the executors are in the private subnet range. In this example, the driver is 10.26.1.132 and the executors are 10.26.1.133 and 10.26.1.134. Your IP addresses could be different.
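To confirm that addresses you read from the Executors tab fall inside the private subnet, a check along these lines may help. It is a sketch using Python's standard `ipaddress` module; the addresses are the example values from this walkthrough, and the labels `executor-1` and `executor-2` are made up for illustration.

```python
import ipaddress

# Private subnet range configured for the workspace in this walkthrough.
private_subnet = ipaddress.ip_network("10.26.1.128/25")

# Example addresses from the Spark UI; replace them with your own.
node_addresses = {
    "driver": "10.26.1.132",
    "executor-1": "10.26.1.133",
    "executor-2": "10.26.1.134",
}

# Map each node to whether its address lies inside the private subnet.
in_private = {
    name: ipaddress.ip_address(addr) in private_subnet
    for name, addr in node_addresses.items()
}

for name, ok in in_private.items():
    print(f"{name}: {'inside' if ok else 'outside'} {private_subnet}")
```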
After you have finished the article, you can terminate the cluster. To do so, select Clusters in the left pane of the Azure Databricks workspace. For the cluster you want to terminate, move the cursor over the ellipsis in the Actions column and select the Terminate icon. This stops the cluster.
If you do not manually terminate the cluster, it stops automatically once it has been inactive for the specified time, provided you selected the Terminate after __ minutes of inactivity checkbox when you created the cluster.
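If you create clusters through the Databricks Clusters REST API rather than the UI, the inactivity timeout is set with the `autotermination_minutes` field of the cluster specification. The sketch below only builds and prints such a request body; it does not call any API. The helper name `build_cluster_spec`, the 30-minute timeout, and the placeholder runtime and node type strings are assumptions for illustration, so substitute values that are valid for your workspace.

```python
import json


def build_cluster_spec(name, idle_minutes):
    """Build a cluster spec dict with an inactivity timeout (hypothetical helper)."""
    if idle_minutes < 0:
        raise ValueError("idle_minutes must not be negative")
    return {
        "cluster_name": name,
        # Placeholders: fill in a runtime version and node type from your workspace.
        "spark_version": "<runtime-version>",
        "node_type_id": "<node-type>",
        "num_workers": 2,
        # Terminate the cluster after this many minutes of inactivity.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec("databricks-project-cluster", 30)
print(json.dumps(spec, indent=2))
```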
If you do not wish to reuse the cluster, you can delete the resource group you created in the Azure portal.
Creating a private link for DBFS root storage
For more information on pricing tiers, see the .
DBFS stands for Databricks File System.