Deployment Steps
In the Azure portal, go to Storage accounts and click + Create to create a new storage account.
On the Basics tab, select your subscription and resource group, name your ADLS Gen2 storage account, and select your region. Set Performance to Premium and the account type to Block blobs, then go to the Advanced tab.
On the Advanced tab, enable Hierarchical namespace, click Review + create, and then click Create.
Once the ADLS Gen2 storage account is created, open it, click Containers under Data storage, and then click + Create to create a container.
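Once the account and container exist, data in them is usually addressed from Azure Databricks with an abfss:// URI. The following is a minimal sketch of how such a URI could be assembled; the helper name abfss_uri and the example names are hypothetical, and the checks reflect the general Azure naming rules (special containers such as $root are not handled).

```python
import re

# Storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# Container names: lowercase letters, digits, and single hyphens,
# starting and ending with a letter or digit (length is checked separately).
_CONTAINER_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def abfss_uri(account: str, container: str, path: str = "") -> str:
    """Build an abfss:// URI for a path in an ADLS Gen2 container (hypothetical helper)."""
    if not _ACCOUNT_RE.fullmatch(account):
        raise ValueError(f"invalid storage account name: {account!r}")
    if not (3 <= len(container) <= 63 and _CONTAINER_RE.fullmatch(container)):
        raise ValueError(f"invalid container name: {container!r}")
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Example with placeholder names; substitute your own account and container.
print(abfss_uri("mydatalake", "raw", "/sales/2024"))
```

Validating the names up front tends to surface typos earlier than a failed read from the cluster would.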
Connect from Azure Databricks to Azure Data Lake Storage Gen2 using OAuth 2.0 with a Microsoft Entra ID service principal.
To use service principals to connect to Azure Data Lake Storage Gen2, an admin user must create a new Microsoft Entra ID (formerly Azure Active Directory) application.
To create a Microsoft Entra ID service principal, follow these instructions:
If you have access to multiple tenants, subscriptions, or directories, click the Directories + subscriptions (directory with filter) icon in the top menu to switch to the directory in which you want to provision the service principal.
Search for and select Microsoft Entra ID.
In Manage, click App registrations > New registration.
In the Supported account types section, select Accounts in this organizational directory only (Single tenant).
Click Register.
In Manage, click Certificates & secrets.
On the Client secrets tab, click New client secret.
In the Add a client secret pane, for Description, enter a description for the client secret.
For Expires, select an expiry time period for the client secret, and then click Add.
Copy and store the client secret's Value in a secure place, as this client secret is the password for your application.
Secret ID: <secret-id>
Value: <client-secret-value>
On the application's Overview page, in the Essentials section, copy the following values:
Application (client) ID
Directory (tenant) ID
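The application (client) ID, directory (tenant) ID, and client secret collected above are the inputs for the ABFS OAuth settings that Azure Databricks uses to reach the storage account. The sketch below builds those settings as a dictionary; the function name abfs_oauth_conf and all argument values are placeholders, and in a notebook each entry would be applied with spark.conf.set.

```python
def abfs_oauth_conf(storage_account: str, application_id: str,
                    directory_id: str, client_secret: str) -> dict:
    """Return Spark settings for OAuth 2.0 access with a service principal.

    Hypothetical helper: it only assembles the key/value pairs and does not
    contact Azure or Spark.
    """
    host = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": application_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{directory_id}/oauth2/token",
    }


# In a Databricks notebook (where `spark` is predefined) this could be applied as:
#   for key, value in abfs_oauth_conf(...).items():
#       spark.conf.set(key, value)
conf = abfs_oauth_conf("mydatalake", "<application-id>", "<directory-id>", "<client-secret>")
for key in conf:
    print(key)
```

The client secret is passed in as an argument here so that it can come from a secret store rather than being written into the notebook.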
You grant access to storage resources by assigning roles to your service principal. In this tutorial, you assign the Storage Blob Data Contributor role to the service principal on your Azure Data Lake Storage Gen2 account. You may need to assign other roles depending on your requirements.
In the Azure portal, go to the Storage accounts service.
Select an Azure storage account to use.
Click Access Control (IAM).
Click + Add and select Add role assignment from the dropdown menu.
Set the Select field to the Microsoft Entra ID application name that you created in step 1 and set Role to Storage Blob Data Contributor.
Click Save.
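The same role assignment can also be scripted instead of clicked through. As a rough sketch, the snippet below assembles an Azure CLI command (az role assignment create) scoped to the storage account; the helper name and every ID shown are placeholders, and the command is only printed, not executed.

```python
import shlex


def role_assignment_command(subscription_id: str, resource_group: str,
                            storage_account: str, application_id: str,
                            role: str = "Storage Blob Data Contributor") -> list:
    """Build the argument list for `az role assignment create` (hypothetical helper)."""
    # Resource ID of the storage account, used as the scope of the assignment.
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    return [
        "az", "role", "assignment", "create",
        "--assignee", application_id,
        "--role", role,
        "--scope", scope,
    ]


# Print a shell-safe version of the command for review before running it.
cmd = role_assignment_command("<subscription-id>", "<resource-group>",
                              "mydatalake", "<application-id>")
print(shlex.join(cmd))
```

Scoping the assignment to the single storage account, rather than the resource group or subscription, keeps the service principal's access as narrow as this tutorial needs.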
You can store the client secret from step 1 in Azure Key Vault.
In the Azure portal, go to the Key vault service.
Select an Azure Key Vault to use.
On the Key Vault settings pages, select Secrets.
Click on + Generate/Import.
In Upload options, select Manual.
For Name, enter a name for the secret. The secret name must be unique within a Key Vault.
For Value, paste the Client Secret that you stored in Step 1.
Click Create.
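Before entering a name in the portal, it can help to check it against the Key Vault naming rule. The sketch below assumes the commonly documented rule that a secret name is 1 to 127 characters long and contains only letters, digits, and hyphens; treat that rule, and the helper name is_valid_secret_name, as assumptions to confirm against the current Key Vault documentation.

```python
import re

# Assumed rule: 1-127 characters drawn from 0-9, a-z, A-Z, and "-".
_SECRET_NAME_RE = re.compile(r"[0-9A-Za-z-]{1,127}")


def is_valid_secret_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed Key Vault secret naming rule."""
    return _SECRET_NAME_RE.fullmatch(name) is not None


# Example names only; underscores are rejected under the assumed rule.
for candidate in ("adls-sp-client-secret", "adls_sp_client_secret"):
    print(candidate, is_valid_secret_name(candidate))
```

Uniqueness within the vault is not something this check can verify; that still has to be confirmed against the vault itself.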
To reference the client secret stored in an Azure Key Vault, you can create a secret scope backed by Azure Key Vault in Azure Databricks.
Go to https://<databricks-instance>#secrets/createScope. This URL is case sensitive; the S in createScope must be uppercase.
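Because the URL is case sensitive, building it programmatically avoids a mistyped fragment. The sketch below shows one way to do that, plus a thin wrapper around dbutils.secrets.get for reading the client secret once a Key Vault-backed scope exists; the function names, the scope name, and the key name are hypothetical, and dbutils is passed in explicitly because it is only predefined inside a Databricks notebook.

```python
def create_scope_url(databricks_instance: str) -> str:
    """Return the secret-scope creation URL for a workspace (hypothetical helper)."""
    host = databricks_instance.strip().rstrip("/")
    if host.startswith("https://"):
        host = host[len("https://"):]
    # The fragment must be exactly "secrets/createScope" (capital S).
    return f"https://{host}#secrets/createScope"


def read_client_secret(dbutils, scope: str, key: str) -> str:
    """Fetch a secret through the Databricks secrets utility."""
    return dbutils.secrets.get(scope=scope, key=key)


# Placeholder workspace host; substitute your own instance name.
print(create_scope_url("<databricks-instance>"))

# In a notebook, with a scope and key that you have created:
#   service_credential = read_client_secret(dbutils, "<scope-name>", "<secret-name>")
```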
Sign in to the