
Configure Databricks

Databricks can be used in two ways in our platform:

  • As an underlying data platform and cloud solution: to store assets, share data through Delta Sharing, and provide processing/compute for the Export, Spaces and Query features. Note that currently only Databricks on Azure is supported for this use with Harbr.

  • As a standalone connector: to load asset data into, and export it out of, the Harbr platform. Supported Harbr platform versions are:

    • Harbr on Microsoft Azure

    • Harbr on Amazon Web Services (AWS)

    • Harbr on Google Cloud Platform (GCP)


Prerequisites

To connect Harbr and Databricks, you will need:

  • a Databricks account that has the following permissions:

Permission | Status
--- | ---
Clusters | CAN_MANAGE
Jobs | CAN_MANAGE
Cluster Policies | CAN_MANAGE
DLT | CAN_MANAGE
Directories | CAN_MANAGE
Notebooks | CAN_MANAGE
ML Models | Not required
ML Experiments | Not required
Dashboard | Not required
Queries | CAN_MANAGE
Alerts | CAN_MANAGE
Secrets | CAN_MANAGE
Token | CAN_MANAGE
SQL Warehouse | CAN_READ
Repos | CAN_MANAGE
Pools | CAN_MANAGE

  • Note that the above requirements may change depending on which Harbr features you intend to use. We recommend consulting your Harbr contact before configuration. (A sketch of granting these permissions programmatically appears after this list.)

  • a Harbr account with the appropriate roles:

    • Default user

    • Organisation Admin

    • Technician

  • Unity Catalog enabled on your Databricks Workspace. Unity Catalog is a unified governance solution for all data and AI assets including files, tables, machine learning models and dashboards.
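
These permissions can be granted in the Databricks workspace UI or programmatically via the Databricks Permissions REST API (PATCH /api/2.0/permissions/{object_type}/{object_id}). As a minimal sketch only: the request body below grants CAN_MANAGE on a single cluster, with a placeholder user name standing in for your Harbr service account; repeat for each object type in the table above.

CODE
{
  "access_control_list": [
    {
      "user_name": "harbr-service@example.com",
      "permission_level": "CAN_MANAGE"
    }
  ]
}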

Configure Databricks on Azure destination

Create a connector

  1. Go to Manage > Connectors > Create a new connector.

  2. Choose Type > Databricks.

  3. Type in the connection details. (For a Databricks connector these are typically the workspace URL and a personal access token; see the token sketch after the capabilities table.)

  4. The details will be validated, and we will display which capabilities the connector can perform in the platform with the given permissions:

Capability | What it allows you to do
--- | ---
catalog | list objects in a catalog (files in a filesystem, tables in a database/catalog, objects in a data catalog, models in a model store); register and manage assets within a catalog (create, update, delete, get info); get metadata (schema, metadata, data quality metrics, samples)
access | create and manage Delta Shares; manage credentials (get credentials, create a new identity)
jobs | manage Spark jobs (create, get status, update, delete, get info)
clusters | manage clusters (create, get status, update, delete, get info)
resources | manage an object in the catalog; manage an object in the cloud
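
The connection details in step 3 typically include a Databricks personal access token. Tokens can be generated in the workspace UI under User Settings, or via the Databricks Token REST API (POST /api/2.0/token/create). The body below is an illustrative sketch; the lifetime and comment are example values, not Harbr requirements.

CODE
{
  "lifetime_seconds": 7776000,
  "comment": "Harbr Databricks connector"
}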

Configure Organisation

Each organisation can be configured to use a different Databricks warehouse to store and process its data. To do this:

  1. Go to Organisation Administration

  2. Go to Metadata tab

  3. Add an entry with the following key/value pair.

  4. There are two main values to configure:

    1. upload_platform: the Azure connector that will be used for file processing

    2. processing_platform: the Databricks connector that will be used for data processing

  5. You can get the connector's unique identifier from the URL of the connector's page.


key: harbr.user_defaults

value:

CODE
{
  "consumption": {
    "catalogs": [
      {
        "name": "",
        "id": "",
        "connector_id": "",
        "databricks_catalog": "assets",
        "databricks_schema": "managed",
        "databricks_table_name": {
          "naming_scheme": "PESSIMISTIC"
        },
        "default": true,
        "access": {
          "share": {
            "default_ttl_minutes": "1200"
          },
          "query": {
            "default_llm": "",
            "default_engine": ""
          },
          "iam": {}
        }
      }
    ]
  },
  "upload_platform": {
    "connector_id": "yourconnectorid"
  },
  "processing_platform": {
    "connector_id": "yourconnectorid",
    "default_job_cluster_definition": {}
  }
}
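
The default_job_cluster_definition above is left empty in the template. As an illustrative sketch only: a value here would normally follow the standard Databricks cluster specification, as in the example below. The Spark version and Azure node type are placeholders; confirm suitable values with your Harbr contact.

CODE
{
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "num_workers": 2
}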

Your organisation is now set up to use Databricks.
