Databricks

Write data back to a Databricks table. Use SignalSmith to materialize model results, audience memberships, or enriched data into Databricks Delta tables.

Prerequisites

  • A Databricks workspace
  • A running SQL warehouse (the SQL Statement Execution API requires a SQL warehouse, not an all-purpose cluster)
  • A personal access token or OAuth service principal credentials

Authentication

Databricks supports two authentication methods:

Personal Access Token

  1. In Databricks, go to User Settings > Access Tokens
  2. Generate a new personal access token (starts with dapi)
  3. Paste it into the Access Token field in SignalSmith
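Under the hood, the token is presented as a Bearer credential on each call to the SQL Statement Execution API. A minimal sketch of that request shape, assuming Python; the hostname, warehouse ID, token, and the `build_statement_request` helper are all illustrative placeholders:

```python
# Sketch: how a personal access token is attached to a
# SQL Statement Execution API call. No network call is made here;
# this only builds the request pieces.

def build_statement_request(host: str, warehouse_id: str, token: str, sql: str):
    """Return the URL, headers, and JSON body for submitting a statement."""
    url = f"https://{host}/api/2.0/sql/statements/"
    headers = {
        "Authorization": f"Bearer {token}",  # the PAT (dapi...) goes here
        "Content-Type": "application/json",
    }
    body = {
        "warehouse_id": warehouse_id,
        "statement": sql,
        "wait_timeout": "30s",  # wait up to 30s for a synchronous result
    }
    return url, headers, body

url, headers, body = build_statement_request(
    "abc-defgh.cloud.databricks.com", "abc123", "dapiXXXXXXXX", "SELECT 1"
)
```

The same request can then be sent with any HTTP client; a 403 response usually means the token is invalid or expired.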

OAuth (Service Principal)

  1. Create a service principal in Databricks
  2. Generate a Client ID and Client Secret
  3. Enter both values in SignalSmith
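With a service principal, SignalSmith exchanges the Client ID and Client Secret for a short-lived access token before calling the API. A sketch of that client-credentials exchange, assuming the workspace's standard OAuth token endpoint; the helper name and placeholder credentials are illustrative:

```python
import base64

def build_token_request(host: str, client_id: str, client_secret: str):
    """Return the URL, headers, and form body for the OAuth
    client-credentials token exchange."""
    url = f"https://{host}/oidc/v1/token"
    # Client ID and secret are sent as HTTP Basic credentials.
    creds = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {"Authorization": f"Basic {creds}"}
    data = {"grant_type": "client_credentials", "scope": "all-apis"}
    return url, headers, data

url, headers, data = build_token_request(
    "abc-defgh.cloud.databricks.com", "my-client-id", "my-client-secret"
)
```

The JSON response contains an `access_token`, which is then used as the Bearer credential in place of a personal access token.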

Configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Server Hostname | Text | Yes | Databricks workspace hostname (e.g., abc-defgh.cloud.databricks.com) |
| HTTP Path | Text | Yes | SQL warehouse HTTP path (e.g., /sql/1.0/warehouses/abc123) |
| Catalog | Text | No | Unity Catalog name. Default: hive_metastore |
| Schema | Text | No | Default schema. Default: default |

Target Settings

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Target Catalog | Text | No | Target catalog |
| Target Schema | Text | Yes | Target schema (e.g., default) |
| Target Table | Text | Yes | Target table name |

Supported Operations

Sync Modes: Upsert, Insert, Update, Mirror

Audience Sync Modes: Add, Remove, Mirror, Upsert

Features

  • Field Mapping: Yes
  • Schema Introspection: Yes — SignalSmith reads table metadata via the SQL Statement Execution API
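Introspection amounts to running a `DESCRIBE TABLE` statement through the same API and reading back the column rows. A sketch of that shape, assuming each result row is `(col_name, data_type, comment)` and that section rows begin with `#`; both helpers are illustrative:

```python
def describe_statement(catalog: str, schema: str, table: str) -> str:
    """Build a DESCRIBE TABLE statement with a fully qualified name,
    so the connection's default catalog/schema don't matter."""
    return f"DESCRIBE TABLE {catalog}.{schema}.{table}"

def parse_columns(rows):
    """Collect column name -> data type, stopping at the first blank
    or '#'-prefixed section row (e.g. '# Partition Information')."""
    columns = {}
    for name, data_type, _comment in rows:
        if not name or name.startswith("#"):
            break
        columns[name] = data_type
    return columns

sql = describe_statement("main", "default", "users")
cols = parse_columns([
    ("id", "bigint", None),
    ("email", "string", None),
    ("# Partition Information", "", ""),
])
```

The resulting name/type map is what drives field mapping in the UI.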

How It Works

SignalSmith uses the SQL Statement Execution API for all operations:

  1. Rows are written using INSERT INTO ... VALUES statements
  2. Upserts are performed with MERGE INTO
  3. Delta Lake manages versioning and ACID transactions
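SignalSmith's exact SQL is not shown here, but an upsert via MERGE INTO generally takes the following shape. A sketch that assembles the statement from key and value columns; the `build_merge` helper, table names, and column names are illustrative:

```python
def build_merge(target: str, source: str, keys: list, columns: list) -> str:
    """Assemble a Delta Lake MERGE INTO statement:
    match on the key columns, update the value columns on match,
    insert the full row otherwise."""
    on = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    sets = ", ".join(f"t.{c} = s.{c}" for c in columns)
    insert_cols = ", ".join(keys + columns)
    insert_vals = ", ".join(f"s.{c}" for c in keys + columns)
    return (
        f"MERGE INTO {target} t USING {source} s ON {on} "
        f"WHEN MATCHED THEN UPDATE SET {sets} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )

merge_sql = build_merge("main.default.users", "staged_rows", ["id"], ["email"])
```

Because Delta Lake executes the MERGE as a single ACID transaction, a failed sync leaves the target table at its previous version.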

Troubleshooting

Connection failed

Verify the Server Hostname and HTTP Path. The HTTP Path is found in the SQL Warehouse details page under Connection details.

Token expired

Personal access tokens can expire. Generate a new token in Databricks User Settings.

Table not found

Verify the catalog, schema, and table name. For Unity Catalog, ensure the catalog is accessible to the authenticated user.

SQL warehouse not running

The SQL warehouse must be running to execute queries. Check the warehouse status in the Databricks SQL Warehouses page.