ModelsColumn Configuration

Column Configuration

After writing a model’s SQL query and previewing the results, you configure each output column to tell SignalSmith how to use it. Column configuration defines the semantic meaning of each column — whether it identifies a record, carries data attributes, or serves as a primary key.

Column Types

Every column in a model is assigned one of two types:

Identifier

An identifier column contains values that can be used to match or look up records in a destination system. Examples include:

  • email — Match contacts by email address
  • customer_id — Match records by internal ID
  • phone — Match contacts by phone number
  • external_id — Match by a destination-specific identifier

Identifier columns are used during sync to determine which records in the destination to create, update, or delete. At least one identifier column is typically required for upsert and mirror sync modes.

When multiple identifier columns are present, the sync uses the one mapped to the destination’s primary identifier field (e.g., “Email” in Salesforce, “email” in HubSpot).

Attribute

An attribute column contains data values that describe or enrich a record. Examples include:

  • first_name, last_name — Personal information
  • lifetime_value — Computed metrics
  • signup_date — Timestamps
  • segment, tier — Categorizations
  • city, country — Geographic data

Attribute columns are mapped to fields in the destination during sync. They carry the actual data being activated.

Primary Key

The primary key is the column (or combination of columns) that uniquely identifies each row in the model’s result set. The primary key is critical for:

  • Deduplication — Ensuring each row represents a unique record
  • Change detection — Determining which records have been added, updated, or removed between sync runs
  • Upsert/mirror operations — Matching source records to destination records

Single Column Primary Key

The most common case. A single column uniquely identifies each row:

SELECT
  customer_id,  -- Primary key
  email,
  name,
  lifetime_value
FROM customers

Configure customer_id as the primary key in the column configuration panel.

Composite Primary Key

When no single column is unique, combine multiple columns:

SELECT
  account_id,    -- Part of composite primary key
  contact_id,    -- Part of composite primary key
  email,
  role
FROM account_contacts

Select both account_id and contact_id as primary key columns. Together, they uniquely identify each row.

Choosing a Primary Key

Guidelines for selecting the right primary key:

ScenarioRecommended Primary Key
Customer table with unique IDscustomer_id
User table with unique emailsemail (if guaranteed unique)
Junction/association tableComposite key (e.g., account_id + user_id)
Aggregated metricsThe GROUP BY column(s)
Event data with deduplicationevent_id or composite of user_id + event_timestamp

Validation: SignalSmith warns if the preview results contain duplicate values in the primary key columns, as this indicates the key does not uniquely identify rows.

Data Type Detection

SignalSmith automatically detects the data type of each column from the query preview results:

Detected TypeDescriptionExamples
TextString/character data"john@example.com", "New York"
NumberInteger or decimal values42, 199.99, -3.14
BooleanTrue/false valuestrue, false
DateDate values without time2025-01-15
DatetimeTimestamp values with time2025-01-15T10:30:00Z
JSONStructured JSON data{"key": "value"}

If the auto-detected type is incorrect (e.g., a numeric ID is detected as a number but should be treated as text), you can override the type in the configuration panel.

Configuration Panel

The column configuration panel appears after a successful query preview. It displays a table with one row per output column:

Column NameTypePrimary KeyData Type
customer_idIdentifierYesNumber
emailIdentifierNoText
first_nameAttributeNoText
last_nameAttributeNoText
lifetime_valueAttributeNoNumber
signup_dateAttributeNoDate

Editing Column Configuration

For each column, you can:

  1. Toggle type — Switch between Identifier and Attribute using the dropdown
  2. Set primary key — Check/uncheck the primary key checkbox
  3. Override data type — Change the auto-detected data type if needed

Validation Rules

SignalSmith enforces the following rules:

  • At least one primary key — Every model must have at least one column marked as the primary key
  • Primary keys must be non-nullable — Columns used as primary keys should not contain NULL values (a warning is shown if NULLs are detected in preview)
  • Unique column names — All column names must be unique (use SQL aliases to rename duplicates)
  • At least one identifier — Models used in syncs should have at least one identifier column for record matching

Impact on Syncs

Column configuration directly affects how syncs operate:

Identifier Columns in Sync

When creating a sync, you map model columns to destination fields. Identifier columns are available for mapping to the destination’s identifier fields (e.g., Email, External ID). The destination uses these to determine whether to create a new record or update an existing one.

Primary Key in Sync

The primary key determines how SignalSmith tracks changes between sync runs:

  • New rows — Rows with primary key values not seen in the previous run are treated as inserts
  • Changed rows — Rows with existing primary key values but different attribute values are treated as updates
  • Missing rows — Rows present in the previous run but absent in the current run are treated as deletes (relevant for mirror sync mode)

Attribute Columns in Sync

Attribute columns carry the data payload. During sync, each attribute column is mapped to a specific field in the destination. Unmapped attribute columns are ignored.

Example Configurations

Customer Profiles

SELECT
  customer_id,
  email,
  phone,
  first_name,
  last_name,
  city,
  country,
  lifetime_value,
  segment
FROM customers
ColumnTypePrimary Key
customer_idIdentifierYes
emailIdentifierNo
phoneIdentifierNo
first_nameAttributeNo
last_nameAttributeNo
cityAttributeNo
countryAttributeNo
lifetime_valueAttributeNo
segmentAttributeNo

Event Aggregations

SELECT
  user_id,
  event_type,
  COUNT(*) AS event_count,
  MAX(event_time) AS last_occurrence
FROM events
GROUP BY user_id, event_type
ColumnTypePrimary Key
user_idIdentifierYes
event_typeAttributeYes
event_countAttributeNo
last_occurrenceAttributeNo

Note: Both user_id and event_type form a composite primary key since neither is unique on its own.

Account-Contact Mapping

SELECT
  account_id,
  contact_id,
  email,
  role,
  is_primary
FROM account_contacts
ColumnTypePrimary Key
account_idIdentifierYes
contact_idIdentifierYes
emailIdentifierNo
roleAttributeNo
is_primaryAttributeNo

Column Sensitivity & PII Masking

Columns can also be assigned sensitivity levels that control data visibility and sync behavior. Mark columns containing PII (email, phone, SSN, etc.) as Redacted, Sync Only, or Blocked to enforce masking in previews, suppress value suggestions, and control sync access.

Sensitivity is configured on the entity config page (Schema > Model > Configure). SignalSmith can automatically detect likely PII columns based on column name patterns.

See PII Masking for full details.

Next Steps