Column Configuration

After writing a model’s SQL query and previewing the results, you configure each output column to tell SignalSmith how to use it. Column configuration defines the semantic meaning of each column — whether it identifies a record, carries data attributes, or serves as a primary key.

Column Types

Every column in a model is assigned one of two types:

Identifier

An identifier column contains values that can be used to match or look up records in a destination system. Examples include:

email — Match contacts by email address
customer_id — Match records by internal ID
phone — Match contacts by phone number
external_id — Match by a destination-specific identifier

Identifier columns are used during sync to determine which records in the destination to create, update, or delete. At least one identifier column is typically required for upsert and mirror sync modes.

When multiple identifier columns are present, the sync uses the one mapped to the destination’s primary identifier field (e.g., “Email” in Salesforce, “email” in HubSpot).

Attribute

An attribute column contains data values that describe or enrich a record. Examples include:

first_name, last_name — Personal information
lifetime_value — Computed metrics
signup_date — Timestamps
segment, tier — Categorizations
city, country — Geographic data

Attribute columns are mapped to fields in the destination during sync. They carry the actual data being activated.

Primary Key

The primary key is the column (or combination of columns) that uniquely identifies each row in the model’s result set. The primary key is critical for:

Deduplication — Ensuring each row represents a unique record
Change detection — Determining which records have been added, updated, or removed between sync runs
Upsert/mirror operations — Matching source records to destination records

Single Column Primary Key

The most common case. A single column uniquely identifies each row:

SELECT
  customer_id,  -- Primary key
  email,
  name,
  lifetime_value
FROM customers

Configure customer_id as the primary key in the column configuration panel.

Composite Primary Key

When no single column is unique, combine multiple columns:

SELECT
  account_id,    -- Part of composite primary key
  contact_id,    -- Part of composite primary key
  email,
  role
FROM account_contacts

Select both account_id and contact_id as primary key columns. Together, they uniquely identify each row.

Choosing a Primary Key

Guidelines for selecting the right primary key:

Scenario	Recommended Primary Key
Customer table with unique IDs	`customer_id`
User table with unique emails	`email` (if guaranteed unique)
Junction/association table	Composite key (e.g., `account_id` + `user_id`)
Aggregated metrics	The `GROUP BY` column(s)
Event data with deduplication	`event_id` or composite of `user_id` + `event_timestamp`

Validation: SignalSmith warns if the preview results contain duplicate values in the primary key columns, as this indicates the key does not uniquely identify rows.

Data Type Detection

SignalSmith automatically detects the data type of each column from the query preview results:

Detected Type	Description	Examples
Text	String/character data	`"john@example.com"`, `"New York"`
Number	Integer or decimal values	`42`, `199.99`, `-3.14`
Boolean	True/false values	`true`, `false`
Date	Date values without time	`2025-01-15`
Datetime	Timestamp values with time	`2025-01-15T10:30:00Z`
JSON	Structured JSON data	`{"key": "value"}`

If the auto-detected type is incorrect (e.g., a numeric ID is detected as a number but should be treated as text), you can override the type in the configuration panel.

Configuration Panel

The column configuration panel appears after a successful query preview. It displays a table with one row per output column:

Column Name	Type	Primary Key	Data Type
`customer_id`	Identifier	Yes	Number
`email`	Identifier	No	Text
`first_name`	Attribute	No	Text
`last_name`	Attribute	No	Text
`lifetime_value`	Attribute	No	Number
`signup_date`	Attribute	No	Date

Editing Column Configuration

For each column, you can:

Toggle type — Switch between Identifier and Attribute using the dropdown
Set primary key — Check/uncheck the primary key checkbox
Override data type — Change the auto-detected data type if needed

Validation Rules

SignalSmith enforces the following rules:

At least one primary key — Every model must have at least one column marked as the primary key
Primary keys must be non-nullable — Columns used as primary keys should not contain NULL values (a warning is shown if NULLs are detected in preview)
Unique column names — All column names must be unique (use SQL aliases to rename duplicates)
At least one identifier — Models used in syncs should have at least one identifier column for record matching

Impact on Syncs

Column configuration directly affects how syncs operate:

Identifier Columns in Sync

When creating a sync, you map model columns to destination fields. Identifier columns are available for mapping to the destination’s identifier fields (e.g., Email, External ID). The destination uses these to determine whether to create a new record or update an existing one.

Primary Key in Sync

The primary key determines how SignalSmith tracks changes between sync runs:

New rows — Rows with primary key values not seen in the previous run are treated as inserts
Changed rows — Rows with existing primary key values but different attribute values are treated as updates
Missing rows — Rows present in the previous run but absent in the current run are treated as deletes (relevant for mirror sync mode)

Attribute Columns in Sync

Attribute columns carry the data payload. During sync, each attribute column is mapped to a specific field in the destination. Unmapped attribute columns are ignored.

Example Configurations

Customer Profiles

SELECT
  customer_id,
  email,
  phone,
  first_name,
  last_name,
  city,
  country,
  lifetime_value,
  segment
FROM customers

Column	Type	Primary Key
`customer_id`	Identifier	Yes
`email`	Identifier	No
`phone`	Identifier	No
`first_name`	Attribute	No
`last_name`	Attribute	No
`city`	Attribute	No
`country`	Attribute	No
`lifetime_value`	Attribute	No
`segment`	Attribute	No

Event Aggregations

SELECT
  user_id,
  event_type,
  COUNT(*) AS event_count,
  MAX(event_time) AS last_occurrence
FROM events
GROUP BY user_id, event_type

Column	Type	Primary Key
`user_id`	Identifier	Yes
`event_type`	Attribute	Yes
`event_count`	Attribute	No
`last_occurrence`	Attribute	No

Note: Both user_id and event_type form a composite primary key since neither is unique on its own.

Account-Contact Mapping

SELECT
  account_id,
  contact_id,
  email,
  role,
  is_primary
FROM account_contacts

Column	Type	Primary Key
`account_id`	Identifier	Yes
`contact_id`	Identifier	Yes
`email`	Identifier	No
`role`	Attribute	No
`is_primary`	Attribute	No

Column Sensitivity & PII Masking

Columns can also be assigned sensitivity levels that control data visibility and sync behavior. Mark columns containing PII (email, phone, SSN, etc.) as Redacted, Sync Only, or Blocked to enforce masking in previews, suppress value suggestions, and control sync access.

Sensitivity is configured on the entity config page (Schema > Model > Configure). SignalSmith can automatically detect likely PII columns based on column name patterns.

See PII Masking for full details.

Next Steps

Create a sync using your configured model
Map fields to destination fields
Set up the schema to define entity relationships
Configure PII masking for sensitive columns

SQL Editor Overview