Identity Resolution

Identity resolution is SignalSmith’s module for unifying customer records across data sources into a single, coherent profile. When the same person appears in multiple tables, systems, or channels — each with different identifiers — identity resolution links those records together and produces a golden record: the unified customer profile.

Why Identity Resolution Matters

In most organizations, customer data is fragmented:

  • A user signs up on your website with an email address
  • The same person downloads your mobile app and gets a device ID
  • They call customer support and are identified by phone number
  • They make a purchase in-store and use a loyalty card number

Without identity resolution, these appear as four separate customers. Marketing messages are duplicated, analytics are inaccurate, and the customer experience is disjointed.

Identity resolution connects these records by finding shared identifiers — the same email address, the same phone number, overlapping cookies — and merging them into a single unified profile.

How It Works

SignalSmith uses a connected-components graph algorithm to resolve identities. The process works in three phases:

Phase 1: Build the Identity Graph

Each customer record is a node in the graph. Each shared identifier creates an edge between nodes. For example:

Record A (email: alice@co.com, phone: +1-555-0100)
Record B (email: alice@co.com, device: abc123)
Record C (phone: +1-555-0100, loyalty: LY-9876)
Record D (device: abc123)

The graph connects A-B (shared email), A-C (shared phone), and B-D (shared device ID).
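
The edge-building step above can be sketched in a few lines of Python. This is a minimal illustration, not SignalSmith's implementation: the dictionary layout for records and the `build_edges` helper are assumptions made for the example. It indexes records by identifier value and links every pair of records that share one.

```python
from collections import defaultdict
from itertools import combinations

# The four example records from the text. The dict layout is an
# assumption for illustration, not a SignalSmith data format.
records = {
    "A": {"email": "alice@co.com", "phone": "+1-555-0100"},
    "B": {"email": "alice@co.com", "device": "abc123"},
    "C": {"phone": "+1-555-0100", "loyalty": "LY-9876"},
    "D": {"device": "abc123"},
}


def build_edges(records):
    """Map each linked (record, record) pair to the identifier types they share."""
    # Index record IDs by (identifier type, identifier value).
    index = defaultdict(list)
    for record_id, identifiers in records.items():
        for id_type, value in identifiers.items():
            index[(id_type, value)].append(record_id)

    # Every pair of records under the same identifier value gets an edge.
    edges = defaultdict(set)
    for (id_type, _value), members in index.items():
        for left, right in combinations(sorted(members), 2):
            edges[(left, right)].add(id_type)
    return dict(edges)


edges = build_edges(records)
for pair, shared in sorted(edges.items()):
    print(pair, sorted(shared))
```

Indexing by identifier value first avoids comparing every record against every other record, which matters once the number of records grows.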

Phase 2: Find Connected Components

The algorithm finds all groups of records that are transitively connected. In the example above, A, B, C, and D all end up in the same cluster — even though C and D share no direct identifier — because they are linked through A and B.

Cluster 1: {A, B, C, D} — all represent the same person

This transitive closure is what makes identity resolution powerful: it can link records that don’t share any identifier directly, as long as a chain of shared identifiers exists.
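
One common way to compute connected components is a union-find (disjoint-set) structure. The sketch below uses that technique as an illustration; the document does not say which connected-components implementation SignalSmith uses, and the `connected_components` helper and the extra record `E` are hypothetical.

```python
def connected_components(nodes, edges):
    """Group nodes into clusters using union-find with path halving."""
    parent = {node: node for node in nodes}

    def find(node):
        # Walk up to the root, shortening the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Merge the two sets on either side of every edge.
    for left, right in edges:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    # Collect nodes under their final root.
    clusters = {}
    for node in nodes:
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


# Edges from the example above; "E" is a hypothetical unrelated record
# added to show that unlinked records stay in their own cluster.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "D")]
clusters = connected_components(nodes, edges)
print(sorted(sorted(cluster) for cluster in clusters))
```

Records C and D land in the same cluster even though no edge joins them directly, because both are reachable through A and B.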

Phase 3: Produce Golden Records

For each cluster, SignalSmith produces a golden record — a unified profile that combines the best attributes from all contributing records using configurable survivorship rules (most recent, most frequent, source priority, etc.).
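
As a rough sketch of how the three survivorship rules named above might behave, the following Python applies each one to a small cluster. The field names, source names, timestamps, and helper functions are all invented for illustration and do not reflect SignalSmith's configuration format.

```python
from collections import Counter

# Hypothetical members of one cluster (all values are made up).
cluster = [
    {"source": "web", "updated_at": "2024-01-10", "email": "alice@co.com",
     "city": "Austin", "phone": "+1-555-0199"},
    {"source": "app", "updated_at": "2024-03-02", "email": "alice@new.co",
     "city": "Austin", "phone": None},
    {"source": "support", "updated_at": "2024-02-15", "email": None,
     "city": "Dallas", "phone": "+1-555-0100"},
]


def most_recent(rows, field):
    """Value from the most recently updated row that has the field set."""
    candidates = [r for r in rows if r.get(field) is not None]
    # ISO-8601 date strings sort chronologically as plain strings.
    return max(candidates, key=lambda r: r["updated_at"])[field] if candidates else None


def most_frequent(rows, field):
    """Value that occurs most often across the cluster."""
    values = [r[field] for r in rows if r.get(field) is not None]
    return Counter(values).most_common(1)[0][0] if values else None


def source_priority(rows, field, priority):
    """Value from the highest-priority source that has the field set."""
    rank = {source: position for position, source in enumerate(priority)}
    candidates = [r for r in rows if r.get(field) is not None and r["source"] in rank]
    return min(candidates, key=lambda r: rank[r["source"]])[field] if candidates else None


golden_record = {
    "email": most_recent(cluster, "email"),
    "city": most_frequent(cluster, "city"),
    "phone": source_priority(cluster, "phone", ["support", "web", "app"]),
}
print(golden_record)
```

Each rule ignores missing values, so a newer record with an empty field does not overwrite an older record that has the value.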

Key Concepts

Identity Graph

The identity graph is the data structure that represents relationships between customer records and their identifiers. You define the graph by specifying:

  • Entity types to include (which source tables contain customer records)
  • Identifier families (email, phone, device ID, etc.) and their variants
  • Merge rules (which identifiers can link records together)
  • Limit rules (constraints to prevent bad merges)
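
To make the interaction between merge rules and limit rules concrete, here is a small sketch under stated assumptions. The `linkable_values` helper and its `merge_on` and `max_records_per_value` parameters are hypothetical names, not SignalSmith settings. The idea is that only identifier types named by a merge rule may link records, and an identifier value shared by too many records (for example, a shared store email) is ignored.

```python
from collections import defaultdict


def linkable_values(records, merge_on, max_records_per_value):
    """Return {(id_type, value): record_ids} for values allowed to create edges."""
    index = defaultdict(set)
    for record_id, identifiers in records.items():
        for id_type, value in identifiers.items():
            if id_type in merge_on:  # merge rule: only these types may link records
                index[(id_type, value)].add(record_id)
    return {
        key: members
        for key, members in index.items()
        # Limit rule: skip values shared by too many records.
        # Values seen on a single record create no edge, so drop them too.
        if 2 <= len(members) <= max_records_per_value
    }


# Hypothetical records: four share one store email, two share a phone.
records = {
    "R1": {"email": "shared@store.com", "phone": "+1-555-0100"},
    "R2": {"email": "shared@store.com", "phone": "+1-555-0100"},
    "R3": {"email": "shared@store.com"},
    "R4": {"email": "shared@store.com", "loyalty": "LY-1"},
    "R5": {"loyalty": "LY-1"},
}

allowed = linkable_values(records, merge_on={"email", "phone"}, max_records_per_value=3)
print(allowed)
```

With these settings the shared email is dropped because four records use it, the loyalty number is ignored because it is not in `merge_on`, and only the phone number remains eligible to link R1 and R2.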

Cluster

A cluster is a group of records that have been resolved to the same person. Each cluster gets a unique cluster ID that serves as the unified identifier across the platform.

Golden Record

A golden record is the output of identity resolution — one unified profile per cluster containing the “best” value for each attribute, determined by survivorship rules.

The Resolution Pipeline

  1. Configure — Define entity types, identifier families, merge rules, and limit rules through a 5-step wizard
  2. Build — Extract identifiers from source tables and construct the graph
  3. Resolve — Run the connected-components algorithm to find clusters
  4. Produce — Generate golden records with survivorship rules and materialize to your warehouse

The entire pipeline runs warehouse-native — all computation happens as SQL executed against your data warehouse.
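
To give a feel for what connected-components resolution expressed purely in SQL can look like, the sketch below runs iterative label propagation against an in-memory SQLite database from Python. The table names, column names, and the propagation approach are assumptions for illustration. The SQL that SignalSmith generates is not shown in this document and may differ substantially.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per (record, identifier). Hypothetical schema.
CREATE TABLE identifiers (record_id TEXT, id_type TEXT, id_value TEXT);
INSERT INTO identifiers VALUES
  ('A', 'email', 'alice@co.com'), ('A', 'phone', '+1-555-0100'),
  ('B', 'email', 'alice@co.com'), ('B', 'device', 'abc123'),
  ('C', 'phone', '+1-555-0100'), ('C', 'loyalty', 'LY-9876'),
  ('D', 'device', 'abc123');

-- Build: records sharing an identifier value are linked.
-- The self-join also yields self-edges and both directions.
CREATE TABLE edges AS
  SELECT DISTINCT a.record_id AS src, b.record_id AS dst
  FROM identifiers a
  JOIN identifiers b
    ON a.id_type = b.id_type AND a.id_value = b.id_value;

-- Resolve: every record starts in its own cluster.
CREATE TABLE clusters AS
  SELECT DISTINCT record_id, record_id AS cluster_id FROM identifiers;
""")

# Smallest cluster label among a record's neighbours (including itself).
MIN_NEIGHBOUR = """
    SELECT MIN(c2.cluster_id)
    FROM edges e
    JOIN clusters c2 ON c2.record_id = e.dst
    WHERE e.src = clusters.record_id
"""

# Repeat until no label changes: each pass pulls the smallest label one hop further.
while True:
    changed = conn.execute(
        f"UPDATE clusters SET cluster_id = ({MIN_NEIGHBOUR}) "
        f"WHERE cluster_id > ({MIN_NEIGHBOUR})"
    ).rowcount
    if changed == 0:
        break

rows = conn.execute(
    "SELECT record_id, cluster_id FROM clusters ORDER BY record_id"
).fetchall()
print(rows)
```

Label propagation needs one pass per hop in the longest chain, so long chains of weakly linked records cost more passes than tightly connected groups.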

What You Can Do

Feature                     Description
Create an Identity Graph    5-step wizard for defining resolution rules
Identifier Families         Define email, phone, device, and custom identifier types
Merge Rules                 Control which identifiers can link records
Limit Rules                 Prevent over-merging with cluster size and identifier limits
Run Resolution              Execute full or incremental resolution
Golden Records              Configure survivorship strategies for unified profiles
Profile Explorer            Search and view resolved profiles

Identity Resolution and Segment

Identity resolution integrates with Segment to enable audience building on unified profiles:

  • Traits can be computed on resolved entities (golden records), combining data from all linked source records
  • Audiences can segment customers based on their unified profile attributes
  • Audience syncs can use resolved identifiers (e.g., send to the most recent email address in the cluster)

Next Steps