Identity Resolution
Identity resolution is SignalSmith’s module for unifying customer records across data sources into a single, coherent profile. When the same person appears in multiple tables, systems, or channels — each with different identifiers — identity resolution links those records together and produces a golden record: the unified customer profile.
Why Identity Resolution Matters
In most organizations, customer data is fragmented:
- A user signs up on your website with an email address
- The same person downloads your mobile app and gets a device ID
- They call customer support and are identified by phone number
- They make a purchase in-store and use a loyalty card number
Without identity resolution, these appear as four separate customers. Marketing messages are duplicated, analytics are inaccurate, and the customer experience is disjointed.
Identity resolution connects these records by finding shared identifiers — the same email address, the same phone number, overlapping cookies — and merging them into a single unified profile.
How It Works
SignalSmith uses a connected-components graph algorithm to resolve identities. The process works in three phases:
Phase 1: Build the Identity Graph
Each customer record is a node in the graph. Each shared identifier creates an edge between nodes. For example:
Record A (email: alice@co.com, phone: +1-555-0100)
Record B (email: alice@co.com, device: abc123)
Record C (phone: +1-555-0100, loyalty: LY-9876)
Record D (device: abc123)
The graph connects A-B (shared email), A-C (shared phone), and B-D (shared device ID).
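As a rough illustration of this phase, the sketch below derives the same three edges from the example records. The dictionary layout and the `build_edges` helper are assumptions made for this example; they are not SignalSmith's internal representation.

```python
from collections import defaultdict
from itertools import combinations

# Example records from the text; the dict layout is an assumption for illustration.
records = {
    "A": {"email": "alice@co.com", "phone": "+1-555-0100"},
    "B": {"email": "alice@co.com", "device": "abc123"},
    "C": {"phone": "+1-555-0100", "loyalty": "LY-9876"},
    "D": {"device": "abc123"},
}


def build_edges(records):
    """Return (record, record, identifier_type) edges for shared identifier values."""
    # Index records by (identifier type, value) so matches are found
    # without comparing every pair of records directly.
    by_identifier = defaultdict(list)
    for record_id, identifiers in records.items():
        for id_type, value in identifiers.items():
            by_identifier[(id_type, value)].append(record_id)

    # Every pair of records that shares one identifier value becomes an edge.
    edges = set()
    for (id_type, _value), record_ids in by_identifier.items():
        for left, right in combinations(sorted(record_ids), 2):
            edges.add((left, right, id_type))
    return edges


edges = build_edges(records)
print(sorted(edges))
# [('A', 'B', 'email'), ('A', 'C', 'phone'), ('B', 'D', 'device')]
```

Indexing by identifier value first keeps the work proportional to the number of identifiers, rather than to the number of record pairs.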
Phase 2: Find Connected Components
The algorithm finds all groups of records that are transitively connected. In the example above, A, B, C, and D all end up in the same cluster — even though C and D share no direct identifier — because they are linked through A and B.
Cluster 1: {A, B, C, D} (all represent the same person)
This transitive closure is what makes identity resolution powerful: it can link records that don’t share any identifier directly, as long as a chain of shared identifiers exists.
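One common way to compute connected components is a union-find (disjoint-set) structure. The sketch below applies it to the example edges. It is a minimal in-memory illustration, not the SQL that SignalSmith runs. The `connected_components` function and the extra unlinked record `E` are hypothetical.

```python
def connected_components(nodes, edges):
    """Group nodes into clusters of transitively connected records."""
    parent = {node: node for node in nodes}

    def find(node):
        # Walk up to the root, shortening the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for left, right in edges:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            # Keep the smaller label as the root so results are deterministic.
            if root_right < root_left:
                root_left, root_right = root_right, root_left
            parent[root_right] = root_left

    clusters = {}
    for node in nodes:
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


# Edges from the example, plus a hypothetical record E with no shared identifier.
clusters = connected_components(
    ["A", "B", "C", "D", "E"],
    [("A", "B"), ("A", "C"), ("B", "D")],
)
print(sorted(sorted(cluster) for cluster in clusters))
# [['A', 'B', 'C', 'D'], ['E']]
```

C and D land in the same cluster even though no edge joins them directly, which is the transitive behavior described above.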
Phase 3: Produce Golden Records
For each cluster, SignalSmith produces a golden record — a unified profile that combines the best attributes from all contributing records using configurable survivorship rules (most recent, most frequent, source priority, etc.).
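To make the survivorship strategies concrete, here is a small sketch of the three rules named above, applied to one cluster. The record fields (`source`, `updated_at`, `email`, `city`), the sample values, and the function names are all assumptions for illustration. SignalSmith's actual rule configuration may differ.

```python
from collections import Counter

# Hypothetical contributing records for a single cluster.
cluster_records = [
    {"source": "web", "updated_at": "2024-01-10", "email": "alice@co.com", "city": "Austin"},
    {"source": "app", "updated_at": "2024-03-02", "email": "alice@work.example", "city": "Austin"},
    {"source": "pos", "updated_at": "2023-11-20", "email": None, "city": "Dallas"},
]


def most_recent(records, field):
    """Value from the latest-updated record that has the field populated."""
    candidates = [r for r in records if r.get(field) is not None]
    # ISO-8601 date strings sort chronologically, so string comparison is enough here.
    return max(candidates, key=lambda r: r["updated_at"])[field] if candidates else None


def most_frequent(records, field):
    """Most common non-null value across the cluster."""
    values = [r[field] for r in records if r.get(field) is not None]
    return Counter(values).most_common(1)[0][0] if values else None


def source_priority(records, field, priority):
    """Value from the highest-priority source that has the field populated."""
    rank = {source: position for position, source in enumerate(priority)}
    candidates = [r for r in records if r.get(field) is not None and r["source"] in rank]
    return min(candidates, key=lambda r: rank[r["source"]])[field] if candidates else None


golden_record = {
    "email": most_recent(cluster_records, "email"),
    "city": most_frequent(cluster_records, "city"),
}
print(golden_record)
# {'email': 'alice@work.example', 'city': 'Austin'}
```

Choosing a different rule changes the outcome: `source_priority(cluster_records, "city", ["pos", "web", "app"])` would return `"Dallas"` instead.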
Key Concepts
Identity Graph
The identity graph is the data structure that represents relationships between customer records and their identifiers. You define the graph by specifying:
- Entity types to include (which source tables contain customer records)
- Identifier families (email, phone, device ID, etc.) and their variants
- Merge rules (which identifiers can link records together)
- Limit rules (constraints to prevent bad merges)
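The four settings above can be pictured as a small configuration object. The sketch below is hypothetical: `GraphConfig`, its field names, and `linkable_groups` are invented for this example and do not reflect SignalSmith's configuration format. It shows how a merge rule (which identifier families may link records) and one possible limit rule (ignore identifier values shared by too many records) might interact.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class GraphConfig:
    """Hypothetical container for the four settings listed above."""
    entity_types: list
    identifier_families: list
    merge_on: set                        # merge rule: families allowed to link records
    max_records_per_identifier: int = 3  # limit rule: skip values shared too widely


def linkable_groups(observations, config):
    """Group record IDs by identifier value, applying merge and limit rules.

    observations: iterable of (record_id, family, value) tuples.
    """
    groups = defaultdict(set)
    for record_id, family, value in observations:
        if family in config.merge_on:
            groups[(family, value)].add(record_id)
    # Keep values shared by at least two records, but not by suspiciously many.
    return {
        key: ids
        for key, ids in groups.items()
        if 2 <= len(ids) <= config.max_records_per_identifier
    }


config = GraphConfig(
    entity_types=["web_users", "app_users"],
    identifier_families=["email", "phone", "loyalty"],
    merge_on={"email", "phone"},
)
observations = [
    ("A", "email", "alice@co.com"),
    ("B", "email", "alice@co.com"),
    ("A", "phone", "+1-555-0100"),
    ("C", "phone", "+1-555-0100"),
    ("C", "loyalty", "LY-9876"),
    # Hypothetical shared switchboard number attached to unrelated records.
    ("P", "phone", "+1-555-0199"),
    ("Q", "phone", "+1-555-0199"),
    ("R", "phone", "+1-555-0199"),
    ("S", "phone", "+1-555-0199"),
]
groups = linkable_groups(observations, config)
print(sorted(groups))
# [('email', 'alice@co.com'), ('phone', '+1-555-0100')]
```

The switchboard number is shared by four records, which exceeds the limit of three, so it never links P, Q, R, and S into one cluster.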
Cluster
A cluster is a group of records that have been resolved to the same person. Each cluster gets a unique cluster ID that serves as the unified identifier across the platform.
Golden Record
A golden record is the output of identity resolution — one unified profile per cluster containing the “best” value for each attribute, determined by survivorship rules.
The Resolution Pipeline
- Configure — Define entity types, identifier families, merge rules, and limit rules through a 5-step wizard
- Build — Extract identifiers from source tables and construct the graph
- Resolve — Run the connected-components algorithm to find clusters
- Produce — Generate golden records with survivorship rules and materialize to your warehouse
The entire pipeline is warehouse-native: all computation runs as SQL executed against your data warehouse.
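To give a feel for how the Resolve step could be expressed in SQL, the sketch below computes cluster IDs with a recursive query. It uses Python's built-in `sqlite3` as a stand-in for a warehouse. The table layout, the `resolve_clusters_sql` helper, and the query itself are assumptions for illustration; the SQL that SignalSmith generates is not shown here and may use a different strategy. The records `X` and `Y` are hypothetical.

```python
import sqlite3


def resolve_clusters_sql(edge_pairs):
    """Map each record to a cluster ID (the smallest record ID it can reach)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
    # Store each edge in both directions so reachability is symmetric.
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?)",
        [(a, b) for a, b in edge_pairs] + [(b, a) for a, b in edge_pairs],
    )
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node, member) AS (
            SELECT src, src FROM edges
            UNION  -- UNION (not UNION ALL) drops duplicates, so the recursion terminates
            SELECT r.node, e.dst
            FROM reach AS r
            JOIN edges AS e ON e.src = r.member
        )
        SELECT node, MIN(member) AS cluster_id
        FROM reach
        GROUP BY node
        ORDER BY node
        """
    ).fetchall()
    conn.close()
    return dict(rows)


# Example edges from Phase 1, plus a hypothetical unrelated pair X-Y.
cluster_ids = resolve_clusters_sql([("A", "B"), ("A", "C"), ("B", "D"), ("X", "Y")])
print(cluster_ids)
# {'A': 'A', 'B': 'A', 'C': 'A', 'D': 'A', 'X': 'X', 'Y': 'X'}
```

This query enumerates every reachable pair within a cluster, which is fine for a small example but grows quadratically with cluster size. Implementations that need to scale typically use iterative label propagation instead.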
What You Can Do
| Feature | Description |
|---|---|
| Create an Identity Graph | 5-step wizard for defining resolution rules |
| Identifier Families | Define email, phone, device, and custom identifier types |
| Merge Rules | Control which identifiers can link records |
| Limit Rules | Prevent over-merging with cluster size and identifier limits |
| Run Resolution | Execute full or incremental resolution |
| Golden Records | Configure survivorship strategies for unified profiles |
| Profile Explorer | Search and view resolved profiles |
Identity Resolution and Segment
Identity resolution integrates with Segment to enable audience building on unified profiles:
- Traits can be computed on resolved entities (golden records), combining data from all linked source records
- Audiences can segment customers based on their unified profile attributes
- Audience syncs can use resolved identifiers (e.g., send to the most recent email address in the cluster)
Next Steps
- Creating an Identity Graph — Get started with the 5-step setup wizard
- Core Concepts — How identity resolution fits into the broader SignalSmith platform