Glossary
Key terms and concepts used in SignalSmith and the CDP domain, listed in roughly alphabetical order.
Activation — The process of sending audience data or customer attributes from your warehouse to a destination platform for marketing, advertising, analytics, or operational use. Activation is the output stage of the CDP pipeline.
API Key — A secret token used for programmatic access to the SignalSmith REST API and MCP server. API keys are workspace-scoped, hashed with bcrypt at rest, and support rotation and expiration.
Audience — A defined segment of customers based on trait conditions and attribute filters. Audiences are created in Segment using the filter builder or NL Audience Builder, and activated via audience syncs to destinations.
Audience Sync — A connection that sends audience membership and mapped fields to a destination. Supports three modes: mirror (full sync with adds, updates, and removes), additive (only adds new members), and subtractive (only removes exited members).
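The three audience sync modes can be illustrated with set arithmetic over membership snapshots. This is a minimal sketch, not SignalSmith's implementation: the function name `plan_audience_sync` and the snapshot representation are assumptions, and the update step of mirror mode is left out for brevity.

```python
def plan_audience_sync(previous: set, current: set, mode: str) -> dict:
    """Return the member IDs to add to and remove from the destination."""
    entered = current - previous  # members who joined since the last run
    exited = previous - current   # members who left since the last run
    if mode == "mirror":
        return {"add": entered, "remove": exited}
    if mode == "additive":
        return {"add": entered, "remove": set()}
    if mode == "subtractive":
        return {"add": set(), "remove": exited}
    raise ValueError(f"unknown audience sync mode: {mode}")
```

For example, with a previous snapshot of `{"a", "b"}` and a current snapshot of `{"b", "c"}`, mirror mode adds `c` and removes `a`, additive mode only adds `c`, and subtractive mode only removes `a`.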
Branch — A journey tile that splits the customer flow into two or more paths based on conditions. Branch types include trait conditions, event checks, percentage splits, and time-based conditions.
CDP — Customer Data Platform. A system that unifies customer data from multiple sources, builds a single view of the customer, and enables activation to marketing, sales, and analytics tools. SignalSmith is a composable, warehouse-native CDP.
Connected Components — The graph algorithm used in identity resolution to group related customer records into unified profiles. Records sharing a common identifier (directly or transitively) are placed in the same component, forming a cluster that becomes a golden record.
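The grouping described above can be sketched with a union-find structure. SignalSmith runs identity resolution as SQL in the warehouse, so this Python version only illustrates the idea; `resolve_clusters` and the `family:value` identifier strings are hypothetical.

```python
from collections import defaultdict

def resolve_clusters(records: dict) -> list:
    """Group record IDs that share an identifier, directly or transitively.

    records maps a record ID to the set of identifiers found on that record.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # identifier -> first record ID that carried it
    for rid, identifiers in records.items():
        for ident in identifiers:
            if ident in first_seen:
                union(first_seen[ident], rid)
            else:
                first_seen[ident] = rid

    clusters = defaultdict(set)
    for rid in records:
        clusters[find(rid)].add(rid)
    return list(clusters.values())
```

In this sketch a CRM record and an app record that share no identifier still land in the same cluster when a third record carries both the CRM email and the app device ID. That is the transitive case the definition refers to.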
Consent Category — A classification (e.g., analytics, marketing, advertising, functional) used in event consent management to control which events are forwarded to which destinations based on the user’s consent state.
Contract — See Event Contract.
Loader — A connector that pulls data from SaaS applications (Salesforce, HubSpot, Stripe, Zendesk, etc.) into your data warehouse on a configurable schedule, making external application data available for modeling and audience building.
Destination — An external tool or platform where SignalSmith sends data. Destinations span CRM, advertising, marketing, analytics, warehouses, cloud storage, and streaming categories. SignalSmith supports 50+ destinations.
Destination Filter — A governance policy that controls which data can flow to specific destinations. Filters can block certain audiences from syncing to a destination, enforce field restrictions, or apply rate limits to protect destination APIs.
Entity Type — A core object in your data schema representing a business entity (e.g., User, Account, Product, Order). Entity types define the structure of your customer data model and are the foundation for traits, audiences, and relationships.
Event — A timestamped action or occurrence sent to SignalSmith via the events API. Events follow a Segment-compatible format with four types: track (actions), identify (user traits), page (page views), and group (group membership).
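As an illustration of the four event types, here is a sketch of a track event and a small shape check. The field names (`userId`, `anonymousId`, `event`, `properties`, `timestamp`) follow the public Segment spec; whether the SignalSmith events API requires exactly these fields is an assumption, and `validate_event` is a hypothetical helper.

```python
EVENT_TYPES = {"track", "identify", "page", "group"}

def validate_event(event: dict) -> list:
    """Return a list of problems; an empty list means the event looks well formed."""
    problems = []
    if event.get("type") not in EVENT_TYPES:
        problems.append("type must be one of track, identify, page, group")
    if not (event.get("userId") or event.get("anonymousId")):
        problems.append("userId or anonymousId is required")
    if event.get("type") == "track" and not event.get("event"):
        problems.append("track events need an event name")
    return problems

# Example payload with made-up values.
track_event = {
    "type": "track",
    "event": "Order Completed",
    "userId": "user_123",
    "timestamp": "2024-05-01T12:00:00Z",
    "properties": {"order_id": "o-1", "total": 49.9},
}
```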
Event Contract — A schema definition that enforces required properties and data types on incoming events. When an event violates a contract, the configured violation policy determines whether to allow the event, drop the offending property, or reject the event entirely.
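The three violation outcomes can be sketched as follows. The policy names (`allow`, `drop_property`, `reject`), the contract shape (property name mapped to a Python type), and the function `apply_contract` are assumptions made for illustration, not SignalSmith's actual configuration format.

```python
def apply_contract(event: dict, contract: dict, policy: str):
    """Check event properties against a contract and apply a violation policy.

    Returns the (possibly modified) event, or None when the event is rejected.
    """
    props = event.get("properties", {})
    violations = [
        name for name, expected in contract.items()
        if name not in props or not isinstance(props[name], expected)
    ]
    if not violations or policy == "allow":
        return event
    if policy == "drop_property":
        # Keep the event but strip the properties that broke the contract.
        kept = {k: v for k, v in props.items() if k not in violations}
        return {**event, "properties": kept}
    if policy == "reject":
        return None
    raise ValueError(f"unknown violation policy: {policy}")
```

With a contract of `{"order_id": str, "total": float}` and an event whose `total` arrives as the string `"49.90"`, the allow policy passes the event through unchanged, the drop policy forwards it without `total`, and the reject policy discards it.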
Event Forwarding — Real-time routing of events from SignalSmith to downstream analytics and marketing tools. Forwarding rules define which events go to which destinations, with optional property mapping and consent filtering.
Field Mapping — The configuration that maps columns from a SignalSmith model or audience to fields in a destination. Field mapping defines how source data translates to the destination’s schema, including type conversions and field renaming.
Filter Builder — The visual interface in Segment for defining audience conditions using AND/OR logic, comparison operators (equals, greater than, contains, is null, etc.), and nested groups. Supports trait values, entity attributes, and cross-entity conditions via relationships.
Formula Trait — A computed trait that combines existing traits using arithmetic or logical expressions. Formula traits reference other traits by name and support operations like lifetime_value / total_orders or days_since_last_order > 60.
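The two example expressions above can be sketched as plain functions over a dictionary of trait values. Traits are evaluated as SQL in the warehouse, so this is only an illustration; the trait names `average_order_value` and `is_lapsed` and the divide-by-zero handling are assumptions.

```python
def safe_divide(numerator, denominator):
    """Return None instead of raising when the denominator is zero or missing."""
    if numerator is None or not denominator:
        return None
    return numerator / denominator

# Each formula trait references existing traits by name.
FORMULA_TRAITS = {
    "average_order_value": lambda t: safe_divide(t.get("lifetime_value"), t.get("total_orders")),
    "is_lapsed": lambda t: (t.get("days_since_last_order") or 0) > 60,
}

def evaluate_formulas(traits: dict) -> dict:
    """Evaluate every formula trait against one customer's trait values."""
    return {name: formula(traits) for name, formula in FORMULA_TRAITS.items()}
```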
Golden Record — The unified customer profile output from identity resolution, containing the best-known values for each attribute across all matched source records. Survivorship rules determine which source value wins when multiple records contribute the same attribute.
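One possible survivorship rule is "the most recently updated non-null value wins". The sketch below assumes that rule and a hypothetical record shape (`updated_at` as an ISO date string plus an `attributes` dictionary); the rules SignalSmith actually offers are not specified here.

```python
def build_golden_record(source_records: list) -> dict:
    """Merge matched source records, letting newer non-null values win."""
    golden = {}
    # Process oldest first so that later (newer) records overwrite earlier ones.
    for record in sorted(source_records, key=lambda r: r["updated_at"]):
        for attribute, value in record["attributes"].items():
            if value is not None:
                golden[attribute] = value
    return golden
```

Under this rule, a newer record with a null phone number does not erase the phone number contributed by an older record, while its newer email address replaces the older one.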
Govern — The set of features controlling access, data flow, and compliance: RBAC (who can do what), destination filters (where data can flow), access filters (which data is visible), organizations (multi-workspace management), and AI guardrails (safety boundaries for AI operations).
Group — A collection of workspace members used for easier RBAC and access filter management. Members inherit group-level permissions and data access restrictions, simplifying team-based access control.
Identifier Family — A category of customer identifier used in identity resolution (e.g., email, phone, device_id, customer_id). Each family can have variants (e.g., personal email, work email) and configurable matching rules.
Identity Graph — A configuration that defines how customer records are linked across data sources during identity resolution. Contains identifier families, merge rules, limit rules, and survivorship rules that control the unification algorithm.
Identity Resolution — The process of unifying customer records from multiple data sources into a single golden record using shared identifiers and a connected-components algorithm. Runs as SQL in your warehouse, supporting both full and incremental resolution modes.
Incremental Resolution — An identity resolution mode that only processes new or changed records since the last run, rather than recomputing the entire identity graph from scratch. Significantly faster for ongoing resolution after the initial full run.
Journey — A multi-step, multi-channel workflow that customers move through based on triggers, conditions, and time delays. Built using a visual canvas editor with tiles for entry, wait, branch, action, and exit. Journeys orchestrate automated customer experiences across destinations.
Limit Rule — A constraint in identity resolution that prevents super-clusters (unreasonably large merged profiles) by capping the maximum cluster size or the number of identifiers per family within a single cluster.
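A limit check of this kind might look like the sketch below, which tests a candidate cluster against a maximum record count and per-family identifier caps. The function name, parameters, and cluster representation are hypothetical.

```python
from collections import Counter

def exceeds_limits(cluster: list, max_records: int, max_per_family: dict) -> bool:
    """Return True if a cluster breaks the size cap or any per-family cap.

    cluster is a list of records; each record is a list of (family, value) pairs.
    """
    if len(cluster) > max_records:
        return True
    # Count distinct identifier values per family across the whole cluster.
    distinct = {(family, value) for record in cluster for family, value in record}
    per_family = Counter(family for family, _ in distinct)
    return any(per_family[family] > limit for family, limit in max_per_family.items())
```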
Merge Rule — A rule in identity resolution that defines which identifier matches can merge two records. Merge rules support deterministic matching (an exact value match on a specified identifier family) and can be configured to require matches on multiple identifier families.
Mirror Mode — A sync mode that keeps the destination in perfect sync with the source — creating new records, updating changed records, and deleting records that no longer exist in the source. The recommended mode for most audience syncs where the destination should always reflect current audience membership.
MCP Server — Model Context Protocol server that exposes SignalSmith’s capabilities as AI-callable tools. Enables integration with Claude Desktop, Cursor, and any MCP-compatible client. Supports schema introspection, audience operations, sync management, and write operations with guardrail enforcement.
Model — A SQL query that defines a view of data from a connected source. Models shape raw warehouse tables into the format needed for syncing, trait computation, audience building, and schema mapping. Models are the bridge between raw data and SignalSmith features.
Organization — A multi-workspace container for centralized billing, member management, and SSO configuration. Organizations allow companies to manage multiple workspaces (e.g., production, staging, different business units) under a single administrative umbrella.
Permission — A fine-grained access right in the RBAC system. SignalSmith has 42 permissions across 14 resource categories (sources, models, audiences, syncs, destinations, traits, journeys, events, governance, settings, and more). Permissions are grouped into roles.
Priority List — An ordered list of audiences used to resolve membership overlap. When a customer qualifies for multiple audiences in a priority list, they are assigned to the highest-priority audience only. Useful for tiered marketing programs and mutually exclusive segmentation.
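The overlap resolution described above can be sketched in a few lines. The function name and the audience names in the example are made up for illustration.

```python
def assign_by_priority(priority_list: list, memberships: dict) -> dict:
    """Assign each customer to the highest-priority audience they qualify for.

    priority_list holds audience names, highest priority first.
    memberships maps a customer ID to the set of audiences they qualify for.
    """
    assignment = {}
    for customer, audiences in memberships.items():
        for audience in priority_list:
            if audience in audiences:
                assignment[customer] = audience
                break  # stop at the first (highest-priority) match
    return assignment
```

A customer who qualifies for both `vip` and `loyal` is assigned only to `vip` when the list is ordered `["vip", "loyal", "new"]`; a customer who qualifies for none of the listed audiences receives no assignment.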
Profile Explorer — A UI for searching and viewing unified customer profiles from identity resolution. Search by any identifier (email, phone, customer ID) to see the golden record, linked source records, merge history, trait values, and audience memberships.
RBAC — Role-Based Access Control. A permission model based on built-in roles (Owner, Admin, Member) with additive permissions. RBAC controls who can view, create, edit, and delete resources across the platform. Applied at the workspace level.
Relationship — A defined connection between two entity types (e.g., User belongs_to Account, User purchased Product). Relationships enable cross-entity audience building (e.g., “Users whose Account has over 100 employees”) and cross-entity trait computation.
Role — A named set of permissions assigned to workspace members. Built-in roles include Owner (full access), Admin (manage resources and members), and Member (operational access). Roles determine the scope of what each user can see and do.
Schema — The structural definition of your customer data, including entity types, their attributes (identifiers and properties), and relationships between entity types. The schema gives SignalSmith a unified understanding of your data model across all connected sources.
Warehouse — A connection to a data warehouse (Snowflake, BigQuery, Databricks). Warehouses are read-only connections from which models query data. SignalSmith executes SQL against warehouses but does not write to warehouse tables (except for its own operational schemas).
Split — A division of an audience into random percentage-based groups for A/B testing. Uses deterministic hashing for consistent assignment — the same customer always falls into the same split group, ensuring stable test populations across evaluations.
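Deterministic assignment can be sketched by hashing the customer ID into one of 100 buckets. The choice of SHA-256, the salt format, and the function name `assign_split` are assumptions; the hashing scheme SignalSmith uses is not described here.

```python
import hashlib

def assign_split(customer_id: str, audience_id: str, splits: list) -> str:
    """Map a customer to a split group; the same inputs always give the same group.

    splits is a list of (group_name, percentage) pairs that sum to 100.
    """
    key = f"{audience_id}:{customer_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100  # bucket in 0..99
    upper = 0
    for name, percentage in splits:
        upper += percentage
        if bucket < upper:
            return name
    raise ValueError("split percentages must sum to 100")
```

Salting the hash with the audience ID is a design choice in this sketch: it keeps a customer's group stable within one audience while avoiding the same customers always landing in the same group across unrelated tests.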
Access Filter — A row-level access control filter applied to groups. When an access filter is active for a group, all queries by group members are automatically scoped to matching rows only. Used for team-based data isolation (e.g., regional teams see only their region’s data).
Sync — A connection between a model and a destination that defines field mapping, sync mode, schedule, and error handling. Syncs move data from your warehouse to external tools on a configured cadence.
Sync Mode — The strategy for how data is written to a destination: upsert (create or update based on a matching identifier), mirror (full sync with creates, updates, and deletes), append (add only, no deduplication), or insert (create only, fail on duplicates).
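The difference between the four sync modes can be illustrated against an in-memory list standing in for the destination. This is a sketch under stated assumptions: `apply_sync` is hypothetical, rows are matched on an `id` field, and insert mode is modeled as raising on the first duplicate.

```python
def apply_sync(destination: list, source: list, mode: str, key: str = "id") -> list:
    """Return the destination rows after one sync run in the given mode."""
    result = [dict(row) for row in destination]
    index = {row[key]: i for i, row in enumerate(result)}
    if mode == "append":
        # Add everything, with no deduplication.
        return result + [dict(row) for row in source]
    if mode == "insert":
        for row in source:
            if row[key] in index:
                raise ValueError(f"duplicate {key}: {row[key]}")
            index[row[key]] = len(result)
            result.append(dict(row))
        return result
    if mode in ("upsert", "mirror"):
        for row in source:
            if row[key] in index:
                result[index[row[key]]] = dict(row)  # update existing record
            else:
                index[row[key]] = len(result)
                result.append(dict(row))             # create new record
        if mode == "mirror":
            # Mirror also deletes records that are no longer in the source.
            source_keys = {row[key] for row in source}
            result = [row for row in result if row[key] in source_keys]
        return result
    raise ValueError(f"unknown sync mode: {mode}")
```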
Sync Run — A single execution of a sync. Each run produces metrics including rows synced (added, updated, removed), errors encountered, duration, throughput (rows/second), and completion status. Sync runs are logged in the audit trail and visible in the Insights dashboards.
Tile — A building block in the journey canvas. Tile types include: entry (defines who enters), wait (pause for a duration), branch (split based on conditions), action (send a message or trigger an API call), update profile (modify a trait or attribute), webhook (call an external URL), sub-journey (nest another journey), and exit (end the journey).
Trait — A computed attribute about a customer entity, evaluated as SQL against your warehouse. Three types: SQL traits (custom SQL queries), aggregation traits (count, sum, average, min, max over a related table), and formula traits (combine existing traits with expressions). Traits are the building blocks for audience conditions.
Transformation — A processing step applied to events between receipt and forwarding. Transformations can rename properties, add computed properties, remove sensitive fields, filter events, or modify values before the event reaches its destination.
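A transformation step covering three of the operations above (remove, rename, add computed properties) might be sketched as follows. The function `transform_event` and its parameters are hypothetical, and event filtering is omitted.

```python
def transform_event(event: dict, rename=None, remove=(), computed=None) -> dict:
    """Return a transformed copy of an event; the input event is not modified."""
    props = dict(event.get("properties", {}))
    for field in remove:
        props.pop(field, None)            # strip sensitive fields
    for old, new in (rename or {}).items():
        if old in props:
            props[new] = props.pop(old)   # rename a property
    for name, fn in (computed or {}).items():
        props[name] = fn(props)           # add a computed property
    return {**event, "properties": props}
```

Computed properties run last in this sketch, so they can reference renamed fields and never see removed ones.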
Upsert — A sync mode that creates new records when the matching identifier doesn’t exist in the destination, and updates existing records when it does. Does not delete records that are removed from the source. The most common sync mode for model-to-destination syncs.
Variant — A sub-type within an identifier family in identity resolution. For example, the “email” family might have “personal” and “work” variants. Variants allow more granular control over which identifier matches can trigger a merge.
Warehouse-Native — SignalSmith’s core architecture principle where all computation (trait evaluation, audience building, identity resolution, model execution) runs as SQL queries directly in the customer’s data warehouse. Customer data never leaves the warehouse except during activation to destinations.
Workspace — The top-level organizational unit in SignalSmith. A workspace contains all resources (sources, models, destinations, syncs, audiences, traits, journeys, events, governance policies) and manages member access via RBAC. Each workspace connects to one primary data warehouse.
Write Key — A secret token used to authenticate event sources when sending data to the SignalSmith events API. Each write key is associated with a named source (e.g., “Production Web App”) for tracking and independent revocation.