Core Concepts
Understanding the building blocks of SignalSmith and how they fit together.
The Data Flow
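SignalSmith follows a warehouse-first architecture where your data warehouse is the single source of truth. Data flows through the platform in a clear pipeline: a source connects to the warehouse, models shape its tables into result sets, and syncs deliver those results to destinations.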
Key Concepts
Warehouses
A source is a connection to your data warehouse. SignalSmith reads data from sources and writes to them only when materializing models. Supported warehouses include Snowflake, BigQuery, and Databricks.
Models
A model is a SQL query that defines a view of your data. Models shape raw warehouse tables into the format you need for syncing. Each model is tied to a source and produces a result set with typed columns.
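To make the idea of a model concrete, here is a minimal sketch in which an in-memory SQLite database stands in for a warehouse source. The table `raw_users`, the query text, and the column names are assumptions made up for illustration; they are not part of SignalSmith.

```python
import sqlite3

# In-memory SQLite stands in for a warehouse source (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_users (id INTEGER, email TEXT, plan TEXT);
    INSERT INTO raw_users VALUES
        (1, 'Ada@Example.com', 'pro'),
        (2, 'bob@example.com', 'free'),
        (3, 'cy@example.com',  'pro');
""")

# Hypothetical model: a SQL query that shapes raw rows into the form needed for syncing.
MODEL_SQL = """
    SELECT id AS user_id, lower(email) AS email
    FROM raw_users
    WHERE plan = 'pro'
    ORDER BY id
"""

cursor = conn.execute(MODEL_SQL)
columns = [description[0] for description in cursor.description]  # result-set column names
rows = cursor.fetchall()                                          # result-set rows
```

The model's output is the pair of `columns` and `rows`: a result set with named columns that a sync could later map to destination fields.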
Schema & Entity Types
The schema defines the structure of your customer data. Entity types (like “User”, “Account”, “Product”) represent the core objects in your data model. Relationships connect entity types to each other (e.g., a User belongs to an Account).
Destinations
A destination is an external tool where you send data. SignalSmith supports 50+ destinations across CRM, advertising, marketing, analytics, warehouses, cloud storage, and streaming categories.
Syncs
A sync connects a model to a destination. It defines the field mapping, sync mode (upsert, mirror, append), and schedule. Sync runs track each execution with row counts and error details.
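The three sync modes can be sketched as operations on a destination's rows. The semantics below are an assumption based on the common meaning of these terms (upsert inserts or updates by key, mirror makes the destination match the model exactly, append adds rows without matching); the function `run_sync` and the `id` key are hypothetical, not SignalSmith's API.

```python
from typing import Dict, List

Row = Dict[str, object]


def run_sync(model_rows: List[Row], destination: List[Row], mode: str, key: str = "id") -> List[Row]:
    """Return the destination contents after one sync run (illustrative only)."""
    if mode == "append":
        # Append: every model row is added; existing rows are never matched or changed.
        return destination + model_rows
    if mode == "upsert":
        # Upsert: update rows whose key already exists, insert the rest, delete nothing.
        by_key = {row[key]: row for row in destination}
        for row in model_rows:
            by_key[row[key]] = row
        return list(by_key.values())
    if mode == "mirror":
        # Mirror: the destination ends up identical to the model, so missing rows are removed.
        return list({row[key]: row for row in model_rows}.values())
    raise ValueError(f"unknown sync mode: {mode}")


model_rows = [{"id": 1, "name": "Ada L."}, {"id": 3, "name": "Cy"}]
destination = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]

upserted = run_sync(model_rows, destination, "upsert")
mirrored = run_sync(model_rows, destination, "mirror")
appended = run_sync(model_rows, destination, "append")
```

With this sample data, upsert keeps Bob and updates Ada, mirror drops Bob, and append leaves two rows with `id` 1.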
Traits
A trait is a computed attribute about a customer entity. Traits can be SQL-based (custom queries), aggregation-based (count, sum, average), or formula-based (combine other traits). Traits are evaluated on a schedule and stored for audience building.
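As a rough illustration of aggregation-based and formula-based traits, the sketch below computes count, sum, and average per user and then derives a trait from them. The `orders` data, the trait names, and the thresholds are all invented for this example.

```python
from collections import defaultdict

# Hypothetical order rows, as a model might return them.
orders = [
    {"user_id": "u1", "amount": 40.0},
    {"user_id": "u1", "amount": 60.0},
    {"user_id": "u2", "amount": 25.0},
]


def aggregate_traits(rows):
    """Aggregation-based traits: count, sum, and average of order amounts per user."""
    amounts_by_user = defaultdict(list)
    for row in rows:
        amounts_by_user[row["user_id"]].append(row["amount"])
    traits = {}
    for user_id, amounts in amounts_by_user.items():
        total = sum(amounts)
        traits[user_id] = {
            "order_count": len(amounts),
            "total_spend": total,
            "avg_order_value": total / len(amounts),
        }
    return traits


def is_high_value(user_traits):
    """Formula-based trait: combines two other traits (thresholds are assumptions)."""
    return user_traits["order_count"] >= 2 and user_traits["total_spend"] >= 100


traits = aggregate_traits(orders)
high_value = {user_id: is_high_value(t) for user_id, t in traits.items()}
```

In a real deployment these values would be recomputed on the trait's schedule and stored so that audiences can filter on them.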
Audiences
An audience is a segment of customers defined by conditions on traits and attributes. The visual filter builder lets you combine conditions with AND/OR logic. Audiences support size estimation and preview before activation.
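One way to picture what the filter builder produces is a nested condition tree. The sketch below evaluates such a tree against customer profiles; the `all`/`any` node shape, the operator names, and the sample profiles are assumptions for illustration, not SignalSmith's actual definition format.

```python
import operator

# Hypothetical comparison operators available to conditions.
OPS = {"eq": operator.eq, "gt": operator.gt, "gte": operator.ge, "lt": operator.lt}


def matches(profile, node):
    """Evaluate a condition tree: 'all' means AND, 'any' means OR, otherwise a leaf condition."""
    if "all" in node:
        return all(matches(profile, child) for child in node["all"])
    if "any" in node:
        return any(matches(profile, child) for child in node["any"])
    value = profile.get(node["field"])
    return value is not None and OPS[node["op"]](value, node["value"])


# plan == "pro" AND (total_spend >= 100 OR order_count > 5)
audience_definition = {
    "all": [
        {"field": "plan", "op": "eq", "value": "pro"},
        {"any": [
            {"field": "total_spend", "op": "gte", "value": 100},
            {"field": "order_count", "op": "gt", "value": 5},
        ]},
    ]
}

profiles = {
    "u1": {"plan": "pro", "total_spend": 100, "order_count": 2},
    "u2": {"plan": "free", "total_spend": 500, "order_count": 9},
    "u3": {"plan": "pro", "total_spend": 20, "order_count": 1},
    "u4": {"plan": "pro", "total_spend": 10, "order_count": 6},
}

members = [uid for uid, profile in profiles.items() if matches(profile, audience_definition)]
estimated_size = len(members)  # a size estimate is simply the member count here
```

Counting or listing `members` before activation corresponds loosely to the size estimation and preview steps.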
Audience Syncs
An audience sync sends an audience to a destination. Unlike regular syncs that map model columns, audience syncs manage membership lists with modes like mirror (full sync), additive (only add), and subtractive (only remove).
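The difference between the three membership modes comes down to which side of a set difference is applied. The following sketch assumes membership is tracked as sets of customer IDs; the function name `plan_membership_changes` is hypothetical.

```python
from typing import Set, Tuple


def plan_membership_changes(audience: Set[str], destination: Set[str], mode: str) -> Tuple[Set[str], Set[str]]:
    """Return (to_add, to_remove) for one audience sync run (illustrative only)."""
    to_add = audience - destination      # in the audience but not yet in the destination list
    to_remove = destination - audience   # in the destination list but no longer in the audience
    if mode == "mirror":
        return to_add, to_remove         # full sync: apply both additions and removals
    if mode == "additive":
        return to_add, set()             # only add new members
    if mode == "subtractive":
        return set(), to_remove          # only remove departed members
    raise ValueError(f"unknown audience sync mode: {mode}")


audience = {"a", "b", "c"}
destination_list = {"b", "c", "d"}

mirror_plan = plan_membership_changes(audience, destination_list, "mirror")
additive_plan = plan_membership_changes(audience, destination_list, "additive")
subtractive_plan = plan_membership_changes(audience, destination_list, "subtractive")
```

Here mirror adds `a` and removes `d`, additive only adds `a`, and subtractive only removes `d`.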
Identity Resolution
Identity resolution unifies customer records across data sources using a connected-components graph algorithm. You define identifier families (email, phone, device ID), merge rules, and limit rules. The output is a set of golden records: unified customer profiles.
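A connected-components pass over shared identifiers can be sketched with a union-find structure. This is a generic illustration of the technique, not SignalSmith's implementation: it ignores merge rules and limit rules, and the record IDs and identifier values are invented.

```python
from collections import defaultdict


def resolve_identities(records):
    """Group records that share any identifier, directly or transitively (union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps the trees shallow
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Each record is linked to every identifier it carries, e.g. ("email", "ada@example.com").
    for record_id, identifiers in records.items():
        record_node = ("record", record_id)
        find(record_node)
        for family, value in identifiers.items():
            union(record_node, (family, value))

    # Records with the same root belong to the same connected component.
    clusters = defaultdict(set)
    for record_id in records:
        clusters[find(("record", record_id))].add(record_id)
    return sorted(sorted(cluster) for cluster in clusters.values())


records = {
    "crm_1": {"email": "ada@example.com", "phone": "555-0100"},
    "web_7": {"email": "ada@example.com"},
    "app_3": {"phone": "555-0100", "device_id": "d-42"},
    "crm_2": {"email": "bob@example.com"},
}

golden_record_groups = resolve_identities(records)
```

`web_7` and `app_3` share no identifier with each other, yet they land in the same group because both connect to `crm_1`. That transitive linking is what the connected-components approach provides.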
Journeys
A journey is a multi-step, multi-channel workflow that customers move through based on triggers and conditions. Journeys use a visual canvas editor with tiles for entry, wait, branch, action, and exit.
Events
The events module handles real-time event collection (Segment-compatible API), schema enforcement via contracts, transformations, and forwarding to downstream tools.
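Schema enforcement via contracts can be pictured as checking each incoming event against a declared shape. The contract format, field names, and `validate_event` helper below are assumptions for illustration only; the actual contract definition in SignalSmith may look different.

```python
# Hypothetical contract: required fields and their expected Python types.
ORDER_COMPLETED_CONTRACT = {"event": str, "user_id": str, "properties": dict}


def validate_event(event, contract):
    """Return a list of contract violations; an empty list means the event conforms."""
    errors = []
    for field, expected_type in contract.items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            errors.append(f"wrong type for {field}: expected {expected_type.__name__}")
    return errors


good_event = {"event": "Order Completed", "user_id": "u1", "properties": {"total": 42.5}}
bad_event = {"event": "Order Completed", "user_id": 42}

good_errors = validate_event(good_event, ORDER_COMPLETED_CONTRACT)
bad_errors = validate_event(bad_event, ORDER_COMPLETED_CONTRACT)
```

An event with violations could then be rejected or routed aside before any transformation or forwarding takes place.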
Loaders
Loaders pull data from SaaS applications (Salesforce, HubSpot, Stripe, etc.) into your data warehouse on a schedule, acting as ETL connectors. This is the opposite direction from syncs, which move data out of the warehouse (reverse ETL).
Govern
Govern controls who can access what and how data flows. This includes RBAC (roles, permissions, groups), destination filters (which audiences can sync where), and access filters (row-level access control).
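The sketch below shows how group-based RBAC and a row-level access filter might combine. The role names, permission strings, group assignments, and the `region` column are all hypothetical and chosen only to illustrate the concepts.

```python
# Hypothetical role-to-permission and group-to-role assignments.
ROLE_PERMISSIONS = {
    "admin": {"sync:read", "sync:write", "audience:write"},
    "viewer": {"sync:read"},
}
GROUP_ROLES = {"data-team": {"admin"}, "marketing": {"viewer"}}


def permissions_for(groups):
    """Collect every permission granted through the roles of the user's groups."""
    granted = set()
    for group in groups:
        for role in GROUP_ROLES.get(group, set()):
            granted |= ROLE_PERMISSIONS.get(role, set())
    return granted


def can(groups, permission):
    """RBAC check: is the permission granted by any of the user's groups?"""
    return permission in permissions_for(groups)


def apply_access_filter(rows, allowed_regions):
    """Row-level access control: keep only the rows the user is allowed to see."""
    return [row for row in rows if row["region"] in allowed_regions]


customers = [
    {"id": 1, "region": "EU"},
    {"id": 2, "region": "US"},
    {"id": 3, "region": "EU"},
]

marketing_can_write = can(["marketing"], "sync:write")
data_team_can_write = can(["data-team"], "sync:write")
eu_rows = apply_access_filter(customers, {"EU"})
```

A destination filter could follow the same pattern, with a lookup from audience to the set of destinations it is allowed to sync to.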
Insights
Insights provide operational dashboards for sync health, audience trends, activation coverage (how much of your audience is being reached), and audience overlap analysis.
Agent Smith
SignalSmith’s AI platform includes a natural language audience builder (describe an audience in plain English), an AI agent that can perform complex multi-step operations, and an MCP server for external AI tool integration.