Data Handling
This page details how SignalSmith processes, stores, transmits, and protects data across the platform. Understanding the data flow helps you assess SignalSmith’s fit for your security requirements and configure it appropriately for your environment.
Warehouse-Native Architecture
The most important thing to understand about SignalSmith’s data handling is the warehouse-native architecture: computation happens in your warehouse, and customer data stays under your control.
SignalSmith connects to your data warehouse (Snowflake, BigQuery, or Databricks) with credentials you provide. When it evaluates traits, builds audiences, or resolves identities, it generates SQL queries and executes them against your warehouse. The results are used to orchestrate syncs but are not stored persistently in SignalSmith’s own database.
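As a rough illustration of what "generates SQL queries and executes them against your warehouse" can look like, the sketch below compiles a simple audience filter into a parameterized query. The `Condition` shape, the `compile_audience_query` function, and the `?` placeholder style are assumptions made for this example, not SignalSmith's actual filter schema or query generator.

```python
from dataclasses import dataclass

# Hypothetical filter shape for illustration; not SignalSmith's actual schema.
@dataclass(frozen=True)
class Condition:
    column: str
    op: str
    value: object

ALLOWED_OPS = {"=", "!=", ">", ">=", "<", "<="}


def _checked_identifier(name: str) -> str:
    """Allow only dotted identifiers such as analytics.customers."""
    if not name or not all(part.isidentifier() for part in name.split(".")):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def compile_audience_query(table, id_column, conditions):
    """Build a parameterized SELECT for the warehouse to execute.

    Values travel as bind parameters, never inside the SQL text.
    The "?" placeholder is an assumption; drivers differ in placeholder style.
    """
    clauses, params = [], []
    for cond in conditions:
        if cond.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {cond.op!r}")
        clauses.append(f"{_checked_identifier(cond.column)} {cond.op} ?")
        params.append(cond.value)
    where = " AND ".join(clauses) if clauses else "TRUE"
    sql = (
        f"SELECT {_checked_identifier(id_column)} "
        f"FROM {_checked_identifier(table)} WHERE {where}"
    )
    return sql, params
```

In this model the query text and its parameters are handed to the warehouse driver, the warehouse does the filtering, and only the result set comes back to the orchestrator.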
What SignalSmith Stores
SignalSmith’s metadata database (PostgreSQL) stores:
- Configuration metadata — Model definitions, audience filter expressions, trait configurations, sync schedules, destination settings, journey canvases
- Encrypted credentials — Warehouse connection strings, destination OAuth tokens, API keys
- Operational state — Sync run status and summary metrics (row counts, error counts, timestamps), not the actual row-level data
- Audit events — User actions, API calls, sync run results, AI operations
- User accounts — Email addresses, roles, workspace memberships (authentication via Firebase/GCP Identity Platform)
What SignalSmith Does NOT Store
- Customer PII (names, emails, phone numbers, addresses)
- Transaction or behavioral data
- Raw data from your warehouse tables
- Full sync payloads (individual records being synced)
What Data Leaves the Warehouse
Data leaves your warehouse only during activation — when audience members and their mapped fields are sent to destinations. This is the core purpose of a CDP: getting the right data to the right tools.
Activation Data Flow
Warehouse ──SQL query──▶ SignalSmith ──API calls──▶ Destination
(your data)              (orchestrator)             (CRM, ads, email)

The data that flows through SignalSmith during activation includes:
- Identifier fields — The fields used to match records in the destination (e.g., email address, external ID)
- Mapped attribute fields — Only the fields you explicitly map in the sync configuration (e.g., first name, lifetime value, segment name)
Data is streamed through SignalSmith during the sync run and is not persisted after the run completes. Only summary metrics (total rows synced, errors encountered) are retained.
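The streaming behavior described above can be sketched as a loop that forwards rows in batches and keeps only counters. This is a minimal model under stated assumptions: `run_sync`, `SyncSummary`, and the `send_batch` callback (which is assumed to return the number of rows the destination accepted) are hypothetical names, not part of SignalSmith's API.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Iterable, Iterator


@dataclass
class SyncSummary:
    rows_synced: int = 0
    errors: int = 0


def batched(rows: Iterable[dict], size: int) -> Iterator[list]:
    """Yield lists of up to `size` rows without materializing the full input."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch


def run_sync(
    rows: Iterable[dict],
    send_batch: Callable[[list], int],
    batch_size: int = 500,
) -> SyncSummary:
    """Forward rows to a destination and retain only summary counters."""
    summary = SyncSummary()
    for batch in batched(rows, batch_size):
        try:
            accepted = send_batch(batch)
        except Exception:
            # Treat a failed call as every row in the batch failing.
            summary.errors += len(batch)
            continue
        summary.rows_synced += accepted
        summary.errors += len(batch) - accepted
    # Each batch goes out of scope after it is sent; only the counters remain.
    return summary
```

Because the input is consumed lazily, at most one batch of row-level data is held in memory at a time, and the returned summary contains no record contents.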
Controlling What Data Leaves
You have full control over what data reaches each destination:
- Field mapping — Only fields you explicitly map are sent. Unmapped fields stay in the warehouse.
- Destination filters — Governance policies that restrict which audiences can sync to which destinations (see Govern under Related Resources)
- Access filters — Row-level access controls that limit which records are visible to which users and syncs (see Govern under Related Resources)
- Data minimization — Send only the fields each destination needs. Don’t send full profiles when an email address suffices.
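Field mapping can be pictured as an allowlist projection: anything not named in the mapping never enters the outgoing payload. The function name, the mapping shape (warehouse column to destination field), and the sample field names below are illustrative assumptions.

```python
def project_record(record: dict, field_mapping: dict) -> dict:
    """Return only the mapped fields, renamed to destination field names.

    `field_mapping` maps warehouse column -> destination field.
    Columns absent from the mapping are dropped.
    """
    return {
        dest: record[src]
        for src, dest in field_mapping.items()
        if src in record
    }


warehouse_row = {
    "email": "ada@example.com",
    "first_name": "Ada",
    "lifetime_value": 1240.5,
    "home_address": "12 Example Street",
}
mapping = {"email": "Email", "lifetime_value": "LTV"}
payload = project_record(warehouse_row, mapping)
```

Here `payload` carries the email and lifetime value only; `first_name` and `home_address` stay in the warehouse because they were never mapped.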
Encryption
In Transit
All network connections use TLS 1.2 or higher:
| Connection | Encryption |
|---|---|
| Browser to SignalSmith UI | TLS 1.2+ (HTTPS) |
| SignalSmith to your warehouse | TLS 1.2+ (enforced by warehouse provider) |
| SignalSmith to destinations | TLS 1.2+ (HTTPS API calls) |
| Internal service communication | TLS 1.2+ |
| MCP server connections | TLS 1.2+ |
At Rest
Sensitive data stored in SignalSmith’s metadata database is encrypted:
| Data Type | Encryption Method |
|---|---|
| Warehouse credentials | AES-256 encryption |
| Destination OAuth tokens | AES-256 encryption |
| Destination API keys | AES-256 encryption |
| User API keys | bcrypt hash (one-way, irreversible) |
Encryption keys are managed through your deployment’s key management configuration. In cloud deployments, keys are stored in a cloud KMS (Key Management Service).
Credential Storage
Credentials are treated as the most sensitive data in SignalSmith’s metadata store:
- Encrypted at rest — All credentials are encrypted with AES-256 before being written to the database
- Never logged — Credentials are excluded from application logs, error messages, and stack traces
- Never returned via API — API responses redact credential values. Once set, a credential can be updated but not read back.
- Access controlled — Only Admin-role users can view or modify credential configurations
- Rotation supported — Credentials can be updated without disrupting existing syncs (the new credential takes effect on the next run)
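The "never returned via API" rule amounts to redacting secret values before a configuration object is serialized. The sketch below shows one way to do that; the key names in `SENSITIVE_KEYS` and the mask string are assumptions chosen for illustration, not SignalSmith's actual field names.

```python
# Hypothetical set of secret-bearing keys; real field names may differ.
SENSITIVE_KEYS = {"password", "private_key", "token", "api_key", "client_secret"}
REDACTED = "********"


def redact(config):
    """Return a copy of `config` with secret values masked at any depth."""
    if isinstance(config, dict):
        return {
            key: REDACTED if key in SENSITIVE_KEYS else redact(value)
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [redact(item) for item in config]
    return config


stored = {
    "type": "snowflake",
    "account": "acme",
    "auth": {"username": "svc_signalsmith", "password": "s3cret"},
}
response_body = redact(stored)
```

The function builds a new structure rather than mutating its input, so the stored configuration is untouched and only the response copy is masked.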
Warehouse Connection Credentials
For each supported warehouse:
| Warehouse | Credential Type | Storage |
|---|---|---|
| Snowflake | Username/password or key pair | AES-256 encrypted |
| BigQuery | Service account JSON key | AES-256 encrypted |
| Databricks | Personal access token | AES-256 encrypted |
Destination Credentials
| Auth Method | Storage |
|---|---|
| OAuth 2.0 tokens | AES-256 encrypted, auto-refreshed |
| API keys | AES-256 encrypted |
| Basic auth | AES-256 encrypted |
Audit Logging
SignalSmith maintains a comprehensive audit log of all actions performed in the platform. Every audit event records:
| Field | Description |
|---|---|
| timestamp | When the action occurred (UTC) |
| actor_email | The user who performed the action |
| actor_id | The user’s unique identifier |
| action | The action performed (e.g., create, update, delete, trigger, login) |
| resource_type | The type of resource affected (e.g., audience, sync, destination) |
| resource_id | The unique identifier of the affected resource |
| details | Additional context about the action (parameters, changes, error messages) |
| source | How the action was initiated (ui, api, ai_agent, schedule) |
| workspace_id | The workspace context |
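For consumers that process audit events in code, the fields above map naturally onto a typed record. The class below is a sketch: the field names come from the table, while the Python types, the validation rules, and the JSON layout are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

ALLOWED_SOURCES = {"ui", "api", "ai_agent", "schedule"}


@dataclass(frozen=True)
class AuditEvent:
    timestamp: datetime
    actor_email: str
    actor_id: str
    action: str
    resource_type: str
    resource_id: str
    source: str
    workspace_id: str
    details: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")
        if self.timestamp.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware")

    def to_json(self) -> str:
        """Serialize with the timestamp normalized to UTC, ISO 8601."""
        payload = asdict(self)
        payload["timestamp"] = self.timestamp.astimezone(timezone.utc).isoformat()
        return json.dumps(payload, sort_keys=True)
```

Rejecting naive timestamps up front keeps every serialized event unambiguous about its time zone, which matters when events from the UI, API, and scheduler are merged.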
Audited Actions
| Category | Actions Logged |
|---|---|
| Authentication | Login, logout, API key creation, API key rotation |
| Warehouses | Create, update, delete, test connection |
| Models | Create, update, delete, preview |
| Audiences | Create, update, delete, estimate, evaluate |
| Syncs | Create, update, delete, trigger, run start, run complete, run error |
| Destinations | Create, update, delete, reconnect |
| AI operations | Agent messages, tool calls, guardrail triggers, approvals |
| Govern | Role changes, destination filter changes, access filter changes |
| Settings | Workspace settings changes, user invitations |
Audit Log Destinations
Audit events are written to two locations:
- SignalSmith internal store — Queryable via the UI (Settings > Audit Log) and API (GET /api/v1/audit-log)
- Your warehouse — Written to the CDP_AUDIT.AUDIT_LOG table, where you can query them with SQL, join with other data, and apply your own retention policies
Data Retention
SignalSmith supports configurable retention for operational data:
| Data Type | Default Retention | Configurable |
|---|---|---|
| Sync run history | 90 days | Yes |
| Audit log (internal) | 1 year | Yes |
| Audit log (warehouse) | Your warehouse retention policy | N/A (you control it) |
| Audience evaluation snapshots | 90 days | Yes |
| Event data | 30 days | Yes |
| AI conversation history | 30 days | Yes |
Retention settings are configurable per workspace in Settings > Data Retention. Data beyond the retention period is automatically purged from SignalSmith’s internal store. Data in your warehouse is governed by your own retention policies.
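The purge rule can be expressed as a cutoff timestamp per data type: anything created before the cutoff is eligible for removal. The sketch below uses the defaults from the table; the dictionary keys, the function names, and the reading of "1 year" as 365 days are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Defaults from the retention table; "1 year" is assumed to mean 365 days.
DEFAULT_RETENTION_DAYS = {
    "sync_run_history": 90,
    "audit_log_internal": 365,
    "audience_evaluation_snapshots": 90,
    "event_data": 30,
    "ai_conversation_history": 30,
}


def purge_cutoff(data_type: str, now: datetime, overrides: dict = None) -> datetime:
    """Return the timestamp before which records of `data_type` are purged."""
    days = {**DEFAULT_RETENTION_DAYS, **(overrides or {})}[data_type]
    return now - timedelta(days=days)


def is_expired(created_at: datetime, data_type: str, now: datetime,
               overrides: dict = None) -> bool:
    """True when a record is older than its retention window."""
    return created_at < purge_cutoff(data_type, now, overrides)
```

The `overrides` argument stands in for per-workspace settings: a workspace that shortens event data retention to 7 days would pass `{"event_data": 7}`.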
API Key Security
API keys provide programmatic access to SignalSmith’s REST API and MCP server.
Key Lifecycle
- Creation — An Admin generates a key in Settings > API Keys. The full key is displayed once and must be copied immediately.
- Storage — The key is hashed with bcrypt and stored. The original key cannot be retrieved.
- Usage — Pass the key in the Authorization: Bearer header. Each request is validated against the stored hash.
- Rotation — Generate a new key and revoke the old one. Active syncs and integrations should be updated to use the new key.
- Revocation — Revoke a key immediately to block all requests using it.
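On the client side, using a key means attaching it as a Bearer token to each request. The sketch below builds (but does not send) a request to the audit log endpoint using only the Python standard library. The base URL is a placeholder and the function name is hypothetical; substitute your deployment's hostname.

```python
import urllib.request

# Placeholder host; replace with your SignalSmith deployment's base URL.
BASE_URL = "https://signalsmith.example.com"


def build_audit_log_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for the audit log with a Bearer token attached."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/audit-log",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


request = build_audit_log_request("sk_test_example")
# To send it: urllib.request.urlopen(request, timeout=10)
```

In practice the key would be read from a secret store or environment variable rather than written into source code, which also makes rotation a configuration change instead of a code change.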
Key Properties
| Property | Description |
|---|---|
| Prefix | Keys start with sk_live_ (production) or sk_test_ (development) |
| Scope | Each key is scoped to a single workspace |
| Permissions | Keys inherit the permissions of the user who created them |
| Last used | The dashboard shows when each key was last used |
| Expiration | Optional expiration date can be set at creation |
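The key prefix makes it possible to guard against using a production key in the wrong place, for example in a test suite. A minimal sketch, assuming only the two prefixes listed above; the function name is hypothetical.

```python
def key_environment(api_key: str) -> str:
    """Classify an API key by its documented prefix."""
    if api_key.startswith("sk_live_"):
        return "production"
    if api_key.startswith("sk_test_"):
        return "development"
    raise ValueError("unrecognized API key prefix")


def require_test_key(api_key: str) -> str:
    """Return the key unchanged, or raise if it is not a development key."""
    if key_environment(api_key) != "development":
        raise ValueError("refusing to use a non-development key here")
    return api_key
```

A test harness could call `require_test_key` once at startup so that a misconfigured environment fails fast instead of writing to a production workspace.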
Related Resources
- Compliance — GDPR, CCPA, and SOC 2 compliance support
- Govern — RBAC, destination filters, and access filters
- AI Guardrails — Safety controls for AI operations
- API Reference — API authentication details