
Syncs

Syncs are the data movement engine in SignalSmith. A sync connects a Model (your data) to a Destination (where you want to send it) and runs on a schedule to keep the destination up to date.

What Is a Sync?

A sync is a configured pipeline that:

  1. Executes a model’s SQL query against your warehouse
  2. Compares the results to the previous run to detect changes (new, updated, deleted records)
  3. Sends those changes to the destination using the appropriate API
  4. Records the results (rows synced, errors, duration) for monitoring

Syncs are the final step in SignalSmith’s data pipeline:

Source → Model → Sync → Destination

Key Concepts

Sync Mode

The sync mode determines how SignalSmith handles data at the destination:

Mode     Behavior
Upsert   Create new records or update existing ones. Never deletes.
Mirror   Keep the destination in perfect sync: creates, updates, and deletes records.
Append   Only add new records. Never updates or deletes.

Field Mapping

Field mapping defines how columns from your model map to fields in the destination. For example, the model column email might map to the Salesforce field Email, or the model column lifetime_value might map to a HubSpot property ltv.
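As a rough illustration of what a field mapping does to a single row, the sketch below renames model columns to destination fields. The mapping shape mirrors the field_mappings array in the create-sync example on this page; the function name, the choice to drop unmapped columns, and the error on missing columns are assumptions made for illustration, not documented SignalSmith behavior.

```python
from typing import Any

# Hypothetical mapping, in the same shape as the "field_mappings" array
# used in the create-sync example on this page.
FIELD_MAPPINGS = [
    {"source": "email", "destination": "Email"},
    {"source": "lifetime_value", "destination": "LTV__c"},
]


def apply_field_mappings(row: dict[str, Any], mappings: list[dict[str, str]]) -> dict[str, Any]:
    """Rename model columns to destination fields.

    Assumption: columns without a mapping are dropped, and a mapped column
    that is absent from the row is treated as an error.
    """
    missing = [m["source"] for m in mappings if m["source"] not in row]
    if missing:
        raise KeyError(f"model row is missing mapped columns: {missing}")
    return {m["destination"]: row[m["source"]] for m in mappings}


model_row = {"email": "ada@example.com", "lifetime_value": 1250.0, "plan": "pro"}
record = apply_field_mappings(model_row, FIELD_MAPPINGS)
print(record)  # {'Email': 'ada@example.com', 'LTV__c': 1250.0}
```

The unmapped plan column does not reach the destination record in this sketch, which is one reason to check the mapping with a manual run before scheduling.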

Scheduling

Scheduling determines when and how often the sync runs. Options include cron expressions for precise timing and interval-based schedules for regular cadences.
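To make the interval option concrete, here is a minimal sketch of how the next due time could be derived from an interval schedule. The schedule dictionary matches the shape used in the create-sync example on this page; measuring the interval from the previous run's start time is an assumption, and cron schedules are left out because their payload shape is not shown here.

```python
from datetime import datetime, timedelta, timezone


def next_interval_run(schedule: dict, last_run: datetime) -> datetime:
    """Return when an interval schedule is next due.

    Assumption: the interval is counted from the start of the last run.
    """
    if schedule.get("type") != "interval":
        raise ValueError("this sketch only handles interval schedules")
    return last_run + timedelta(minutes=schedule["interval_minutes"])


schedule = {"type": "interval", "interval_minutes": 60}
last_run = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
print(next_interval_run(schedule, last_run).isoformat())  # 2024-01-01T10:00:00+00:00
```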

Sync Runs

Each execution of a sync produces a sync run record with detailed information about what happened — how many rows were processed, how many were created/updated/deleted, any errors, and the total duration.

How Syncs Work

Step 1: Query Execution

When a sync runs, SignalSmith executes the model’s SQL query against the source warehouse. The full result set is retrieved and staged for comparison.

Step 2: Change Detection

SignalSmith compares the current result set with the previous run’s snapshot using the model’s primary key:

  • New rows — Primary key values present in the current run but not the previous run
  • Updated rows — Primary key values present in both runs but with different attribute values
  • Deleted rows — Primary key values present in the previous run but not the current run (relevant for mirror mode only)
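The three categories above amount to a keyed diff between two snapshots. The sketch below shows one way to compute it in memory; the function name and the row representation (a list of dictionaries) are illustrative assumptions, and a real implementation would likely work on staged data rather than Python lists.

```python
from typing import Any

Row = dict[str, Any]


def detect_changes(previous: list[Row], current: list[Row], primary_key: str):
    """Classify rows as new, updated, or deleted by comparing two snapshots."""
    prev_by_key = {row[primary_key]: row for row in previous}
    curr_by_key = {row[primary_key]: row for row in current}

    # Present now, absent before.
    new = [row for key, row in curr_by_key.items() if key not in prev_by_key]
    # Present in both runs, but at least one attribute value differs.
    updated = [
        row for key, row in curr_by_key.items()
        if key in prev_by_key and row != prev_by_key[key]
    ]
    # Present before, absent now (only acted on in mirror mode).
    deleted = [row for key, row in prev_by_key.items() if key not in curr_by_key]
    return new, updated, deleted


previous = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
current = [{"id": 2, "email": "b@new.example.com"}, {"id": 3, "email": "c@example.com"}]

new, updated, deleted = detect_changes(previous, current, primary_key="id")
print([r["id"] for r in new], [r["id"] for r in updated], [r["id"] for r in deleted])
# [3] [2] [1]
```

Because rows are matched only by primary key, a key that is not unique or not stable between runs would show up as spurious deletes and creates.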

Step 3: Destination Writes

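Based on the sync mode, SignalSmith sends the appropriate operations to the destination API. Upsert sends creates and updates, mirror sends creates, updates, and deletes, and append sends creates only: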

  • Creates — New records are inserted into the destination
  • Updates — Existing records are updated with changed attribute values
  • Deletes — Records are removed from the destination (mirror mode only)

Operations are sent in batches for efficiency, with automatic retry logic for transient failures.
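The following sketch ties together mode filtering, batching, and retries. The allowed operations per mode come from the Sync Mode table; everything else (the batch size, the retry count, exponential backoff, and the TransientError class) is a hypothetical choice for illustration, since this page does not document SignalSmith's actual values.

```python
import time
from typing import Callable, Iterator

# Operations each sync mode is allowed to send, per the Sync Mode table.
ALLOWED_OPERATIONS = {
    "upsert": {"create", "update"},
    "mirror": {"create", "update", "delete"},
    "append": {"create"},
}


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def batches(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_with_retry(send: Callable[[list], None], batch: list,
                    max_attempts: int = 3, base_delay: float = 0.5) -> None:
    """Send one batch, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            send(batch)
            return
        except TransientError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


def write_changes(mode: str, changes: dict, send: Callable[[str, list], None],
                  batch_size: int = 100, base_delay: float = 0.5) -> None:
    """Send only the operations the mode permits, one batch at a time."""
    for operation, rows in changes.items():
        if operation not in ALLOWED_OPERATIONS[mode]:
            continue  # e.g. deletes are skipped outside mirror mode
        for batch in batches(rows, batch_size):
            send_with_retry(lambda b: send(operation, b), batch, base_delay=base_delay)


# Fake destination that fails once, then succeeds.
sent = []
failures = {"remaining": 1}


def fake_send(operation: str, batch: list) -> None:
    if failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise TransientError("simulated timeout")
    sent.append((operation, len(batch)))


changes = {
    "create": [{"id": 1}, {"id": 2}, {"id": 3}],
    "update": [{"id": 4}],
    "delete": [{"id": 5}],
}
write_changes("upsert", changes, fake_send, batch_size=2, base_delay=0)
print(sent)  # [('create', 2), ('create', 1), ('update', 1)]
```

In upsert mode the delete is never sent, and the first batch succeeds on its second attempt after the simulated transient failure.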

Step 4: Results Recording

After the sync completes, SignalSmith records:

  • Total rows processed
  • Rows created, updated, and deleted
  • Rows that failed with errors
  • Total duration
  • Any error details for failed rows
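One way to picture a run record is as a small data structure holding the figures listed above. The class and field names below are hypothetical and are not the actual response schema of the runs endpoints; the derived failure rate is an example of a monitoring value you might compute from it.

```python
from dataclasses import dataclass, field


@dataclass
class SyncRunResult:
    """Hypothetical shape for the figures recorded after a run."""
    rows_processed: int
    rows_created: int = 0
    rows_updated: int = 0
    rows_deleted: int = 0
    rows_failed: int = 0
    duration_seconds: float = 0.0
    errors: list[dict] = field(default_factory=list)  # details for failed rows

    @property
    def failure_rate(self) -> float:
        """Share of processed rows that failed (0.0 when nothing was processed)."""
        return self.rows_failed / self.rows_processed if self.rows_processed else 0.0


run = SyncRunResult(
    rows_processed=1000,
    rows_created=40,
    rows_updated=10,
    rows_failed=5,
    duration_seconds=12.4,
    errors=[{"row": "cus_17", "message": "missing required field"}],
)
print(f"{run.failure_rate:.1%}")  # 0.5%
```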

Sync Lifecycle

1. Created

The sync is configured with a model, destination, field mapping, sync mode, and schedule. It is saved but has not yet run.

2. Active

The sync is running on its configured schedule. Each execution produces a sync run record.

3. Paused

The sync is temporarily stopped. No new runs are triggered, but historical run data is preserved. You can resume the sync at any time.

4. Error State

If a sync encounters persistent failures (for example, expired destination credentials or a failing model query), it enters an error state. The sync is effectively paused until the underlying issue is resolved.

5. Archived

The sync is no longer in use and has been archived. It can be restored later.
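The lifecycle can be read as a small state machine. The sketch below encodes the five states from this section; the transition table is an assumption inferred from the descriptions (for instance, that a restored sync comes back paused rather than active) and is not taken from SignalSmith itself.

```python
from enum import Enum


class SyncState(Enum):
    CREATED = "created"
    ACTIVE = "active"
    PAUSED = "paused"
    ERROR = "error"
    ARCHIVED = "archived"


# Assumed transitions, inferred from the state descriptions in this section.
ALLOWED_TRANSITIONS = {
    SyncState.CREATED: {SyncState.ACTIVE, SyncState.ARCHIVED},
    SyncState.ACTIVE: {SyncState.PAUSED, SyncState.ERROR, SyncState.ARCHIVED},
    SyncState.PAUSED: {SyncState.ACTIVE, SyncState.ARCHIVED},
    SyncState.ERROR: {SyncState.ACTIVE, SyncState.ARCHIVED},
    SyncState.ARCHIVED: {SyncState.PAUSED},  # assumption: restore lands in paused
}


def transition(current: SyncState, target: SyncState) -> SyncState:
    """Return the new state, or raise if the move is not in the assumed table."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


state = SyncState.CREATED
state = transition(state, SyncState.ACTIVE)
state = transition(state, SyncState.PAUSED)
print(state.value)  # paused
```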

API Reference

Syncs are managed through the SignalSmith REST API:

# List all syncs
GET /api/v1/syncs
 
# Get a single sync
GET /api/v1/syncs/{id}
 
# Create a sync
POST /api/v1/syncs
 
# Update a sync
PUT /api/v1/syncs/{id}
 
# Delete a sync
DELETE /api/v1/syncs/{id}
 
# Trigger a manual run
POST /api/v1/syncs/{id}/trigger
 
# Pause a sync
POST /api/v1/syncs/{id}/pause
 
# Resume a sync
POST /api/v1/syncs/{id}/resume
 
# List sync runs
GET /api/v1/syncs/{id}/runs
 
# Get a specific run
GET /api/v1/syncs/{id}/runs/{run_id}
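For callers who prefer a script to raw HTTP, the endpoints above can be wrapped in a few helpers. This sketch uses only Python's standard library and builds requests without sending them; the helper names and the syn_123 ID are hypothetical, the host is the placeholder workspace URL from the curl example on this page, and bearer-token authentication is assumed from that same example.

```python
import json
import urllib.request
from typing import Optional

# Placeholder workspace host, as in the curl example on this page.
BASE_URL = "https://your-workspace.signalsmith.dev/api/v1"


def build_request(method: str, path: str, token: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated request to the syncs API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {token}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def trigger_run(sync_id: str, token: str) -> urllib.request.Request:
    """POST /api/v1/syncs/{id}/trigger"""
    return build_request("POST", f"/syncs/{sync_id}/trigger", token)


def list_runs(sync_id: str, token: str) -> urllib.request.Request:
    """GET /api/v1/syncs/{id}/runs"""
    return build_request("GET", f"/syncs/{sync_id}/runs", token)


req = trigger_run("syn_123", "example-token")  # hypothetical sync ID and token
print(req.get_method(), req.full_url)
# POST https://your-workspace.signalsmith.dev/api/v1/syncs/syn_123/trigger

# To actually send a request: urllib.request.urlopen(req)
```

Pause, resume, and the other endpoints follow the same pattern with a different method and path.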

Example: Create a Sync

curl -X POST https://your-workspace.signalsmith.dev/api/v1/syncs \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Active Customers to Salesforce",
    "model_id": "mod_abc123",
    "destination_id": "dst_xyz789",
    "mode": "upsert",
    "schedule": {
      "type": "interval",
      "interval_minutes": 60
    },
    "field_mappings": [
      {"source": "email", "destination": "Email"},
      {"source": "first_name", "destination": "FirstName"},
      {"source": "last_name", "destination": "LastName"},
      {"source": "lifetime_value", "destination": "LTV__c"}
    ]
  }'

Best Practices

  • Start with manual runs: Before setting up a schedule, trigger a manual run to verify that the sync works correctly with your field mapping.
  • Use upsert mode first: Upsert never deletes, which makes it the safest mode. Only use mirror mode when you are confident that records missing from the model should be deleted from the destination.
  • Monitor initial runs: Watch the first few sync runs to catch mapping issues or unexpected data transformations.
  • Set appropriate schedules: Match the sync frequency to your business needs. Not everything needs to sync every hour.
  • Handle errors promptly: Review failed rows and fix the underlying issues (data format mismatches, missing required fields, API limits).
  • Use descriptive names: Name syncs after their purpose, such as “CRM Contacts - Daily” or “Google Ads Audience - Hourly”.

Next Steps