Creating a Loader
This guide walks you through setting up a new loader in SignalSmith. A loader connects to a SaaS application, extracts data from selected objects, and writes it into your data warehouse on a recurring schedule.
Prerequisites
Before creating a loader, ensure you have:
- A SignalSmith workspace with appropriate permissions (Owner, Admin, or a role with the `loaders:create` permission)
- A configured Warehouse (the target warehouse where loader data will be written)
- Credentials for the SaaS application you want to connect (OAuth access or API key)
- Write permissions on the target schema in your warehouse
Step-by-Step Guide
Step 1: Navigate to Loaders
- Log in to your SignalSmith workspace
- Click Loaders in the left sidebar
- Click the Add Loader button in the top-right corner
Step 2: Select the Source Application
Choose the SaaS application you want to pull data from. SignalSmith supports 15+ connectors across CRM, marketing, advertising, payments, support, and productivity categories.
Each connector has its own authentication flow and available objects. See the individual connector guides for detailed setup instructions.
Step 3: Authenticate
Depending on the connector, you’ll authenticate using one of the following methods:
| Method | Flow | Connectors |
|---|---|---|
| OAuth 2.0 | Click “Connect” and authorize via the application’s login page. SignalSmith handles token storage and refresh. | Salesforce, HubSpot, Google Ads, Facebook Ads, LinkedIn Ads, Shopify, Intercom, GitHub, Slack |
| API Key | Paste your API key or secret directly into the configuration form. | Stripe, HubSpot (alternative), Zendesk |
| OAuth 2.0 Client Credentials | Provide your client ID and client secret. SignalSmith exchanges them for an access token. | Marketo |
| API Token | Provide a personal or workspace API token along with your account identifier. | Zendesk, Jira |
| Personal Access Token (PAT) | Generate a token in the application’s developer settings and paste it into SignalSmith. | GitHub (alternative), Jira (alternative) |
All credentials are encrypted at rest using AES-256 encryption.
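When creating a loader programmatically, the credentials from this step go into the request's `auth` object. The OAuth 2.0 shape below matches the Salesforce example in Using the API later in this guide; field names for the other methods (API key, client credentials, tokens) vary by connector, so check the individual connector guides:

```json
{
  "auth": {
    "type": "oauth2",
    "access_token": "...",
    "refresh_token": "...",
    "instance_url": "https://mycompany.my.salesforce.com"
  }
}
```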
Step 4: Discover and Select Objects
After authentication, SignalSmith discovers the available objects and streams from the connected application. You’ll see a list of all available objects with metadata:
- Object name — The API name of the object (e.g., `Contact`, `deals`, `charges`)
- Record count — Estimated number of records (when available from the API)
- Sync mode — Whether the object supports incremental sync or requires full refresh
- Cursor field — The field used for incremental sync (e.g., `updated_at`, `SystemModstamp`)
Select the objects you want to sync. You don’t need to select everything — choose only the objects that are relevant to your use case to minimize API usage and warehouse storage.
For each selected object, you can optionally:
- Include/exclude fields — Deselect fields you don’t need to reduce table width and storage
- Rename the target table — Override the default table name in your warehouse
- Set primary key — Specify which field(s) uniquely identify a record for deduplication
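In an API request, these per-object options map onto entries in the `objects` array. The `name`, `sync_mode`, `cursor_field`, and `primary_key` keys appear in the full example under Using the API; the `fields` and `table_name` keys sketched here for field selection and table renaming are assumptions — verify the exact key names against the API reference:

```json
{
  "name": "Contact",
  "sync_mode": "incremental",
  "cursor_field": "SystemModstamp",
  "primary_key": ["Id"],
  "fields": ["Id", "Email", "LastName", "SystemModstamp"],
  "table_name": "contacts"
}
```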
Step 5: Configure the Target Warehouse
Specify where loader data should be written:
| Setting | Description | Example |
|---|---|---|
| Target Source | The warehouse source connection to write into | Production Snowflake |
| Target Schema | The schema where tables will be created | SALESFORCE_RAW |
| Table Prefix | Optional prefix for all table names created by this loader | sf_ |
SignalSmith creates tables automatically. If a table already exists, the loader merges data based on the sync mode and primary key configuration.
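In an API request these settings form the `target` object, as shown in the full example under Using the API. With the configuration below, a synced `Contact` object would land in a table carrying the `sf_` prefix (exact table name casing depends on the warehouse):

```json
{
  "target": {
    "source_id": "src_abc123",
    "schema": "SALESFORCE_RAW",
    "table_prefix": "sf_"
  }
}
```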
Step 6: Configure the Schedule
Set the sync frequency:
| Interval | Best For |
|---|---|
| Every 15 minutes | Critical operational data (e.g., support tickets, orders) |
| Hourly | General-purpose, balances freshness with API efficiency |
| Every 6 hours | Operational data that doesn’t need real-time freshness |
| Daily | Reference data, large full-refresh tables, cost-sensitive workloads |
| Custom cron | Advanced scheduling needs (e.g., `0 2 * * 1-5` for weekday 2 AM runs) |
All schedules are evaluated in UTC. You can also choose to leave the schedule paused and trigger runs manually.
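In an API request the schedule is an object with an `interval` key, as in the example under Using the API. For custom cron schedules, a `cron` key is a plausible shape — this key name is an assumption, so confirm it against the API reference (and remember the expression is evaluated in UTC):

```json
{
  "schedule": {
    "cron": "0 2 * * 1-5"
  }
}
```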
Step 7: Name and Save
Give your loader a descriptive name (e.g., “Salesforce Production” or “Stripe Payments”) and click Save.
SignalSmith will:
- Create the target tables in your warehouse
- Run an initial full sync to backfill historical data
- Begin the recurring schedule for incremental syncs
The duration of the initial sync depends on data volume; large backfills can take hours. You can monitor progress from the loader’s detail page.
Using the API
You can create loaders programmatically via the REST API:
```shell
curl -X POST https://your-workspace.signalsmith.dev/api/v1/loaders \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Salesforce Production",
    "connector": "salesforce",
    "auth": {
      "type": "oauth2",
      "access_token": "...",
      "refresh_token": "...",
      "instance_url": "https://mycompany.my.salesforce.com"
    },
    "objects": [
      {
        "name": "Contact",
        "sync_mode": "incremental",
        "cursor_field": "SystemModstamp",
        "primary_key": ["Id"]
      },
      {
        "name": "Account",
        "sync_mode": "incremental",
        "cursor_field": "SystemModstamp",
        "primary_key": ["Id"]
      }
    ],
    "target": {
      "source_id": "src_abc123",
      "schema": "SALESFORCE_RAW",
      "table_prefix": "sf_"
    },
    "schedule": {
      "interval": "hourly"
    }
  }'
```

The API response includes the created loader with its `id` and initial run status:

```json
{
  "id": "ldr_xyz789",
  "name": "Salesforce Production",
  "connector": "salesforce",
  "status": "running",
  "objects": ["Contact", "Account"],
  "schedule": {
    "interval": "hourly",
    "next_run_at": "2025-01-15T12:00:00Z"
  },
  "created_at": "2025-01-15T11:00:00Z"
}
```

Managing Loaders
Editing a Loader
To modify an existing loader:
- Navigate to Loaders in the sidebar
- Click on the loader you want to edit
- Modify objects, schedule, or target configuration
- Click Save
Adding new objects triggers a backfill for those objects. Removing objects does not delete the corresponding warehouse tables — you must drop them manually if desired.
Pausing and Resuming
You can pause a loader to temporarily stop scheduled runs without losing cursor state. Click Pause on the loader’s detail page. Resume when ready — the next run picks up exactly where it left off.
Manual Runs
Click Run Now to trigger an immediate sync outside the regular schedule. This is useful for testing configuration changes or backfilling after a pause.
Deleting a Loader
Deleting a loader stops all scheduled runs and removes the loader configuration. Existing data in your warehouse is not affected — tables and data remain until you manually clean them up.
Common Issues
| Issue | Solution |
|---|---|
| OAuth token expired | Re-authenticate by clicking “Reconnect” on the loader detail page |
| API rate limit exceeded | Reduce sync frequency or select fewer objects |
| Target schema doesn’t exist | Create the schema in your warehouse before saving the loader |
| “Permission denied” writing to warehouse | Ensure the source connection user has write access to the target schema |
| Initial sync taking too long | Large datasets may take hours for the initial backfill — subsequent incremental syncs will be much faster |
| Missing records after sync | Verify the cursor field is correctly set and that the source API returns expected data for the date range |
Next Steps
- Choose a connector: Salesforce | HubSpot | Stripe | Zendesk
- Monitor your data pipeline from the Loaders dashboard
- Create a model using the loaded data