Loaders

Loaders pull data from SaaS applications and third-party services into your data warehouse on a recurring schedule. Instead of building and maintaining custom ETL pipelines, you configure a loader in SignalSmith, and it handles extraction, schema mapping, and incremental synchronization automatically.

What Is a Loader?

A Loader is a managed ingestion pipeline that connects to a SaaS application’s API, extracts data from one or more objects (e.g., Salesforce Contacts, Stripe Subscriptions), and writes it into your data warehouse. Once the data is in your warehouse, it becomes available to Warehouses, Models, and the rest of the SignalSmith platform.

Loaders complement Warehouses. While a Warehouse gives SignalSmith read access to data already in your warehouse, a Loader brings external data into your warehouse in the first place.

Supported Connectors

SignalSmith supports 15+ loader connectors across CRM, marketing automation, advertising, payments, support, and productivity categories.

| Connector | Category | Authentication | Key Objects |
|---|---|---|---|
| Salesforce | CRM | OAuth 2.0 | Contacts, Leads, Accounts, Opportunities |
| HubSpot | CRM | OAuth 2.0 / API Key | Contacts, Companies, Deals |
| Stripe | Payments | API Key | Customers, Charges, Subscriptions |
| Zendesk | Support | API Token | Tickets, Users, Organizations |
| Intercom | Support | OAuth 2.0 | Contacts, Companies, Conversations |
| Marketo | Marketing | OAuth 2.0 | Leads, Lists, Programs, Activities |
| Google Ads | Advertising | OAuth 2.0 | Campaigns, Ad Groups, Performance |
| Facebook Ads | Advertising | OAuth 2.0 | Campaigns, Ad Sets, Insights |
| LinkedIn Ads | Advertising | OAuth 2.0 | Campaigns, Creatives, Analytics |
| Shopify | E-commerce | OAuth 2.0 | Orders, Products, Customers |
| GitHub | Developer | OAuth 2.0 / PAT | Repos, Issues, Pull Requests |
| Jira | Project Management | OAuth 2.0 / API Token | Issues, Projects, Sprints |
| Google Sheets | Productivity | OAuth 2.0 | Spreadsheet Tabs |
| Slack | Collaboration | OAuth 2.0 | Messages, Channels, Users |

How Loaders Work

1. Connect

Authenticate with the source application using OAuth 2.0, API keys, or access tokens. SignalSmith securely stores credentials and handles token refresh automatically.

2. Discover

Once connected, SignalSmith discovers the available objects and streams from the application’s API. You select which objects to sync — there’s no need to extract everything.

3. Map

SignalSmith automatically maps the source application’s schema to warehouse-compatible table definitions. Each selected object becomes a table in your target schema. You can customize column names, types, and which fields to include or exclude.

4. Schedule

Configure a sync schedule that determines how frequently data is pulled. Options range from every 15 minutes to daily. SignalSmith uses incremental sync by default, pulling only records that have changed since the last run.

5. Load

On each scheduled run, SignalSmith extracts changed records from the source API, transforms them into the target schema, and writes them to your warehouse. Failed runs are automatically retried, and you can monitor progress from the Loaders dashboard.
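
The automatic retry behavior mentioned above can be sketched as a simple exponential-backoff loop. This is an illustration only — the function name, attempt count, and delays are assumptions, not SignalSmith's actual retry policy:

```python
import time

def run_with_retries(run, max_attempts=3, base_delay=1.0):
    """Retry a loader run with exponential backoff (illustrative policy)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run()
        except Exception:
            if attempt == max_attempts:
                raise                      # give up after the final attempt
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulate a run that hits a rate limit twice, then succeeds.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("rate limited")
    return "loaded"

result = run_with_retries(flaky, base_delay=0.01)
```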

Sync Modes

Loaders support two primary sync modes:

| Mode | Description | Use Case |
|---|---|---|
| Full Refresh | Replaces the entire table with a fresh extract on each run | Small reference tables, lookup data, or when the source API doesn't support change tracking |
| Incremental | Pulls only new and updated records since the last successful sync | Large transaction tables, event streams, or any dataset with a reliable updated_at timestamp |

Incremental sync uses a cursor field (usually a timestamp like updated_at or modified_date) to track progress. SignalSmith persists the cursor value between runs, so each execution picks up exactly where the last one left off.
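
The cursor mechanism can be sketched in a few lines. The function below is a hypothetical helper, not SignalSmith internals; it assumes records carry an ISO-8601 updated_at field, which compares correctly as a string:

```python
def incremental_extract(records, cursor):
    """Return records modified after `cursor`, plus the new cursor value.

    `cursor` is the value persisted after the previous run (None on the
    first run, which therefore pulls everything).
    """
    changed = [r for r in records if cursor is None or r["updated_at"] > cursor]
    # Advance the cursor to the newest timestamp seen in this batch.
    new_cursor = max((r["updated_at"] for r in changed), default=cursor)
    return changed, new_cursor

rows = [
    {"id": 1, "updated_at": "2024-05-01T10:00:00Z"},
    {"id": 2, "updated_at": "2024-05-02T09:30:00Z"},
]
# First run: everything is new.
batch, cursor = incremental_extract(rows, None)
# Second run: only records newer than the stored cursor are pulled.
rows.append({"id": 3, "updated_at": "2024-05-03T08:00:00Z"})
batch, cursor = incremental_extract(rows, cursor)
```

Because the cursor is persisted between runs, a paused or failed loader resumes from the last committed value rather than re-extracting history.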

Target Warehouse Configuration

Loader data is written to a schema in your connected data warehouse. You configure the target location when creating a loader:

| Setting | Description | Example |
|---|---|---|
| Source | The warehouse source connection to write into | Production Snowflake |
| Schema | The target schema for loader tables | SALESFORCE_RAW, HUBSPOT_DATA |
| Table Prefix | Optional prefix applied to all table names | sf_, hs_ |

SignalSmith creates tables automatically in the target schema. If tables already exist, the loader appends or merges data based on the sync mode.
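
The append-or-merge behavior can be modeled in miniature. This is a sketch over an in-memory table (a dict keyed by primary key), not the loader's actual warehouse writes:

```python
def load(table, incoming, mode, key="id"):
    """Apply a batch under the two sync modes.

    full_refresh: replace the table wholesale.
    incremental:  upsert - update existing rows by key, insert new ones.
    """
    if mode == "full_refresh":
        return {row[key]: row for row in incoming}
    if mode == "incremental":
        merged = dict(table)
        for row in incoming:
            merged[row[key]] = row   # merge on the primary key
        return merged
    raise ValueError(f"unknown sync mode: {mode}")

table = {1: {"id": 1, "plan": "free"}}
batch = [{"id": 1, "plan": "pro"}, {"id": 2, "plan": "free"}]
table = load(table, batch, "incremental")  # row 1 updated, row 2 inserted
```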

Scheduling

Loaders run on configurable schedules:

| Interval | Description |
|---|---|
| Every 15 minutes | Near real-time for critical data |
| Hourly | Good balance of freshness and API usage |
| Every 6 hours | Suitable for most operational data |
| Daily | Best for reference data or high-volume extracts |
| Custom cron | Full cron expression for advanced scheduling |

All schedules are evaluated in UTC. You can pause and resume loaders at any time without losing cursor state.
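
For the custom cron option, a few standard five-field expressions (all evaluated in UTC, per the note above):

```
*/15 * * * *   # every 15 minutes
0 * * * *      # hourly, on the hour
0 2 * * *      # daily at 02:00 UTC
0 6 * * 1      # Mondays at 06:00 UTC
```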

Monitoring

Each loader run produces a detailed execution log:

  • Status — Success, Failed, or Running
  • Records extracted — Number of records pulled from the source API
  • Records loaded — Number of records written to the warehouse
  • Duration — Wall-clock time of the run
  • Errors — Any API errors, rate limit hits, or schema conflicts

You can view run history from the Loaders dashboard or query the API for programmatic monitoring.
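
A run-history payload like the one described above can be summarized programmatically. The field names below (status, records_loaded) are assumptions for illustration — check the API Reference for the actual response schema:

```python
def summarize_runs(runs):
    """Count run outcomes and total records loaded from a run-history payload."""
    summary = {"success": 0, "failed": 0, "running": 0, "records_loaded": 0}
    for run in runs:
        summary[run["status"].lower()] += 1
        summary["records_loaded"] += run.get("records_loaded", 0)
    return summary

# Example payload; in practice this would come from the runs endpoint.
runs = [
    {"status": "Success", "records_loaded": 1200},
    {"status": "Failed", "records_loaded": 0},
    {"status": "Success", "records_loaded": 800},
]
summary = summarize_runs(runs)
```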

API Reference

Loaders are managed through the SignalSmith REST API:

```
# List all loaders
GET /api/v1/loaders

# Get a single loader
GET /api/v1/loaders/{id}

# Create a loader
POST /api/v1/loaders

# Update a loader
PUT /api/v1/loaders/{id}

# Delete a loader
DELETE /api/v1/loaders/{id}

# Trigger a manual run
POST /api/v1/loaders/{id}/run

# Get run history
GET /api/v1/loaders/{id}/runs
```
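
As a sketch, the create endpoint could be called from Python with only the standard library. The base URL, auth scheme, and request body fields are all assumptions here — only the endpoint path comes from the list above:

```python
import json
import urllib.request

BASE_URL = "https://app.signalsmith.example/api/v1"   # placeholder host

def build_create_request(api_key, payload):
    """Build (but do not send) the POST /api/v1/loaders request."""
    return urllib.request.Request(
        f"{BASE_URL}/loaders",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",     # auth scheme assumed
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Hypothetical loader definition; field names are illustrative only.
req = build_create_request("sk_test_123", {
    "connector": "salesforce",
    "sync_mode": "incremental",
    "schedule": "0 * * * *",
    "target": {"schema": "SALESFORCE_RAW", "table_prefix": "sf_"},
})
# urllib.request.urlopen(req) would send it; omitted here.
```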

See the API Reference for full request/response schemas.

Best Practices

  • Use a dedicated schema — Write loader data into a separate schema (e.g., SALESFORCE_RAW) to keep it isolated from your curated models and analytics tables.
  • Start with incremental sync — Incremental mode is faster, cheaper, and puts less load on both the source API and your warehouse.
  • Monitor API quotas — Some source applications have API rate limits. If you’re loading many objects at high frequency, check that your API plan supports the volume.
  • Schedule off-peak — For large full-refresh loads, schedule runs during off-peak hours to minimize impact on your warehouse.
  • Use table prefixes — If multiple loaders write to the same schema, use prefixes to avoid naming collisions and make tables easy to identify.

Next Steps