
Shopify Loader

The Shopify loader pulls e-commerce data from your Shopify store into your data warehouse. It uses Shopify’s Admin REST API and GraphQL API, and supports incremental sync via the updated_at timestamp for most resources.

Prerequisites

  • A Shopify store (any plan, including development stores)
  • A Shopify user with Admin access, or Staff access with permission to install apps
  • A connected Warehouse (target warehouse) with write permissions on the target schema

Authentication

The Shopify loader uses OAuth 2.0 for authentication.

OAuth 2.0 Setup

  1. In SignalSmith, click Add Loader and select Shopify
  2. Enter your Shopify store domain (e.g., mystore.myshopify.com)
  3. Click Connect with Shopify
  4. You’ll be redirected to your Shopify admin to install the SignalSmith app
  5. Review the requested permissions and click Install app
  6. You’ll be redirected back to SignalSmith with the connection established

SignalSmith requests the following access scopes:

| Scope | Purpose |
|---|---|
| read_orders | Access order, fulfillment, and refund data |
| read_products | Access product, variant, and collection data |
| read_customers | Access customer records |
| read_inventory | Access inventory levels and locations |
| read_analytics | Access store analytics data |
| read_marketing_events | Access marketing event data |

All scopes are read-only. SignalSmith never modifies data in your Shopify store.
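If you want to confirm that a connection still holds every scope listed above, a check along these lines may help. This is a minimal sketch: the helper name `missing_scopes` and the comma-separated input format are assumptions for illustration, not part of SignalSmith.

```python
# Scopes listed in the table above; all are read-only.
REQUIRED_SCOPES = frozenset({
    "read_orders",
    "read_products",
    "read_customers",
    "read_inventory",
    "read_analytics",
    "read_marketing_events",
})


def missing_scopes(granted: str) -> list[str]:
    """Return the required scopes absent from a comma-separated scope string."""
    granted_set = {s.strip() for s in granted.split(",") if s.strip()}
    return sorted(REQUIRED_SCOPES - granted_set)


# Hypothetical input: a connection that was granted only three scopes.
print(missing_scopes("read_orders, read_products,read_customers"))
# → ['read_analytics', 'read_inventory', 'read_marketing_events']
```

An empty result means nothing needs to be re-requested; a non-empty one suggests reconnecting the loader.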

Store Domain

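Enter your Shopify store’s .myshopify.com domain, not your custom domain. For example, if your storefront is served at shop.mycompany.com, enter the underlying store domain, such as mycompany.myshopify.com. The .myshopify.com name does not always match the custom domain, so check it in your Shopify admin if you are unsure.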
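The domain rule can be checked before you paste a value into the form. The sketch below is illustrative only: the function name `normalize_store_domain` and the exact character pattern for store handles are assumptions, not SignalSmith's validation logic.

```python
import re

# Assumed pattern: store handles use lowercase letters, digits, and hyphens.
_MYSHOPIFY_RE = re.compile(r"^[a-z0-9][a-z0-9-]*\.myshopify\.com$")


def normalize_store_domain(raw: str) -> str:
    """Strip scheme, path, and case; raise ValueError for non-.myshopify.com input."""
    domain = raw.strip().lower()
    domain = re.sub(r"^https?://", "", domain)   # drop a pasted URL scheme
    domain = domain.split("/", 1)[0]             # drop any path such as /admin
    if not _MYSHOPIFY_RE.match(domain):
        raise ValueError(f"expected a .myshopify.com domain, got {raw!r}")
    return domain


print(normalize_store_domain("https://MyStore.myshopify.com/admin"))
# → mystore.myshopify.com
```

A custom domain such as shop.mycompany.com raises `ValueError` under this check.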

Available Objects

| Object | API Name | Description | Default Sync Mode |
|---|---|---|---|
| Orders | orders | Orders with line items, discounts, shipping, and payment details | Incremental |
| Products | products | Product catalog with title, description, and tags | Incremental |
| Product Variants | variants | Individual product variants (size, color) with SKU, price, and inventory | Incremental |
| Customers | customers | Customer records with contact info, order history, and tags | Incremental |
| Collections | collections | Product collections (smart and custom) | Full Refresh |
| Inventory Items | inventory_items | Inventory records tied to variants | Incremental |
| Inventory Levels | inventory_levels | Stock quantities per variant per location | Full Refresh |
| Locations | locations | Store locations, fulfillment centers, and warehouses | Full Refresh |
| Fulfillments | fulfillments | Shipment records tied to orders | Incremental |
| Refunds | refunds | Refund records with line items and amounts | Incremental |
| Transactions | transactions | Payment transactions associated with orders | Incremental |
| Discount Codes | discount_codes | Discount code definitions | Full Refresh |
| Abandoned Checkouts | checkouts | Abandoned checkout sessions | Incremental |

Orders and Line Items

Orders are extracted with their line items in a parent-child relationship. The orders table contains order-level data (totals, customer, shipping address), while a separate order_line_items table contains one row per item in the order. Join these on order_id.
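The parent-child join can be tried out locally. The sketch below uses an in-memory SQLite database as a stand-in for your warehouse; the column names other than order_id, the sample rows, and the assumption that order_line_items.order_id references orders.id are illustrative. If you use a table prefix, the real table names will carry it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total_price REAL, currency TEXT);
    CREATE TABLE order_line_items (
        id INTEGER PRIMARY KEY, order_id INTEGER, title TEXT, quantity INTEGER
    );
    INSERT INTO orders VALUES (1001, 59.98, 'USD'), (1002, 15.00, 'EUR');
    INSERT INTO order_line_items VALUES
        (1, 1001, 'T-shirt', 2),
        (2, 1001, 'Sticker', 1),
        (3, 1002, 'Mug', 1);
""")

# One output row per order, with line items aggregated from the child table.
rows = conn.execute("""
    SELECT o.id, o.currency, COUNT(li.id) AS line_count, SUM(li.quantity) AS units
    FROM orders AS o
    JOIN order_line_items AS li ON li.order_id = o.id
    GROUP BY o.id, o.currency
    ORDER BY o.id
""").fetchall()

print(rows)  # → [(1001, 'USD', 2, 3), (1002, 'EUR', 1, 1)]
```

Aggregating before selecting order-level totals avoids counting total_price once per line item.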

Financial details included per order:

| Field | Description |
|---|---|
| subtotal_price | Sum of line item prices before taxes and shipping |
| total_tax | Total tax amount |
| total_shipping | Shipping charges |
| total_discounts | Total discount amount applied |
| total_price | Final price charged to the customer |
| currency | Order currency code (e.g., USD, EUR) |

Customer Data

Customer records include:

  • Contact information (email, phone)
  • Default address
  • Order count and total spent (lifetime)
  • Tags and notes
  • Marketing consent status
  • Account status (enabled, disabled, invited)

Configuration

| Setting | Description | Default |
|---|---|---|
| Store Domain | Your .myshopify.com domain | None (required) |
| Objects | List of objects to sync | None (you choose during setup) |
| Sync Mode | Full Refresh or Incremental (per object) | Incremental |
| Cursor Field | Field used for incremental sync | updated_at |
| Primary Key | Field(s) that uniquely identify a record | id |
| Include Archived Orders | Whether to sync cancelled/archived orders | true |
| Target Schema | Warehouse schema for Shopify tables | None (required) |
| Table Prefix | Optional prefix for table names | shopify_ |
| Schedule | Sync frequency | Hourly |
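To illustrate how the Cursor Field and Primary Key settings interact in Incremental mode, here is a simplified model. It is a sketch under assumptions, not SignalSmith's implementation: the function name `incremental_sync`, the dict-based target, and the strict "greater than cursor" comparison are all hypothetical.

```python
from datetime import datetime, timezone


def incremental_sync(source_rows, target, cursor):
    """Upsert rows changed after `cursor` into `target` (a dict keyed by id).

    Returns the new cursor: the largest updated_at value seen so far.
    """
    new_cursor = cursor
    for row in source_rows:
        if row["updated_at"] > cursor:
            target[row["id"]] = row  # insert or overwrite by primary key
            new_cursor = max(new_cursor, row["updated_at"])
    return new_cursor


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


source = [
    {"id": 1, "updated_at": utc(2024, 1, 1), "status": "pending"},
    {"id": 2, "updated_at": utc(2024, 1, 3), "status": "paid"},
]
target = {}

# First run: everything is newer than the starting cursor, so both rows load.
cursor = incremental_sync(source, target, utc(1970, 1, 1))

# Later, row 1 changes and row 2 is deleted at the source.
source = [{"id": 1, "updated_at": utc(2024, 1, 5), "status": "paid"}]
cursor = incremental_sync(source, target, cursor)

print(target[1]["status"])  # → paid
print(2 in target)          # → True
```

The second run rewrites only row 1, and the row deleted at the source stays in the target. A production loader would likely also re-read a small window before the cursor, since records sharing the cursor's exact timestamp can otherwise be missed.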

Scheduling Notes

  • Rate limits: Shopify’s API uses a leaky bucket rate limiter. Standard plans allow 2 requests per second; Shopify Plus allows higher limits. SignalSmith manages rate limiting automatically with backoff and credit tracking.
  • GraphQL cost limits: For high-volume stores, SignalSmith uses the GraphQL Admin API, which applies a cost-based rate limit measured in points per second (for example, 1,000 points per second; the exact budget depends on your Shopify plan). Complex queries consume more points, so SignalSmith optimizes queries to minimize cost.
  • Order volume: High-volume stores may have millions of orders. The initial backfill can take several hours. Subsequent incremental syncs are much faster.
  • Deleted records: Shopify’s API does not include permanently deleted records in list endpoints. Deleted products and customers are not captured by incremental sync. Run periodic full refreshes if you need to detect deletions.
  • Multi-currency: If your store uses Shopify Payments with multi-currency, order amounts are stored in the customer’s presentment currency. The presentment_currency and shop_currency fields help with conversion.
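The leaky bucket behavior mentioned in the rate limits note can be modeled roughly as follows. This is a client-side sketch, not SignalSmith's rate limiter: the class name, the explicit `now` argument (used here instead of a real clock so the example is deterministic), and the bucket capacity of 40 are assumptions. Only the rate of 2 requests per second comes from the note above.

```python
class LeakyBucket:
    """Each request adds 1 to the bucket; the bucket drains at `leak_rate` per second."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.level = 0.0
        self.last = 0.0

    def _drain(self, now: float) -> None:
        # Remove whatever has leaked out since the last observation.
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self.last = now

    def wait_time(self, now: float) -> float:
        """Seconds to wait at time `now` before one more request fits."""
        self._drain(now)
        overflow = self.level + 1 - self.capacity
        return max(0.0, overflow / self.leak_rate)

    def record_request(self, now: float) -> None:
        self._drain(now)
        self.level += 1


bucket = LeakyBucket(capacity=40, leak_rate=2.0)  # capacity is an assumed value
for _ in range(40):
    bucket.record_request(now=0.0)  # a burst that fills the bucket

print(bucket.wait_time(now=0.0))  # → 0.5
print(bucket.wait_time(now=0.5))  # → 0.0
```

The model shows why short bursts succeed while sustained traffic settles at the leak rate: once the bucket is full, each further request has to wait for one unit to drain.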

Schema Mapping

Shopify field types are mapped to warehouse-compatible types:

| Shopify Type | Warehouse Type | Notes |
|---|---|---|
| string | VARCHAR | |
| integer | BIGINT | IDs, counts |
| decimal | DOUBLE | Monetary amounts (already in currency units, not cents) |
| boolean | BOOLEAN | |
| datetime | TIMESTAMP | UTC normalized from ISO 8601 |
| array | JSON / VARCHAR | Tags (comma-separated), line items |
| object | JSON / VARCHAR | Nested structures like addresses, shipping lines |
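Two of the mappings above, UTC normalization of datetimes and comma-separated tags, are easy to reproduce when you validate loaded data. The helper names `to_utc` and `split_tags` are hypothetical, and the sketch assumes timestamps carry an explicit UTC offset.

```python
from datetime import datetime, timezone


def to_utc(iso_timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that includes an offset and normalize it to UTC."""
    return datetime.fromisoformat(iso_timestamp).astimezone(timezone.utc)


def split_tags(tags: str) -> list[str]:
    """Split a comma-separated tag string into a list, dropping empty entries."""
    return [t.strip() for t in tags.split(",") if t.strip()]


print(to_utc("2024-03-01T09:30:00-05:00").isoformat())
# → 2024-03-01T14:30:00+00:00
print(split_tags("vip, wholesale,"))
# → ['vip', 'wholesale']
```

A timestamp without an offset would be interpreted in the local time zone by `astimezone`, so check the input format before relying on this.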

Monetary Values

Unlike Stripe, Shopify returns monetary values in standard currency units (e.g., 29.99 for $29.99), not cents. No division is needed in your models.
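The difference matters when you combine data from both sources in one model. The comparison below is a sketch: the function names are hypothetical, and the division by 100 applies to two-decimal currencies only.

```python
from decimal import Decimal


def shopify_amount(value: str) -> Decimal:
    """Shopify-style amount: already in currency units, so parse it as-is."""
    return Decimal(value)


def stripe_amount(minor_units: int) -> Decimal:
    """Stripe-style amount: integer minor units, divided by 100 for two-decimal currencies."""
    return Decimal(minor_units) / Decimal(100)


# The same $29.99 charge as each source would report it.
print(shopify_amount("29.99") == stripe_amount(2999))  # → True
```

`Decimal` is used here to keep the comparison exact. Amounts landed in a DOUBLE column can pick up floating-point rounding, so consider casting to a fixed-precision type before summing.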

Troubleshooting

| Issue | Solution |
|---|---|
| “401 Unauthorized” | The app may have been uninstalled from Shopify. Reconnect from the loader settings |
| “403 Forbidden” on specific resources | The app may lack the required access scope. Reconnect to re-request permissions |
| “429 Too Many Requests” | SignalSmith handles rate limits automatically. If persistent, reduce the number of objects synced simultaneously |
| Missing recent orders | Incremental sync uses updated_at. Very recent orders may appear on the next run |
| Inventory levels are stale | Inventory levels use Full Refresh mode. Increase sync frequency if real-time accuracy is needed |
| Line items missing from orders | Line items are in a separate table (order_line_items). Join on order_id |
| Customer data missing for orders | Some orders may be from guest checkouts without a customer record. Check for null customer_id values |
| Product variants not appearing | Variants are a separate object from products. Ensure you’ve selected “Product Variants” in addition to “Products” |
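For the guest checkout case, an inner join between orders and customers silently drops orders with a null customer_id. The sketch below shows a LEFT JOIN that keeps them, again using in-memory SQLite as a stand-in for your warehouse; the email column and the sample rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total_price REAL);
    INSERT INTO customers VALUES (7, 'ada@example.com');
    INSERT INTO orders VALUES (1001, 7, 59.98), (1002, NULL, 15.00);
""")

# LEFT JOIN keeps order 1002 even though it has no customer record.
guest_rows = conn.execute("""
    SELECT o.id, COALESCE(c.email, 'guest') AS customer
    FROM orders AS o
    LEFT JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()

print(guest_rows)  # → [(1001, 'ada@example.com'), (1002, 'guest')]
```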

Next Steps

  • Create a model to transform your raw Shopify data
  • Build customer lifetime value models by joining customers, orders, and transactions
  • Build audiences based on purchase behavior