Sync Health

The Sync Health dashboard provides a composite health score for every sync in your workspace, combining success rate, error trends, latency, and data freshness into a single at-a-glance view. Use it to identify failing syncs, diagnose degradation, and monitor throughput and SLA compliance.

Health Score

Each sync receives a health score from 0 to 100 based on four weighted components:

| Component | Weight | What It Measures |
| --- | --- | --- |
| Success Rate | 40% | Percentage of recent runs that completed without errors |
| Error Trend | 20% | Whether errors are increasing, stable, or decreasing over the last 7 days |
| Latency | 20% | Whether run duration is within expected bounds based on historical averages |
| Data Freshness | 20% | Time since the last successful sync run completed |
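
The weighting can be illustrated with a minimal sketch. It assumes each component has already been normalized to its own 0 to 100 sub-score, which this page does not specify; the function and key names are hypothetical and not part of the product.

```python
# Weights taken from the component table above.
WEIGHTS = {
    "success_rate": 0.40,
    "error_trend": 0.20,
    "latency": 0.20,
    "data_freshness": 0.20,
}


def health_score(components: dict[str, float]) -> float:
    """Combine per-component sub-scores (assumed 0-100 each) into one 0-100 score."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = components[name]
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
        total += weight * value
    return round(total, 1)


example = health_score(
    {"success_rate": 90, "error_trend": 50, "latency": 80, "data_freshness": 100}
)
print(example)  # 0.4*90 + 0.2*50 + 0.2*80 + 0.2*100 = 82.0
```

Because Success Rate carries twice the weight of any other component, a drop there moves the composite score twice as far as an equal drop elsewhere.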

Score Ranges

| Score | Status | Meaning |
| --- | --- | --- |
| 80–100 | Healthy | Sync is operating normally. No action needed. |
| 60–79 | Warning | Sync is showing signs of degradation. Investigate soon. |
| 40–59 | Degraded | Sync has significant issues. Immediate investigation recommended. |
| 0–39 | Critical | Sync is failing or severely impaired. Requires urgent attention. |
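
As a rough sketch, the ranges above map to a status like this. The function name is hypothetical, and treating each range's lower bound as the cutoff for fractional scores (for example, 79.5 counts as Warning) is an assumption, since the table lists whole numbers only.

```python
def health_status(score: float) -> str:
    """Map a 0-100 health score to the status names used in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "healthy"
    if score >= 60:
        return "warning"
    if score >= 40:
        return "degraded"
    return "critical"


for value in (92, 79.5, 40, 12):
    print(value, health_status(value))
```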

Reading the Dashboard

The Sync Health dashboard displays all syncs as a sortable table with visual health indicators:

  • Green — Healthy (80+)
  • Yellow — Warning (60–79)
  • Orange — Degraded (40–59)
  • Red — Critical (0–39)

Click any sync row to drill into its detail view with historical health score trends, individual run results, and per-component breakdowns.

Throughput Metrics

Throughput measures the data processing rate of each sync in rows per second. This metric helps you understand:

  • Baseline performance — What normal throughput looks like for each sync
  • Performance degradation — When throughput drops significantly from the baseline
  • Capacity planning — Whether your warehouse compute needs scaling for growing data volumes

Throughput Indicators

| Metric | Description |
| --- | --- |
| Current Throughput | Rows/second for the most recent sync run |
| Average Throughput | Rolling 7-day average rows/second |
| Peak Throughput | Highest rows/second recorded in the selected time window |
| Throughput Trend | Whether throughput is increasing, stable, or decreasing |

A significant drop in throughput (more than 30% below the rolling average) triggers a warning indicator, as it may signal warehouse contention, data volume spikes, or query plan changes.
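
The 30% rule can be expressed as a small check. This is a sketch only: the function name and the handling of a zero or missing baseline are assumptions, not documented behavior.

```python
def throughput_warning(
    current_rows_per_sec: float,
    rolling_avg_rows_per_sec: float,
    drop_threshold: float = 0.30,
) -> bool:
    """Return True when current throughput is more than 30% below the rolling average."""
    if rolling_avg_rows_per_sec <= 0:
        # Assumption: with no usable baseline there is nothing to compare against.
        return False
    drop = (rolling_avg_rows_per_sec - current_rows_per_sec) / rolling_avg_rows_per_sec
    return drop > drop_threshold


print(throughput_warning(650, 1000))  # 35% below the average
print(throughput_warning(800, 1000))  # 20% below the average
```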

Duration Tracking

Duration tracking shows how long each sync run takes to complete, helping you identify slow-running syncs and plan scheduling to avoid conflicts.

Duration Metrics

| Metric | Description |
| --- | --- |
| Last Run Duration | Wall-clock time for the most recent completed run |
| Average Duration | Rolling 7-day average duration |
| P95 Duration | 95th percentile duration: 95% of runs finish within this time, which approximates worst-case performance |
| Duration Trend | Whether runs are getting faster, stable, or slower |
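
To make the average and P95 metrics concrete, here is a minimal sketch over a list of run durations. It uses the nearest-rank percentile definition, which is an assumption; the dashboard's exact percentile method is not documented here, and the function name is hypothetical.

```python
from statistics import mean


def p95(durations: list[float]) -> float:
    """Nearest-rank 95th percentile: the smallest value that at least 95% of runs do not exceed."""
    if not durations:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations)
    # Integer form of ceil(0.95 * n), avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Twenty hypothetical run durations in minutes: 1, 2, ..., 20.
runs = [float(m) for m in range(1, 21)]
print(mean(runs))  # 10.5
print(p95(runs))   # 19.0
```

The gap between the two numbers is the point of tracking both: the average hides the slow tail that P95 exposes.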

Duration Breakdown

Each sync run duration is broken down into phases:

  1. Query Phase — Time spent executing SQL against your warehouse to evaluate the audience or model
  2. Diff Phase — Time spent comparing the current result set to the previous run to determine adds, updates, and removes
  3. Write Phase — Time spent sending data to the destination API
  4. Commit Phase — Time spent recording the run result and updating audit tables

The phase breakdown helps pinpoint bottlenecks. For example, a long Query Phase suggests warehouse optimization opportunities, while a long Write Phase may indicate destination API rate limiting.
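
One simple way to read a phase breakdown is to find the phase that takes the largest share of the run. The sketch below assumes per-phase timings in seconds keyed by hypothetical phase names; it is an illustration, not the dashboard's own logic.

```python
def bottleneck_phase(phase_seconds: dict[str, float]) -> tuple[str, float]:
    """Return the slowest phase and its share of the total run duration."""
    total = sum(phase_seconds.values())
    if total <= 0:
        raise ValueError("total duration must be positive")
    slowest = max(phase_seconds, key=phase_seconds.get)
    return slowest, phase_seconds[slowest] / total


run = {"query": 120.0, "diff": 30.0, "write": 40.0, "commit": 10.0}
name, share = bottleneck_phase(run)
print(f"{name} phase takes {share:.0%} of the run")  # query phase takes 60% of the run
```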

SLA Compliance

Define freshness, duration, and success-rate targets for your syncs; the SLA compliance dashboard tracks adherence to them over time.

Defining SLA Targets

Configure SLA targets per sync or use workspace-level defaults:

| Target | Description | Example |
| --- | --- | --- |
| Max Freshness | Maximum acceptable time since last successful run | 6 hours |
| Max Duration | Maximum acceptable run duration | 30 minutes |
| Min Success Rate | Minimum acceptable success rate over a rolling window | 95% over 7 days |
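
The three targets can be modeled as a small data structure and checked against a sync's current metrics. This is a sketch under stated assumptions: the class, field, and function names are hypothetical, and the defaults simply reuse the example values from the table.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class SlaTargets:
    # Defaults mirror the example column above; real targets are configured per sync.
    max_freshness: timedelta = timedelta(hours=6)
    max_duration: timedelta = timedelta(minutes=30)
    min_success_rate: float = 0.95  # fraction of successful runs in the rolling window


def sla_breaches(
    targets: SlaTargets,
    freshness: timedelta,
    last_duration: timedelta,
    success_rate: float,
) -> list[str]:
    """Return the names of the targets the sync is currently missing."""
    breaches = []
    if freshness > targets.max_freshness:
        breaches.append("max_freshness")
    if last_duration > targets.max_duration:
        breaches.append("max_duration")
    if success_rate < targets.min_success_rate:
        breaches.append("min_success_rate")
    return breaches


print(
    sla_breaches(
        SlaTargets(),
        freshness=timedelta(hours=7),
        last_duration=timedelta(minutes=12),
        success_rate=0.97,
    )
)  # ['max_freshness']
```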

Compliance Tracking

The SLA compliance view shows:

  • Current compliance status — Whether each sync is currently meeting its SLA targets
  • Compliance percentage — The share of the selected time period during which each sync met its SLA (e.g., “99.2% compliant over 30 days”)
  • Breach history — Timeline of SLA breaches with duration and root cause indicators
  • At-risk syncs — Syncs that haven’t breached yet but are trending toward a breach
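
A compliance percentage of this kind can be derived from the total time spent in breach. The sketch below assumes breach periods do not overlap and lie entirely inside the window; the function name is hypothetical and the dashboard may compute the figure differently.

```python
from datetime import timedelta


def compliance_percentage(window: timedelta, breach_durations: list[timedelta]) -> float:
    """Percentage of the window during which the sync met its SLA."""
    if window <= timedelta(0):
        raise ValueError("window must be positive")
    breached = sum(breach_durations, timedelta(0))
    # Guard against inputs whose breaches add up to more than the window.
    breached = min(breached, window)
    return round(100 * (1 - breached / window), 1)


# A single breach of 5 h 45 min 36 s is 0.8% of a 30-day window.
breaches = [timedelta(hours=5, minutes=45, seconds=36)]
print(compliance_percentage(timedelta(days=30), breaches))  # 99.2
```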

Identifying and Diagnosing Failing Syncs

When a sync’s health score drops, follow this diagnostic workflow:

Step 1: Check the Health Components

Navigate to the sync’s detail view and look at which component is driving the score down:

  • Low Success Rate — Recent runs are failing. Check the error messages on the failed runs.
  • Rising Error Trend — Errors are increasing even if the current success rate looks acceptable. This indicates an emerging issue.
  • High Latency — Runs are taking longer than expected. Check the duration breakdown for the bottleneck phase.
  • Stale Freshness — The sync hasn’t completed a successful run recently. Check if the sync schedule is running and whether runs are stuck.
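
To see which component is costing the most points, one option is to compare weighted shortfalls. This sketch assumes each component is a 0 to 100 sub-score and reuses the documented weights; the names are hypothetical and the detail view may present this differently.

```python
# Weights from the Health Score section.
WEIGHTS = {
    "success_rate": 0.40,
    "error_trend": 0.20,
    "latency": 0.20,
    "data_freshness": 0.20,
}


def points_lost(components: dict[str, float]) -> dict[str, float]:
    """Points each component removes from a perfect 100, given its weight."""
    return {name: WEIGHTS[name] * (100 - components[name]) for name in WEIGHTS}


def main_driver(components: dict[str, float]) -> str:
    """Name of the component responsible for the largest loss of points."""
    lost = points_lost(components)
    return max(lost, key=lost.get)


sync = {"success_rate": 95, "error_trend": 40, "latency": 80, "data_freshness": 100}
print(main_driver(sync))  # error_trend
```

In this example the success rate still looks acceptable, yet the error trend accounts for most of the lost points, which matches the "emerging issue" case described above.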

Step 2: Inspect Recent Runs

Click into the Sync Runs tab to see individual run details:

  • Error messages — Look for specific error codes and messages from the destination API or warehouse
  • Row counts — Compare member counts across runs to spot unexpected spikes or drops
  • Duration breakdown — Identify which phase is contributing to slowness

Step 3: Common Issues and Resolutions

| Symptom | Likely Cause | Resolution |
| --- | --- | --- |
| Runs failing with auth errors | OAuth token expired or revoked | Reconnect the destination |
| Runs failing with rate limit errors | Destination API throttling | Reduce sync frequency or increase batch size |
| Runs timing out in Query Phase | Warehouse query performance degradation | Check warehouse resource usage and query plan |
| Runs succeeding but with high error counts | Some records failing validation at destination | Check field mapping for type mismatches or required fields |
| Freshness breached but runs succeeding | Schedule frequency is too low for the SLA target | Increase sync frequency |

API Endpoints

Get Sync Health Scores

GET /api/v1/insights/sync-health

Returns health scores and component breakdowns for all syncs in the workspace.

Query Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| sync_id | string | Filter to a specific sync (optional) |
| status | string | Filter by health status: healthy, warning, degraded, critical (optional) |
| sort | string | Sort by score, name, last_run (default: score) |
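
As an illustration, the request URL for this endpoint can be assembled from the parameters above. The base URL is a placeholder, the helper name is hypothetical, and authentication is omitted because it is not described on this page.

```python
from typing import Optional
from urllib.parse import urlencode

VALID_STATUSES = {"healthy", "warning", "degraded", "critical"}
VALID_SORTS = {"score", "name", "last_run"}


def sync_health_url(
    base_url: str,
    sync_id: Optional[str] = None,
    status: Optional[str] = None,
    sort: str = "score",
) -> str:
    """Build the GET URL for /api/v1/insights/sync-health with optional filters."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    if sort not in VALID_SORTS:
        raise ValueError(f"unknown sort: {sort}")
    params = {"sync_id": sync_id, "status": status, "sort": sort}
    # Leave out optional parameters that were not supplied.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url.rstrip('/')}/api/v1/insights/sync-health?{query}"


# "https://example.invalid" is a placeholder host, not a real endpoint.
print(sync_health_url("https://example.invalid", status="critical"))
```

Validating `status` and `sort` on the client side is a design choice in this sketch; how the API responds to unknown values is not documented here.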

Get Sync Throughput

GET /api/v1/insights/sync-throughput

Returns throughput metrics for syncs over a time window.

Query Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| sync_id | string | Filter to a specific sync (optional) |
| window | string | Time window: 1d, 7d, 30d (default: 7d) |

Get Sync Duration

GET /api/v1/insights/sync-duration

Returns duration metrics and phase breakdowns.

Get SLA Compliance

GET /api/v1/insights/sla-compliance

Returns SLA compliance status and breach history.

Query Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| sync_id | string | Filter to a specific sync (optional) |
| window | string | Compliance window: 7d, 30d, 90d (default: 30d) |