Jira Loader
The Jira loader pulls project management data from your Jira Cloud or Jira Data Center instance into your data warehouse. It uses Jira’s REST API and supports incremental sync via the updated timestamp on issues.
Prerequisites
- A Jira Cloud or Jira Data Center instance
- A Jira user with access to the projects you want to sync
- A connected Warehouse (target warehouse) with write permissions on the target schema
Authentication
The Jira loader supports two authentication methods.
OAuth 2.0 (Jira Cloud)
- In SignalSmith, click Add Loader and select Jira
- Choose OAuth 2.0 as the authentication method
- Click Connect with Jira
- You’ll be redirected to Atlassian’s authorization page
- Select the Jira site you want to connect
- Grant SignalSmith access and click Accept
- You’ll be redirected back to SignalSmith with the connection established
SignalSmith requests the following scopes:
| Scope | Purpose |
|---|---|
| read:jira-work | Read issues, projects, boards, and sprints |
| read:jira-user | Read user profiles |
API Token (Jira Cloud or Data Center)
For environments where OAuth isn’t practical:
Jira Cloud:
- Go to Atlassian API Token Management
- Click Create API Token
- Label the token (e.g., “SignalSmith Loader”)
- Copy the generated token
- In SignalSmith, provide:
| Field | Description | Example |
|---|---|---|
| Site URL | Your Jira Cloud URL | https://mycompany.atlassian.net |
| Email | Your Atlassian account email | user@mycompany.com |
| API Token | The generated API token | aBcDeFgHiJ... |
Jira Data Center:
- In Jira, go to your Profile > Personal Access Tokens
- Click Create Token
- Name the token and set an expiry (or no expiry)
- Copy the generated token
- In SignalSmith, provide:
| Field | Description | Example |
|---|---|---|
| Site URL | Your Jira Data Center URL | https://jira.mycompany.com |
| Personal Access Token | The generated PAT | MDM2OTk5... |
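As a rough illustration of how these credentials are typically presented to Jira's REST API: Jira Cloud API tokens are sent as HTTP Basic credentials (email plus token), and Data Center personal access tokens are sent as a Bearer token. The function names below are hypothetical and this is not SignalSmith's internal code; it is only a sketch you can use to test a token outside the loader.

```python
import base64


def cloud_auth_header(email: str, api_token: str) -> dict:
    """Jira Cloud: the API token is the password part of HTTP Basic auth."""
    raw = f"{email}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def data_center_auth_header(pat: str) -> dict:
    """Jira Data Center: the personal access token is sent as a Bearer token."""
    return {"Authorization": f"Bearer {pat}"}


# Placeholder credentials for illustration only
print(cloud_auth_header("user@mycompany.com", "example-token"))
print(data_center_auth_header("example-pat"))
```

Passing either header with a request to your Site URL is a quick way to confirm that a token is valid before entering it in SignalSmith.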
Available Objects
| Object | API Name | Description | Default Sync Mode |
|---|---|---|---|
| Issues | issues | Issues with summary, description, status, assignee, and custom fields | Incremental |
| Projects | projects | Project definitions with lead, category, and type | Full Refresh |
| Sprints | sprints | Scrum sprints with dates and completion status | Incremental |
| Boards | boards | Scrum and Kanban board configurations | Full Refresh |
| Users | users | Jira user profiles | Full Refresh |
| Issue Types | issue_types | Issue type definitions (Bug, Story, Task, Epic, etc.) | Full Refresh |
| Statuses | statuses | Workflow status definitions | Full Refresh |
| Priorities | priorities | Priority level definitions | Full Refresh |
| Resolutions | resolutions | Resolution type definitions | Full Refresh |
| Components | components | Project component definitions | Full Refresh |
| Versions | versions | Project version/release definitions | Full Refresh |
| Worklogs | worklogs | Time tracking entries on issues | Incremental |
| Issue Links | issue_links | Relationships between issues (blocks, relates to, duplicates) | Incremental |
| Issue Changelog | issue_changelog | Field change history for issues (status transitions, reassignments) | Incremental |
Issues and Custom Fields
Jira issues are the primary data object. Each issue includes:
- Standard fields: summary, description, status, assignee, reporter, priority, issue type, created/updated dates
- Custom fields: Any custom field defined in your Jira instance is extracted as an additional column
Custom fields are named using their field ID (e.g., customfield_10001). The issue_types and field metadata tables provide human-readable labels for mapping.
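A minimal sketch of that mapping step, assuming you have already loaded the field metadata into a dictionary keyed by field ID. The function name, the sample row, and the label normalization rule are all illustrative assumptions, not part of the loader.

```python
def label_custom_fields(row: dict, field_labels: dict) -> dict:
    """Return a copy of row with known customfield_* keys renamed to labels."""
    renamed = {}
    for key, value in row.items():
        if key.startswith("customfield_") and key in field_labels:
            # Normalize e.g. "Story Points" to "story_points"
            label = field_labels[key].strip().lower().replace(" ", "_")
            renamed[label] = value
        else:
            # Standard fields and unmapped custom fields pass through unchanged
            renamed[key] = value
    return renamed


# Hypothetical metadata and issue row
field_labels = {"customfield_10001": "Story Points"}
row = {
    "id": "10042",
    "summary": "Fix login",
    "customfield_10001": 5.0,
    "customfield_10099": None,
}
print(label_custom_fields(row, field_labels))
```

In a warehouse you would more likely do this renaming in a model or view; the logic is the same lookup from field ID to label.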
Issue Changelog
The issue changelog captures every field change on an issue — status transitions, assignee changes, priority updates, sprint changes, and more. Each changelog entry includes:
- field: The field that changed
- from_value / to_value: The previous and new values
- author: Who made the change
- created: When the change occurred
This data is essential for building cycle time, lead time, and flow metrics.
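For example, cycle time for a single issue can be derived from its status changes. The sketch below assumes changelog rows shaped like the fields listed above, with created already parsed into datetime values, and it assumes the workflow uses statuses named "In Progress" and "Done"; adjust both to match your own workflow.

```python
from datetime import datetime


def cycle_time_hours(entries, start_status="In Progress", done_status="Done"):
    """Hours from the first move into start_status to the last move into done_status."""
    status_changes = sorted(
        (e for e in entries if e["field"] == "status"),
        key=lambda e: e["created"],
    )
    started = next(
        (e["created"] for e in status_changes if e["to_value"] == start_status), None
    )
    finished = next(
        (e["created"] for e in reversed(status_changes) if e["to_value"] == done_status),
        None,
    )
    if started is None or finished is None or finished < started:
        return None  # issue never started, never finished, or data is inconsistent
    return (finished - started).total_seconds() / 3600


# Hypothetical changelog rows for one issue
entries = [
    {"field": "status", "from_value": "To Do", "to_value": "In Progress",
     "author": "alice", "created": datetime(2024, 1, 8, 9, 0)},
    {"field": "assignee", "from_value": "alice", "to_value": "bob",
     "author": "alice", "created": datetime(2024, 1, 9, 12, 0)},
    {"field": "status", "from_value": "In Progress", "to_value": "Done",
     "author": "bob", "created": datetime(2024, 1, 10, 15, 0)},
]
print(cycle_time_hours(entries))  # 54.0
```

Using the last transition into the done status means a reopened issue is measured up to its final completion, which is one common convention but not the only one.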
Sprints
Sprints are extracted from Jira Software boards. Each sprint includes:
- Sprint name, start date, end date, and completion date
- State (future, active, closed)
- Board association
Sprint-to-issue assignments are tracked via the sprint custom field on issues.
Configuration
| Setting | Description | Default |
|---|---|---|
| Site URL | Your Jira instance URL | — (required) |
| Auth Method | OAuth 2.0, API Token, or PAT | OAuth 2.0 |
| Projects | Jira project keys to sync (empty = all accessible projects) | All |
| Objects | List of objects to sync | — (you choose during setup) |
| Sync Mode | Full Refresh or Incremental (per object) | Incremental |
| Cursor Field | Field used for incremental sync | updated |
| Primary Key | Field(s) that uniquely identify a record | id |
| Include Changelog | Whether to extract issue change history | true |
| JQL Filter | Optional JQL to limit which issues are synced | — (optional) |
| Target Schema | Warehouse schema for Jira tables | — (required) |
| Table Prefix | Optional prefix for table names | jira_ |
| Schedule | Sync frequency | Hourly |
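To make the interaction between the Cursor Field and JQL Filter settings concrete, here is one way an incremental query could be assembled. This is a sketch under stated assumptions, not SignalSmith's actual query builder: the function name is hypothetical, the user filter is assumed not to contain its own ORDER BY clause, and the timestamp is assumed to be in the time zone Jira uses to interpret JQL dates.

```python
from datetime import datetime
from typing import Optional


def build_incremental_jql(user_jql: Optional[str], last_cursor: Optional[datetime]) -> str:
    """Combine an optional user filter with an 'updated' cursor condition."""
    clauses = []
    if user_jql:
        # Parenthesize so OR conditions in the user filter stay grouped
        clauses.append(f"({user_jql})")
    if last_cursor is not None:
        # JQL date literals have minute precision: "yyyy-MM-dd HH:mm"
        clauses.append(f'updated >= "{last_cursor:%Y-%m-%d %H:%M}"')
    jql = " AND ".join(clauses)
    return f"{jql} ORDER BY updated ASC".strip()


print(build_incremental_jql("project = ENG", datetime(2024, 1, 15, 10, 30)))
print(build_incremental_jql(None, None))
```

Because the cursor comparison uses `>=` at minute precision, some issues from the previous run will be read again; the Primary Key setting is what lets those rows be deduplicated on load.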
JQL Filter
You can optionally provide a JQL (Jira Query Language) expression to limit which issues are synced. This is useful for large Jira instances where you only need a subset of projects or issue types.
Examples:
project IN (ENG, PLATFORM) AND issuetype IN (Bug, Story, Task)
created >= -365d
project = ENG AND status != Done
Scheduling Notes
- Rate limits: Jira Cloud enforces rate limits per user. SignalSmith handles rate limiting with automatic backoff. For large instances with many issues, the initial backfill may take several runs to complete.
- Pagination: Jira’s REST API returns a maximum of 100 results per page (50 for search). SignalSmith handles pagination automatically.
- Custom field performance: Instances with many custom fields may have slower API responses. Consider limiting the custom fields extracted if you don’t need all of them.
- Changelog volume: The issue changelog can be very high-volume for actively worked issues. Consider disabling changelog extraction if you don’t need flow metrics.
- Jira Data Center: Data Center instances may have different API behavior and rate limits compared to Jira Cloud. Ensure your Data Center version is 8.x or later for full API compatibility.
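The pagination and backoff behavior described in these notes can be pictured roughly as follows. This is an illustrative sketch only: `fetch_page` and `RateLimited` are hypothetical stand-ins for an HTTP call and a 429 response, and offset-style paging (a start offset, a page size, and a total count) is assumed.

```python
import time


class RateLimited(Exception):
    """Hypothetical signal that the server answered with HTTP 429."""


def fetch_all(fetch_page, page_size=50, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Collect every record via offset pagination, backing off on rate limits.

    fetch_page(start_at, max_results) must return (records, total).
    """
    results, start_at = [], 0
    while True:
        for attempt in range(max_retries):
            try:
                records, total = fetch_page(start_at, page_size)
                break
            except RateLimited:
                # Exponential backoff: 1s, 2s, 4s, ...
                sleep(base_delay * 2 ** attempt)
        else:
            raise RuntimeError("rate limit retries exhausted")
        results.extend(records)
        start_at += len(records)
        # Stop on an empty page or once the reported total is reached
        if not records or start_at >= total:
            return results
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays. A production client would usually also honor a Retry-After header when the server provides one.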
Schema Mapping
Jira field types are mapped to warehouse-compatible types:
| Jira Type | Warehouse Type | Notes |
|---|---|---|
| string | VARCHAR | |
| number | DOUBLE | Story points, numeric custom fields |
| datetime | TIMESTAMP | UTC normalized from ISO 8601 |
| date | DATE | Due dates, sprint dates |
| user | VARCHAR | Stored as user account ID or username |
| option | VARCHAR | Single-select custom fields |
| array | JSON / VARCHAR | Labels, components, fix versions |
| issuetype | VARCHAR | Stored as the issue type name |
| status | VARCHAR | Stored as the status name |
| priority | VARCHAR | Stored as the priority name |
| resolution | VARCHAR | Stored as the resolution name |
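The table above can be expressed as a simple lookup, together with the UTC normalization applied to datetime values. This is a sketch rather than the loader's implementation: the VARCHAR fallback for unlisted types and the choice of JSON for arrays are assumptions, and the timestamp format shown is the one Jira's REST API commonly returns (milliseconds plus a numeric offset).

```python
from datetime import datetime, timezone

JIRA_TO_WAREHOUSE = {
    "string": "VARCHAR",
    "number": "DOUBLE",
    "datetime": "TIMESTAMP",
    "date": "DATE",
    "user": "VARCHAR",
    "option": "VARCHAR",
    "array": "JSON",  # VARCHAR on warehouses without a JSON type
    "issuetype": "VARCHAR",
    "status": "VARCHAR",
    "priority": "VARCHAR",
    "resolution": "VARCHAR",
}


def warehouse_type(jira_type: str) -> str:
    # Falling back to VARCHAR for unlisted types is an assumption
    return JIRA_TO_WAREHOUSE.get(jira_type, "VARCHAR")


def to_utc(jira_timestamp: str) -> datetime:
    """Parse a timestamp like 2024-01-15T10:30:00.000+0100 and convert it to UTC."""
    parsed = datetime.strptime(jira_timestamp, "%Y-%m-%dT%H:%M:%S.%f%z")
    return parsed.astimezone(timezone.utc)


print(warehouse_type("number"))
print(to_utc("2024-01-15T10:30:00.000+0100").isoformat())
```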
Troubleshooting
| Issue | Solution |
|---|---|
| “401 Unauthorized” | Token has expired or is invalid. Re-authenticate or regenerate the API token |
| “403 Forbidden” on specific projects | The authenticated user lacks access to the project. Verify project permissions in Jira |
| “429 Rate limit exceeded” | SignalSmith handles rate limits automatically. Reduce sync frequency if persistent |
| Custom fields showing as customfield_XXXXX | Join with the field metadata to get human-readable names, or use Jira admin to look up field IDs |
| Missing sprints | Sprints require Jira Software (not Jira Work Management). Verify the board type is Scrum |
| Changelog table is very large | Disable changelog extraction if you don’t need flow metrics, or filter to specific projects |
| JQL filter syntax error | Validate your JQL in Jira’s issue search before entering it in SignalSmith |
| Data Center connection timeout | Ensure your Jira Data Center instance is accessible from SignalSmith’s IP addresses |
Next Steps
- Create a model to transform your raw Jira data
- Build engineering metrics dashboards (cycle time, throughput, sprint velocity)
- Correlate engineering activity with customer outcomes using other loader data