PII Masking
PII (Personally Identifiable Information) masking provides column-level controls over how sensitive data is exposed across SignalSmith. Each column in a model can be assigned a sensitivity level that determines its visibility in previews, value suggestion dropdowns, sync field mappings, and sync payloads.
Why PII Masking?
As your CDP processes real customer data — email addresses, phone numbers, physical addresses, SSNs — you need controls to ensure:
- Compliance — GDPR, CCPA, and other regulations require limiting exposure of personal data
- Least privilege — Operators configuring audiences and syncs don’t need to see raw PII
- Defense in depth — Multiple enforcement layers prevent accidental data leakage
Sensitivity Levels
Each column can be assigned one of four sensitivity levels:
| Level | Description | Previews | Suggestions | Syncs |
|---|---|---|---|---|
| None | Default. No restrictions applied. | Visible | Available | Available |
| Redacted | Data is masked in the UI but available for syncing. Best for PII that needs to reach destinations (e.g., email for audience matching). | Masked (***REDACTED***) | Hidden | Available |
| Sync Only | Data is hidden everywhere in the UI but available for syncing. Best for data that operators don’t need to see at all. | Hidden | Hidden | Available |
| Blocked | Fully restricted — not visible anywhere and cannot be synced. Best for highly sensitive data like SSNs or credit card numbers. | Hidden | Hidden | Blocked |
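The table above can be read as a lookup from sensitivity level to exposure rules. The sketch below is one way to model it, not SignalSmith's internal representation. The identifiers `redacted` and `blocked` match the API example later on this page; `none` and `sync_only` are assumed spellings, and the `Exposure` type is hypothetical.

```python
from enum import Enum
from typing import NamedTuple


class Sensitivity(Enum):
    NONE = "none"            # assumed spelling
    REDACTED = "redacted"    # appears in the API example
    SYNC_ONLY = "sync_only"  # assumed spelling
    BLOCKED = "blocked"      # appears in the API example


class Exposure(NamedTuple):
    preview: str       # "visible", "masked", or "hidden"
    suggestions: bool  # True if autocomplete suggestions are offered
    syncable: bool     # True if the column may be mapped in a sync


# One row per sensitivity level, mirroring the table above.
EXPOSURE = {
    Sensitivity.NONE: Exposure("visible", True, True),
    Sensitivity.REDACTED: Exposure("masked", False, True),
    Sensitivity.SYNC_ONLY: Exposure("hidden", False, True),
    Sensitivity.BLOCKED: Exposure("hidden", False, False),
}
```

Only None exposes suggestions, and only Blocked prevents syncing; the preview behavior is what separates Redacted from Sync Only.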
When to Use Each Level
- Redacted — Email addresses, phone numbers, names. Operators can build audiences that filter on these columns (typing values manually), but they won’t see actual values in previews or autocomplete dropdowns.
- Sync Only — Internal IDs or tokens that must reach the destination but have no operational value in the SignalSmith UI.
- Blocked — Social Security numbers, credit card numbers, passwords. Data that should never leave the warehouse through SignalSmith.
Configuring Sensitivity
Entity Config Page
Navigate to Schema > [Model] > Configure to access the entity configuration page. Each column row includes:
- Sensitivity dropdown — Select None, Redacted, Sync Only, or Blocked
- PII Type dropdown — Appears when sensitivity is set. Select the data category (Email, Phone, Name, Address, SSN, DOB, Custom)
Changes are saved with the rest of the entity configuration.
Inline Column Config (Model Detail)
For SQL models, the model detail page includes an inline column configuration table with the same Sensitivity and PII Type controls. This allows you to configure sensitivity without navigating to the entity config page.
Automatic PII Detection
Click the “Detect PII” button on the entity config page to run automatic detection. SignalSmith matches column names against known PII patterns and suggests appropriate sensitivity levels:
| Column Name Pattern | Detected As | Default Sensitivity |
|---|---|---|
| email, email_address | Email | Redacted |
| phone, phone_number, mobile | Phone | Redacted |
| first_name, last_name, full_name | Name | Redacted |
| address, street, zip_code, city | Address | Redacted |
| ssn, social_security | SSN | Blocked |
| dob, date_of_birth | Date of Birth | Redacted |
| credit_card, card_number, cvv | Custom | Blocked |
| ip_address | Custom | Redacted |
Detection never overrides existing sensitivity settings — it only fills in columns that don’t have a sensitivity level set yet.
PII detection also runs automatically when a model is first created, so columns with recognizable PII names are flagged immediately.
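A minimal sketch of name-based detection along these lines is shown here. It assumes substring matching with first-match-wins ordering, which is a guess at the matching strategy rather than documented behavior. It also assumes that a missing or `none` sensitivity counts as "not set". The `pii_type` identifiers other than `email` and `ssn` are assumed spellings, and `detect_pii` is a hypothetical helper.

```python
# Ordered rules: (name fragments, pii_type, suggested sensitivity).
# More specific fragments come first so "ip_address" and "email_address"
# are not caught by the generic "address" rule.
RULES = [
    (("email",), "email", "redacted"),
    (("ip_address",), "custom", "redacted"),
    (("phone", "mobile"), "phone", "redacted"),
    (("first_name", "last_name", "full_name"), "name", "redacted"),
    (("ssn", "social_security"), "ssn", "blocked"),
    (("dob", "date_of_birth"), "dob", "redacted"),
    (("credit_card", "card_number", "cvv"), "custom", "blocked"),
    (("address", "street", "zip_code", "city"), "address", "redacted"),
]


def detect_pii(columns):
    """Suggest sensitivity for columns that have none set; never override."""
    suggestions = {}
    for column in columns:
        # Assumption: a missing or "none" value means no level is set yet.
        if column.get("sensitivity") not in (None, "none"):
            continue
        name = column["name"].lower()
        for fragments, pii_type, sensitivity in RULES:
            if any(fragment in name for fragment in fragments):
                suggestions[column["name"]] = {
                    "pii_type": pii_type,
                    "sensitivity": sensitivity,
                }
                break  # first matching rule wins
    return suggestions
```

Substring matching is also why a column such as email_campaign_name would be flagged as Email, so suggestions are worth reviewing before they are saved.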
How Masking Works
Previews
When you preview a model’s data, SignalSmith enforces sensitivity at the server level:
- Redacted columns — Values are replaced with ***REDACTED*** in the preview response. The column header shows a shield icon.
- Sync Only / Blocked columns — The column is completely removed from the preview results. You won’t see it in the preview table at all.
- No model context — During model creation (before sensitivity is configured), previews show raw data so you can verify your query is correct.
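The preview rules can be sketched as a function applied to result rows before they leave the server. This is an illustration under assumptions: rows are plain dictionaries, `mask_preview` is a hypothetical name, and the `sync_only` identifier is an assumed spelling.

```python
REDACTED_PLACEHOLDER = "***REDACTED***"


def mask_preview(rows, sensitivity_by_column=None):
    """Apply column sensitivity to preview rows (a list of dicts)."""
    # No model context (e.g. during model creation): return raw rows.
    if sensitivity_by_column is None:
        return rows
    masked = []
    for row in rows:
        out = {}
        for column, value in row.items():
            level = sensitivity_by_column.get(column, "none")
            if level in ("sync_only", "blocked"):
                continue  # column is removed from the preview entirely
            out[column] = REDACTED_PLACEHOLDER if level == "redacted" else value
        masked.append(out)
    return masked
```

Because masking happens on the response itself, a client never receives the raw value of a Redacted column or any trace of a Sync Only or Blocked one.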
Value Suggestions
The audience builder and journey filter use autocomplete suggestions based on sample values from the warehouse. For sensitive columns:
- Warehouse queries are skipped — SignalSmith doesn’t even fetch sample values for sensitive columns, so no PII is stored in the database
- Existing suggestions are cleared — If suggestions were stored before sensitivity was set, they’re stripped from API responses
The column is still usable for filtering — users can type filter values manually. Only the autocomplete dropdown is suppressed.
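The two safeguards can be sketched as follows, assuming a column is a dictionary with optional `sensitivity` and `suggestions` keys. All names here are hypothetical, and `fetch_samples` stands in for whatever runs the warehouse query.

```python
def is_sensitive(column):
    # Any level other than the default counts as sensitive here.
    return column.get("sensitivity", "none") != "none"


def suggestions_for(column, fetch_samples):
    """Return autocomplete values, skipping the warehouse for sensitive columns."""
    if is_sensitive(column):
        return []  # the warehouse query is never issued
    return fetch_samples(column["name"])


def strip_stored_suggestions(column):
    """Drop previously stored suggestions from a column before it is returned."""
    if is_sensitive(column):
        return {**column, "suggestions": []}
    return column
```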
Syncs
Sensitivity affects sync field mapping and execution:
- Blocked columns are hidden from the source column dropdown in the field mapping editor. You cannot map them to any destination field.
- Redacted and Sync Only columns are available for mapping. They appear in the dropdown with a shield icon indicating they contain sensitive data.
- At execution time, any field mapping that references a blocked column is silently removed as a safety net.
- Validation — Creating or updating a sync with a blocked column in the field mapping returns an error.
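The rules above describe two checks: a strict one when a sync is saved and a lenient one when it runs. A sketch, assuming each field mapping is a dictionary with a `source` column name (the mapping shape, function names, and error type are hypothetical):

```python
class BlockedColumnError(ValueError):
    """Raised when a sync maps a column whose sensitivity is blocked."""


def validate_sync(mappings, sensitivity_by_column):
    """Create/update check: reject any mapping that uses a blocked column."""
    blocked = [
        m["source"]
        for m in mappings
        if sensitivity_by_column.get(m["source"]) == "blocked"
    ]
    if blocked:
        raise BlockedColumnError(
            "blocked columns cannot be synced: " + ", ".join(blocked)
        )


def mappings_for_execution(mappings, sensitivity_by_column):
    """Run-time safety net: silently drop mappings on blocked columns."""
    return [
        m for m in mappings
        if sensitivity_by_column.get(m["source"]) != "blocked"
    ]
```

The run-time filter covers the case where a column was set to Blocked after a sync that maps it had already been saved.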
Hash-on-Sync
For columns with a PII type set (email, phone, name, etc.), the field mapping editor shows a “Hash PII (SHA256)” checkbox. When enabled:
- The column value is normalized (lowercased, trimmed, phone numbers formatted to E.164)
- The normalized value is SHA256-hashed
- The hashed value is sent to the destination instead of the raw value
This is commonly used for ad platform audience matching:
| PII Type | Normalization | Example |
|---|---|---|
| Email | Lowercase + trim | John@Example.com → sha256("john@example.com") |
| Phone | Strip non-digits, E.164 format | (555) 123-4567 → sha256("+15551234567") |
| Name | Lowercase + trim | John → sha256("john") |
| Address | Lowercase + trim | 123 Main St → sha256("123 main st") |
Destinations that already handle their own PII hashing (like CAPI integrations for Meta, Google, TikTok) are not double-hashed — the hash-on-sync is bypassed for these connectors.
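The normalize-then-hash steps can be sketched with Python's standard library. This is an approximation, not the product's implementation: real E.164 formatting needs country-aware parsing, whereas this sketch assumes a default country code of 1 for numbers written without a leading `+`, which reproduces the phone example in the table. The function names and the `connector_hashes_pii` flag are hypothetical.

```python
import hashlib
import re


def normalize(value, pii_type, default_country_code="1"):
    """Normalize a raw value before hashing."""
    text = value.strip()
    if pii_type == "phone":
        digits = re.sub(r"\D", "", text)  # strip everything except digits
        if text.startswith("+"):
            return "+" + digits  # country code already present
        # Assumption: a number without "+" is a national number.
        return "+" + default_country_code + digits
    return text.lower()


def hash_pii(value, pii_type):
    """Return the hex SHA-256 digest of the normalized value."""
    normalized = normalize(value, pii_type)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def value_for_destination(value, pii_type, hash_enabled, connector_hashes_pii=False):
    """Hash only when enabled and the connector does not hash on its own."""
    if hash_enabled and not connector_hashes_pii:
        return hash_pii(value, pii_type)
    return value
```

Normalizing before hashing matters because SHA-256 is sensitive to every byte: "John@Example.com" and "john@example.com" produce unrelated digests, so a destination matching on hashes would miss the record.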
Integration with Other Features
| Feature | Behavior |
|---|---|
| Audience Builder | Sensitive columns are usable for filtering. Autocomplete suggestions are suppressed, but users can type values manually. |
| Traits | Sensitive columns can be referenced in trait definitions. Trait compilation generates SQL against the warehouse directly. |
| Identity Graphs | Sensitivity labels are visible. Blocked columns cannot be selected as identifiers. |
| Agent Smith | The AI agent receives sensitivity metadata and avoids querying blocked columns. |
| MCP Queries | When a model_id is provided, query results are masked according to the model’s column sensitivity. |
API Reference
Detect PII
POST /api/v1/workspaces/{workspaceId}/models/{modelId}/detect-pii
Returns the model’s column configuration with suggested sensitivity and PII type values based on column name patterns. This endpoint does not save anything; review the suggestions before persisting them via the entity config update endpoint.
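As an illustration, the request for this endpoint could be built with Python's standard library as shown here. The base URL and IDs are placeholders, `build_detect_pii_request` is a hypothetical helper, and authentication is left out because this page does not describe it.

```python
import urllib.request


def build_detect_pii_request(base_url, workspace_id, model_id):
    """Build (but do not send) the POST request for PII detection."""
    url = (
        f"{base_url.rstrip('/')}/api/v1/workspaces/{workspace_id}"
        f"/models/{model_id}/detect-pii"
    )
    return urllib.request.Request(url, method="POST")


# Placeholder host and IDs, for illustration only.
request = build_detect_pii_request("https://signalsmith.example.com", "ws_1", "model_1")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of adding whatever credentials your workspace requires and passing the request to `urllib.request.urlopen`.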
Update Entity Config (with sensitivity)
PUT /api/v1/workspaces/{workspaceId}/models/{modelId}/entity-config
Include sensitivity and pii_type fields in each column_config entry:
{
  "entity_type": "parent",
  "primary_key_column": "user_id",
  "column_config": [
    {
      "name": "email",
      "alias": "Email Address",
      "enabled": true,
      "data_type": "text",
      "sensitivity": "redacted",
      "pii_type": "email"
    },
    {
      "name": "ssn",
      "alias": "SSN",
      "enabled": true,
      "data_type": "text",
      "sensitivity": "blocked",
      "pii_type": "ssn"
    }
  ]
}
Best Practices
- Set sensitivity early — Configure column sensitivity as part of model setup, before creating syncs. PII auto-detection helps by flagging likely PII columns on creation.
- Use Blocked sparingly — Only block columns that should never leave the warehouse (SSN, credit card). Most PII columns should be Redacted (visible for sync mapping but masked in previews).
- Enable Hash PII for ad platforms — When syncing email or phone to advertising destinations, enable hash-on-sync to send SHA256 hashes instead of raw values.
- Review auto-detection — Automatic PII detection is pattern-based. Review suggestions to catch false positives (e.g., email_campaign_name flagged as email) and false negatives (custom PII columns with non-standard names).
- Audit regularly — Periodically review column sensitivity settings across models, especially after adding new columns or creating new models.
Next Steps
- Configure RBAC for workspace access control
- Set up destination filters for data flow restrictions
- Create a sync with PII-safe field mappings