Size Estimation
Size estimation gives you a fast, approximate count of how many entities match your audience definition before you save or evaluate the full audience. This helps you validate that your filter conditions are targeting the right segment without running an expensive full-table query.
How Estimation Works
When you click Estimate Size in the audience builder, SignalSmith:
- Compiles the filter conditions into a SQL query, just like a full audience evaluation
- Adds sampling to the query to reduce computation. The specific sampling technique depends on your warehouse:
  - Snowflake: Uses TABLESAMPLE to scan a percentage of the table
  - BigQuery: Uses TABLESAMPLE SYSTEM for block-level sampling
  - Databricks: Uses TABLESAMPLE with a percentage parameter
- Executes the sampled query against your warehouse
- Extrapolates the count from the sample to the full table size
The result is an approximate count displayed in the audience builder, along with a confidence indicator.
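The extrapolation step can be illustrated with a minimal sketch. It assumes simple proportional scaling (the match rate in the sample applied to the full population) and one row per entity; the source does not confirm the exact method SignalSmith uses. The names `SizeEstimate` and `extrapolate` are hypothetical and not part of any SignalSmith API.

```python
from dataclasses import dataclass


@dataclass
class SizeEstimate:
    """Hypothetical container mirroring the fields shown in the audience builder."""
    estimated_count: int
    sample_size: int
    total_population: int
    percentage: float


def extrapolate(sample_matches: int, sample_size: int, total_population: int) -> SizeEstimate:
    """Scale the match rate observed in the sample up to the full population.

    Assumption: plain proportional scaling, one row per entity.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= sample_matches <= sample_size:
        raise ValueError("sample_matches must be between 0 and sample_size")

    match_rate = sample_matches / sample_size
    return SizeEstimate(
        estimated_count=round(match_rate * total_population),
        sample_size=sample_size,
        total_population=total_population,
        percentage=100 * match_rate,
    )
```

For instance, if 5,200 of 100,000 sampled rows matched (an illustrative figure) and the entity type holds 1,000,000 users, `extrapolate(5_200, 100_000, 1_000_000)` gives an estimated count of 52,000, or about 5.2% of the population.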
Accuracy
Estimation accuracy depends on several factors:
| Factor | Impact |
|---|---|
| Table size | Larger tables produce more accurate estimates because a fixed-percentage sample contains more rows and is therefore more representative |
| Data distribution | Uniformly distributed data estimates better than skewed distributions |
| Filter selectivity | Conditions that match a very small percentage of rows may be under-represented in the sample |
| Sampling rate | SignalSmith uses a 10% sample by default, which provides a good balance of speed and accuracy |
For most audiences, the estimate falls within 10-20% of the actual count. For very selective audiences (matching less than 1% of entities), few matching rows land in the sample, so the estimate may be less precise.
Estimation is meant for quick validation during audience building. For exact counts, save the audience and run a full evaluation.
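To see why selective filters estimate less precisely, the following sketch computes the relative standard error of a sampled proportion using the standard binomial formula. It assumes rows are sampled independently and uniformly, which block-level sampling and skewed data do not guarantee, so treat the result as an optimistic lower bound rather than SignalSmith's actual confidence calculation. The function name is hypothetical.

```python
import math


def relative_standard_error(match_fraction: float, sample_size: int) -> float:
    """Relative standard error of an extrapolated count.

    Assumption: independent, uniform row sampling (binomial model).
    match_fraction is the true share of rows that match the filter.
    """
    if not 0 < match_fraction <= 1:
        raise ValueError("match_fraction must be in (0, 1]")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Standard error of a proportion, sqrt(p * (1 - p) / n), divided by p.
    return math.sqrt((1 - match_fraction) / (sample_size * match_fraction))


# A 5.2% audience in a 100,000-row sample: roughly 1.4% relative error.
broad = relative_standard_error(0.052, 100_000)

# A 0.01% audience (100 members per million) in the same sample: roughly 32%.
narrow = relative_standard_error(0.0001, 100_000)
```

Under this model the error depends mainly on how many matching rows the sample is expected to contain, which is why a very selective audience is noisy even when the table is large.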
Estimation Results
The estimation result includes:
| Field | Description |
|---|---|
| Estimated Count | The extrapolated number of matching entities |
| Sample Size | The number of rows actually scanned |
| Total Population | The total number of entities in the entity type |
| Percentage | The estimated audience size as a percentage of total population |
Example
For a User entity type with 1,000,000 total users:
Estimated Count: ~52,000 users
Sample Size: 100,000 rows
Total Population: 1,000,000 users
Percentage: ~5.2%
When to Use Estimation
- During audience building — Estimate after adding or modifying conditions to see how the size changes
- Before saving — Validate the audience is the right size before committing
- When iterating — Quickly test different filter thresholds (e.g., “what if I change lifetime_value from >500 to >1000?”)
When Not to Use Estimation
- For exact counts — Use a full evaluation instead
- For very small audiences — If you expect fewer than 100 members, the sample may not capture any matches. Preview or full evaluation is more reliable.
- For billing or compliance — When the exact count matters, always use the evaluated member count
Performance
Estimation queries typically return in 2-10 seconds, depending on:
- Your warehouse’s query processing speed
- The complexity of the filter conditions (number of trait joins, nested groups)
- The size of the underlying tables
This is significantly faster than a full evaluation, which must scan the entire table and materialize the results.
Next Steps
- Audience Preview — See sample members, not just counts
- Creating an Audience — Full audience creation guide
- Filter Builder — Build and refine filter conditions