Size Estimation

Size estimation gives you a fast, approximate count of how many entities match your audience definition before you save or evaluate the full audience. This helps you validate that your filter conditions are targeting the right segment without running an expensive full-table query.

How Estimation Works

When you click Estimate Size in the audience builder, SignalSmith:

  1. Compiles the filter conditions into a SQL query, just like a full audience evaluation
  2. Adds sampling to the query to reduce computation. The specific sampling technique depends on your warehouse:
    • Snowflake: Uses TABLESAMPLE to scan a percentage of the table
    • BigQuery: Uses TABLESAMPLE SYSTEM for block-level sampling
    • Databricks: Uses TABLESAMPLE with a percentage parameter
  3. Executes the sampled query against your warehouse
  4. Extrapolates the count from the sample to the full table size

The result is an approximate count displayed in the audience builder, along with a confidence indicator.
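As a rough illustration of steps 1 to 3, the sketch below builds a sampled counting query for each warehouse. It is not SignalSmith's actual compiler: the function name, the `SAMPLE_CLAUSES` mapping, the `users` table, and the column aliases are all hypothetical, and the sampling clauses are only the generic forms each warehouse documents.

```python
# Hypothetical sketch: SignalSmith's real query compiler is not shown in this document.
# Sampling clause per warehouse, parameterized by a percentage.
SAMPLE_CLAUSES = {
    "snowflake": "TABLESAMPLE ({pct})",
    "bigquery": "TABLESAMPLE SYSTEM ({pct} PERCENT)",
    "databricks": "TABLESAMPLE ({pct} PERCENT)",
}


def build_sampled_count_query(
    warehouse: str, table: str, where_sql: str, sample_pct: int = 10
) -> str:
    """Return SQL that counts sampled rows and the sampled rows matching the filter.

    `table` and `where_sql` are assumed to be trusted, already-compiled SQL;
    this sketch does no escaping or validation of them.
    """
    try:
        clause = SAMPLE_CLAUSES[warehouse.lower()].format(pct=sample_pct)
    except KeyError:
        raise ValueError(f"unsupported warehouse: {warehouse}") from None
    # SUM(CASE ...) is used instead of a WHERE clause so that one query returns
    # both the number of rows sampled and the number of sampled rows that match.
    return (
        "SELECT COUNT(*) AS sample_rows, "
        f"SUM(CASE WHEN {where_sql} THEN 1 ELSE 0 END) AS sample_matches "
        f"FROM {table} {clause}"
    )


print(build_sampled_count_query("bigquery", "users", "lifetime_value > 500"))
```

Returning both counts from one query means the extrapolation in step 4 can use the number of rows actually sampled rather than assuming the sample is exactly 10% of the table.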

Accuracy

Estimation accuracy depends on several factors:

Factor              Impact
Table size          Larger tables produce more accurate estimates because the sample is more representative
Data distribution   Uniformly distributed data estimates better than skewed distributions
Filter selectivity  Conditions that match a very small percentage of rows may be under-represented in the sample
Sampling rate       SignalSmith uses a 10% sample by default, which provides a good balance of speed and accuracy

For most audiences, the estimate is accurate within 10-20% of the actual count. For very selective audiences (matching less than 1% of entities), the estimate may be less precise.
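To see why selectivity matters, the sketch below computes an idealized relative margin of error for a sampled proportion. It assumes simple random row sampling and a normal approximation, neither of which this document states SignalSmith uses; block-level sampling and skewed data generally add error on top of this, so treat the figures as a best case. The function name and the 95% z-value are assumptions made for illustration.

```python
import math


def relative_margin(match_rate: float, sample_size: int, z: float = 1.96) -> float:
    """Approximate relative margin of error for an estimated count.

    Idealized: assumes simple random sampling of rows and ignores the
    finite-population correction. `match_rate` is the true fraction of
    rows that match the audience definition.
    """
    if not 0 < match_rate <= 1:
        raise ValueError("match_rate must be in (0, 1]")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Standard error of a proportion is sqrt(p * (1 - p) / n);
    # dividing by p gives the error relative to the estimate itself.
    return z * math.sqrt((1 - match_rate) / (match_rate * sample_size))


# 5% of rows match, 100,000 rows sampled: margin of roughly 3%.
print(f"{relative_margin(0.05, 100_000):.1%}")
# 0.01% of rows match (about 10 matches expected in the sample): roughly 62%.
print(f"{relative_margin(0.0001, 100_000):.1%}")
```

The relative error grows as the match rate shrinks because the sample contains fewer matching rows to extrapolate from, which is why very selective audiences estimate less precisely.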

Estimation is meant for quick validation during audience building. For exact counts, save the audience and run a full evaluation.

Estimation Results

The estimation result includes:

Field             Description
Estimated Count   The extrapolated number of matching entities
Sample Size       The number of rows actually scanned
Total Population  The total number of entities in the entity type
Percentage        The estimated audience size as a percentage of total population

Example

For a User entity type with 1,000,000 total users:

Estimated Count:  ~52,000 users
Sample Size:      100,000 rows
Total Population: 1,000,000 users
Percentage:       ~5.2%
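The arithmetic behind these figures can be sketched as follows. The class and function names are hypothetical, and the 5,200 sampled matches is an assumed input chosen to be consistent with the example above; the document does not give the raw number of matches in the sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EstimationResult:
    """Mirrors the four result fields described above (names are illustrative)."""
    estimated_count: int
    sample_size: int
    total_population: int
    percentage: float


def extrapolate(
    sample_matches: int, sample_size: int, total_population: int
) -> EstimationResult:
    """Scale the match rate observed in the sample up to the full population."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    match_rate = sample_matches / sample_size
    return EstimationResult(
        estimated_count=round(match_rate * total_population),
        sample_size=sample_size,
        total_population=total_population,
        percentage=round(match_rate * 100, 2),
    )


# Assumed input: 5,200 of the 100,000 sampled rows matched the filter.
result = extrapolate(sample_matches=5_200, sample_size=100_000, total_population=1_000_000)
print(result.estimated_count, result.percentage)  # 52000 5.2
```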

When to Use Estimation

  • During audience building — Estimate after adding or modifying conditions to see how the size changes
  • Before saving — Validate the audience is the right size before committing
  • When iterating — Quickly test different filter thresholds (e.g., “what if I change lifetime_value from >500 to >1000?”)

When Not to Use Estimation

  • For exact counts — Use a full evaluation instead
  • For very small audiences — If you expect fewer than 100 members, the sample may not capture any matches. Preview or full evaluation is more reliable.
  • For billing or compliance — When the exact count matters, always use the evaluated member count

Performance

Estimation queries typically return in 2-10 seconds, depending on:

  • Your warehouse’s query processing speed
  • The complexity of the filter conditions (number of trait joins, nested groups)
  • The size of the underlying tables

This is significantly faster than a full evaluation, which must scan the entire table and materialize the results.

Next Steps