Guardrails

Guardrails define safety boundaries for AI operations in SignalSmith, ensuring the AI agent operates within approved limits.

Overview

Guardrails are configurable policies that control which operations the AI agent may perform on its own, which require human approval, and which are blocked outright. They help prevent accidental data loss, unauthorized activation, and other high-impact mistakes.

Policy Types

Require Approval

Operations matching this policy require human approval before execution. The agent will pause and wait for an admin to approve or deny the action.

Example: Require approval before deleting any audience or triggering a sync to a production destination.

Deny

Operations matching this policy are blocked entirely. The agent cannot perform them even with approval.

Example: Deny deletion of sources or destinations via the agent.

Rate Limit

Operations matching this policy are limited in how often the agent can perform them within a time window.

Example: Limit the agent to triggering at most 5 syncs per hour.
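
The rate-limit behavior described above can be illustrated with a sliding-window counter. This is a minimal sketch, not SignalSmith's implementation: the class name `SlidingWindowRateLimiter` and its parameters are hypothetical, and the product may count operations differently (for example, with fixed hourly buckets).

```python
from collections import deque


class SlidingWindowRateLimiter:
    """Hypothetical helper: allow at most max_calls in any window_seconds span."""

    def __init__(self, max_calls: int, window_seconds: float) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of recently allowed calls, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_calls:
            return False  # limit reached; the operation is rejected
        self._timestamps.append(now)
        return True


# Mirrors the example above: at most 5 sync triggers per hour.
limiter = SlidingWindowRateLimiter(max_calls=5, window_seconds=3600)
results = [limiter.allow(now=float(t)) for t in (0, 10, 20, 30, 40, 50)]
# One hour after the first call, that call has left the window.
later = limiter.allow(now=3600.0)
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would likely supply `time.monotonic()`.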

Configuring Policies

  1. Navigate to AI Policies in the sidebar
  2. Click Create Policy
  3. Select the policy type (require_approval, deny, rate_limit)
  4. Select the resource type (audiences, syncs, destinations, etc.)
  5. Select the operation (create, update, delete, trigger)
  6. Click Save
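
The three selections made in the steps above (policy type, resource type, operation) can be thought of as one small record. The sketch below models that record in Python purely for illustration. The `Policy`, `PolicyType`, and `Operation` names are hypothetical, and nothing here implies that SignalSmith exposes such a programmatic API; only the enumerated values come from the steps above.

```python
from dataclasses import dataclass
from enum import Enum


class PolicyType(Enum):
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"
    RATE_LIMIT = "rate_limit"


class Operation(Enum):
    CREATE = "create"
    UPDATE = "update"
    DELETE = "delete"
    TRIGGER = "trigger"


@dataclass(frozen=True)
class Policy:
    """Hypothetical in-memory form of one configured policy."""

    policy_type: PolicyType
    resource_type: str  # e.g. "audiences", "syncs", "destinations"
    operation: Operation

    def matches(self, resource_type: str, operation: Operation) -> bool:
        # A policy applies when both the resource type and the operation match.
        return self.resource_type == resource_type and self.operation == operation


# Equivalent of choosing require_approval / audiences / delete in the UI.
policy = Policy(PolicyType.REQUIRE_APPROVAL, "audiences", Operation.DELETE)
```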

Default Policies

By default, SignalSmith includes conservative guardrails:

Operation                       Default Policy
Delete any resource             Require approval
Trigger sync to production      Require approval
Modify destination credentials  Deny
Create/update audiences         Allowed
Read/list operations            Allowed
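
As a rough illustration, the default policies can be expressed as a single lookup function. This sketch rests on several assumptions: the function name, the singular resource names, and the `production` and `touches_credentials` flags are hypothetical, and checking the deny rule first is a guess at precedence. Cases the defaults do not mention return "unspecified" rather than guessing.

```python
def default_decision(operation: str, resource: str, *,
                     production: bool = False,
                     touches_credentials: bool = False) -> str:
    """Return the default outcome for one requested action (illustrative only)."""
    # Modify destination credentials -> Deny (checked first; precedence is assumed).
    if resource == "destination" and operation == "update" and touches_credentials:
        return "deny"
    # Delete any resource -> Require approval.
    if operation == "delete":
        return "require_approval"
    # Trigger sync to production -> Require approval.
    if resource == "sync" and operation == "trigger" and production:
        return "require_approval"
    # Read/list operations -> Allowed.
    if operation in ("read", "list"):
        return "allowed"
    # Create/update audiences -> Allowed.
    if resource == "audience" and operation in ("create", "update"):
        return "allowed"
    return "unspecified"  # not covered by the documented defaults


decision = default_decision("trigger", "sync", production=True)
```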

Approval Workflow

When an operation requires approval:

  1. The agent pauses and displays the pending action
  2. A notification appears for workspace admins
  3. An admin reviews the action details
  4. The admin clicks Approve or Deny
  5. If approved, the agent continues execution
  6. If denied, the agent acknowledges the decision and suggests alternatives
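
The workflow above amounts to a small state machine: an action starts as pending and moves exactly once to approved or denied. The sketch below models that under stated assumptions. `PendingAction`, `State`, and the returned strings are hypothetical names chosen for illustration, and notification delivery is left out.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


class PendingAction:
    """Hypothetical record of an action that is waiting on an admin."""

    def __init__(self, description: str) -> None:
        self.description = description
        self.state = State.PENDING  # the agent pauses here (step 1)

    def decide(self, approve: bool) -> str:
        # An action can be decided only once.
        if self.state is not State.PENDING:
            raise ValueError("action already decided")
        self.state = State.APPROVED if approve else State.DENIED
        if approve:
            return "continue execution"  # step 5
        return "acknowledge and suggest alternatives"  # step 6


action = PendingAction("delete audience 'Churned users'")
next_step = action.decide(approve=False)
```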

Audit Log

All AI operations are recorded in the AI Audit Log, which is accessible from the sidebar. Each entry includes:

  • The agent session and message
  • The tool called and parameters
  • Whether approval was required
  • The approval decision (if applicable)
  • The result of the operation
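
For readers who process exported log data, the fields listed above map naturally onto a flat record. The following sketch assumes a shape for such a record. The `AuditEntry` class, its field names, and the sample values are hypothetical and do not describe SignalSmith's actual storage or export format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class AuditEntry:
    """Hypothetical shape of one AI Audit Log entry."""

    session_id: str                   # the agent session
    message: str                      # the message that led to the tool call
    tool: str                         # the tool called
    parameters: Dict[str, Any]        # parameters passed to the tool
    approval_required: bool
    approval_decision: Optional[str]  # None when no approval was needed
    result: str


entry = AuditEntry(
    session_id="session-1",
    message="Remove the unused audience",
    tool="delete_audience",
    parameters={"audience_id": "aud-123"},
    approval_required=True,
    approval_decision="approved",
    result="success",
)

# Serialize to JSON, for example to feed a downstream review script.
serialized = json.dumps(asdict(entry), sort_keys=True)
```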