Message Sanitization

Message Sanitization detects and acts on sensitive information in user messages before they are sent to an AI model. It protects your organization from accidentally leaking PII, credentials, and other confidential data through chat interactions.

The feature is fully configurable per organization — admins can choose which entity types to detect, set confidence thresholds, define actions (allow, redact, or block), and maintain an allow list of safe terms.

Note: Message Sanitization is not enabled by default. To enable it for your organization, contact the Devs.ai team.

How It Works

When you send a message in a chat, the system automatically scans it for sensitive information. Depending on what's detected and how your organization's policy is configured, one of three things happens:

  1. Allow: No sensitive data is detected, or every detection falls below its confidence threshold, so the message passes through unchanged.
  2. Redact: Sensitive data is detected and the policy calls for redaction. The message is sent to the AI model with the sensitive portions replaced by [REDACTED].
  3. Block: High-risk data is detected (e.g., private keys, API keys) and the message is rejected entirely. You'll see an alert listing the entities that were found, and you can click Use redacted version to resubmit the message with the sensitive data removed.

For messages already in a chat, sanitization runs transparently before each AI model call. Any previously unsanitized content is redacted automatically.
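
The three outcomes above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the product's implementation: the class and function names, the entity identifiers other than PRIVATE_KEY, the numeric thresholds, the default threshold, and the "strictest action wins" rule for messages with several detections are all assumptions.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Action(IntEnum):
    # Ordered by severity so that max() picks the strictest outcome.
    ALLOW = 0
    REDACT = 1
    BLOCK = 2


@dataclass(frozen=True)
class Rule:
    threshold: float  # minimum confidence for the rule to apply
    action: Action


@dataclass(frozen=True)
class Detection:
    entity_type: str
    score: float  # detector confidence in [0, 1]


@dataclass
class Policy:
    default_action: Action
    default_threshold: float  # assumed: the docs only mention a default action
    rules: dict[str, Rule] = field(default_factory=dict)

    def action_for(self, detection: Detection) -> Action:
        # Use the per-entity rule if one exists, otherwise the defaults.
        rule = self.rules.get(
            detection.entity_type,
            Rule(self.default_threshold, self.default_action),
        )
        # Detections below the threshold are treated as not sensitive.
        if detection.score < rule.threshold:
            return Action.ALLOW
        return rule.action


def decide(policy: Policy, detections: list[Detection]) -> Action:
    # Assumption: the strictest action among all detections wins.
    return max((policy.action_for(d) for d in detections), default=Action.ALLOW)


policy = Policy(
    default_action=Action.REDACT,
    default_threshold=0.5,
    rules={
        "PRIVATE_KEY": Rule(0.3, Action.BLOCK),
        "PERSON": Rule(0.8, Action.REDACT),
    },
)
print(decide(policy, [Detection("PERSON", 0.6)]).name)       # ALLOW
print(decide(policy, [Detection("EMAIL", 0.9)]).name)        # REDACT
print(decide(policy, [Detection("EMAIL", 0.9),
                      Detection("PRIVATE_KEY", 0.4)]).name)  # BLOCK
```

Ordering the actions by severity keeps the final decision to a single max() call; a real policy engine may combine detections differently.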

Admin Configuration

Organization admins manage the sanitization policy from Settings → Message Sanitization.

Available Controls

  • Enable/disable the feature with a master toggle
  • Apply presets for quick configuration:
    • Strict — Blocks government IDs and secrets, redacts everything else. Low detection thresholds.
    • Balanced — Redacts most PII, blocks only high-risk secrets. Medium thresholds.
    • Internal-Friendly — Higher thresholds. Blocks only secrets, redacts PII only at high confidence.
  • Set a default action (Allow, Redact, or Block) for entities without specific rules
  • Configure per-entity rules with custom confidence thresholds and actions
  • Maintain an allow list of terms that should bypass detection (e.g., company names that might be flagged as person names)
  • Browse the entity catalog showing all 50+ supported entity types grouped by region
  • Test the policy using the built-in test console before saving
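
To make the allow list behavior concrete, here is a minimal Python sketch of the case-insensitive substring matching described under Limitations. The function name is hypothetical, and it assumes the allow list is compared against the text of each detected entity rather than against the whole message.

```python
def is_allow_listed(detected_text: str, allow_list: list[str]) -> bool:
    """Return True if any allow list term occurs inside the detected text."""
    haystack = detected_text.casefold()
    # Empty terms are skipped; otherwise "" would match every detection.
    return any(term.casefold() in haystack for term in allow_list if term)


allow_list = ["john"]
print(is_allow_listed("John", allow_list))     # True
print(is_allow_listed("Johnson", allow_list))  # True, because "john" is a substring
print(is_allow_listed("Jane", allow_list))     # False
```

Because short terms match inside longer words, longer and more specific allow list entries are safer than short ones.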

Testing Your Policy

Before saving changes, you can test your policy without affecting real conversations:

  1. Navigate to Settings → Message Sanitization.
  2. Configure your desired policy.
  3. In the Test Console, enter sample text (e.g., "My SSN is 123-45-6789 and email is john@example.com").
  4. Click Run Test to see which entities are detected, their confidence scores, and the resulting action (allow, redact, or block).

Supported Entity Types

The system detects 50+ entity types across these categories:

| Region | Entity Types |
| --- | --- |
| Global | Email, Phone, Person Name, Organization, Location, Date/Time, URL, Credit Card, Crypto Wallet, IBAN, IP Address, MAC Address, Medical License |
| USA | SSN, Bank Number, Driver License, ITIN, Passport, Medicare ID, NPI |
| UK | NHS Number, NINO |
| Europe | Spanish NIF/NIE, Italian Fiscal Code/Driver License/VAT/Passport/ID, Polish PESEL, Finnish Identity Code |
| Asia-Pacific | Singapore NRIC/UEN, Australian ABN/ACN/TFN/Medicare, Indian PAN/Aadhaar/Passport/GSTIN, Korean identifiers, Thai TNIN |
| Secrets | API Keys, Access Tokens, Passwords, Private Keys |

What Users See

Redacted Message

When a message is redacted, the sensitive parts are replaced with [REDACTED] before the AI sees them. The conversation continues normally — you may notice redacted placeholders if you review the message.
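
As a rough illustration of the replacement step, the Python sketch below substitutes [REDACTED] for each detected character span. It assumes the detector reports (start, end) offsets and that overlapping or touching spans collapse into a single placeholder; neither detail is specified by the product documentation.

```python
PLACEHOLDER = "[REDACTED]"


def redact(text: str, spans: list[tuple[int, int]]) -> str:
    """Replace each (start, end) character span with the placeholder."""
    # Merge overlapping or touching spans so no placeholder is emitted twice.
    merged: list[tuple[int, int]] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    # Copy the untouched text between spans and insert placeholders.
    parts = []
    cursor = 0
    for start, end in merged:
        parts.append(text[cursor:start])
        parts.append(PLACEHOLDER)
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)


text = "My SSN is 123-45-6789 and email is john@example.com"
ssn = text.index("123-45-6789")
email = text.index("john@example.com")
spans = [(ssn, ssn + len("123-45-6789")), (email, email + len("john@example.com"))]
print(redact(text, spans))  # My SSN is [REDACTED] and email is [REDACTED]
```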

Blocked Message

When a message is blocked, you'll see a red alert with:

  • A "Message Blocked" heading
  • Badges showing which entity types were detected (e.g., PRIVATE_KEY)
  • A Use redacted version button that resubmits your message with the sensitive content removed

Limitations

  • Detection works best with English text. Other languages may have reduced entity coverage.
  • The allow list uses case-insensitive substring matching — an allow list entry of "john" would also allow "johnson" to pass through.
  • Only text content in messages is evaluated. Images are not scanned.
  • If the detection service is temporarily unavailable, the system continues to operate with basic pattern-based detection only (covering private keys, API keys, tokens, and passwords).
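
The fallback behavior in the last limitation can be pictured with the following Python sketch. It is an assumption-laden illustration: the actual fallback patterns are not documented, so the two regular expressions here (a PEM private key header and a simple "password: value" assignment) are stand-ins, and the function names and the use of ConnectionError to signal an unavailable service are hypothetical.

```python
import re
from typing import Callable

# Illustrative patterns only; the real fallback pattern set is not documented.
FALLBACK_PATTERNS = {
    "PRIVATE_KEY": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
    "PASSWORD": re.compile(r"\bpassword\s*[:=]\s*\S+", re.IGNORECASE),
}


def pattern_detect(text: str) -> list[str]:
    """Return the names of all fallback patterns found in the text."""
    return [name for name, pattern in FALLBACK_PATTERNS.items()
            if pattern.search(text)]


def detect(text: str, full_detector: Callable[[str], list[str]]) -> list[str]:
    """Prefer the full detector; use pattern matching if it is unreachable."""
    try:
        return full_detector(text)
    except ConnectionError:
        # Reduced coverage: only secrets-style patterns are checked.
        return pattern_detect(text)


def unavailable_detector(text: str) -> list[str]:
    # Stand-in for a detection service that cannot be reached.
    raise ConnectionError("detection service unavailable")


print(detect("-----BEGIN RSA PRIVATE KEY-----", unavailable_detector))  # ['PRIVATE_KEY']
print(detect("Email me at john@example.com", unavailable_detector))     # []
```

The second call shows the practical consequence of the limitation: while the detection service is down, PII such as email addresses is not caught by the pattern-based fallback.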