Message Sanitization
Message Sanitization detects and acts on sensitive information in user messages before they are sent to an AI model. It protects your organization from accidentally leaking PII, credentials, and other confidential data through chat interactions.
The feature is fully configurable per organization — admins can choose which entity types to detect, set confidence thresholds, define actions (allow, redact, or block), and maintain an allow list of safe terms.
Message Sanitization is not enabled by default. To enable it for your organization, contact the Devs.ai team.
How It Works
When you send a message in a chat, the system automatically scans it for sensitive information. Depending on what's detected and how your organization's policy is configured, one of three things happens:
- Allow: No sensitive data is detected (or everything is below thresholds), so the message passes through unchanged.
- Redact: Sensitive data is detected and the policy says to redact it. The message is sent to the AI with sensitive portions replaced by [REDACTED].
- Block: High-risk data is detected (e.g., private keys, API keys). The message is rejected entirely. You'll see an alert showing which entities were found, and you can choose Use redacted version to resubmit with sensitive data removed.
For messages already in a chat, sanitization runs transparently before each AI model call. Any previously unsanitized content is redacted automatically.
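One way to picture the three outcomes is as a severity ordering, where the strictest action triggered by any detected entity decides what happens to the whole message. The sketch below is illustrative only: the `Action` enum and `resolve_message_action` function are hypothetical names, and the "strictest action wins" rule is an assumption, not a documented detail of the product.

```python
from enum import IntEnum


class Action(IntEnum):
    # Ordered by severity so that max() picks the strictest action.
    ALLOW = 0
    REDACT = 1
    BLOCK = 2


def resolve_message_action(entity_actions):
    """Return the strictest action among the per-entity actions.

    An empty list (nothing detected) resolves to ALLOW.
    """
    return max(entity_actions, default=Action.ALLOW)


print(resolve_message_action([]).name)                             # ALLOW
print(resolve_message_action([Action.ALLOW, Action.REDACT]).name)  # REDACT
print(resolve_message_action([Action.REDACT, Action.BLOCK]).name)  # BLOCK
```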
Admin Configuration
Organization admins manage the sanitization policy from Settings → Message Sanitization.
Available Controls
- Enable/disable the feature with a master toggle
- Apply presets for quick configuration:
  - Strict: Blocks government IDs and secrets, redacts everything else. Low detection thresholds.
  - Balanced: Redacts most PII, blocks only high-risk secrets. Medium thresholds.
  - Internal-Friendly: Higher thresholds. Blocks only secrets, redacts PII only at high confidence.
- Set a default action (Allow, Redact, or Block) for entities without specific rules
- Configure per-entity rules with custom confidence thresholds and actions
- Maintain an allow list of terms that should bypass detection (e.g., company names that might be flagged as person names)
- Browse the entity catalog showing all 50+ supported entity types grouped by region
- Test the policy using the built-in test console before saving
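To make the relationship between the default action, per-entity rules, and confidence thresholds concrete, here is a minimal sketch of how such a policy could be modeled. Everything in it is an assumption for illustration: the `Policy` and `EntityRule` classes, the field names, the threshold values, and the `EMAIL` and `PHONE` identifiers are hypothetical and do not reflect the product's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRule:
    threshold: float  # minimum confidence for the rule to apply (hypothetical)
    action: str       # "allow", "redact", or "block"


@dataclass
class Policy:
    enabled: bool = True              # master toggle
    default_action: str = "redact"    # used for entities without a rule
    default_threshold: float = 0.5    # assumed fallback threshold
    rules: dict = field(default_factory=dict)       # entity type -> EntityRule
    allow_list: list = field(default_factory=list)  # terms that bypass detection


def action_for(policy, entity_type, score):
    """Pick the action for one detected entity under the given policy."""
    if not policy.enabled:
        return "allow"
    rule = policy.rules.get(entity_type)
    threshold = rule.threshold if rule else policy.default_threshold
    if score < threshold:
        # Detections below the threshold are treated as not detected.
        return "allow"
    return rule.action if rule else policy.default_action


policy = Policy(rules={
    "PRIVATE_KEY": EntityRule(threshold=0.3, action="block"),
    "EMAIL": EntityRule(threshold=0.6, action="redact"),
})

print(action_for(policy, "PRIVATE_KEY", 0.9))  # block
print(action_for(policy, "EMAIL", 0.4))        # allow (below threshold)
print(action_for(policy, "PHONE", 0.8))        # redact (default action)
```

In this model a preset would simply be a prefilled `Policy` with lower or higher thresholds.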
Testing Your Policy
Before saving changes, you can test your policy without affecting real conversations:
- Navigate to Settings → Message Sanitization.
- Configure your desired policy.
- In the Test Console, enter sample text (e.g., "My SSN is 123-45-6789 and email is john@example.com").
- Click Run Test to see which entities are detected, their confidence scores, and the resulting action (allow, redact, or block).
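For a rough sense of what a test run reports, the sketch below scans the sample text with two simple regular expressions. It is a simplified stand-in, not the product's detector: the `PATTERNS` table, the `US_SSN` and `EMAIL` labels, and the `detect` function are hypothetical, and a pure regex approach does not produce the confidence scores shown in the Test Console.

```python
import re

# Hypothetical patterns for illustration; real detection is likely richer.
PATTERNS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def detect(text):
    """Return (entity_type, matched_text, start, end) tuples in text order."""
    findings = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append(
                (entity_type, match.group(), match.start(), match.end())
            )
    return sorted(findings, key=lambda finding: finding[2])


sample = "My SSN is 123-45-6789 and email is john@example.com"
for entity_type, value, start, end in detect(sample):
    print(f"{entity_type}: {value!r} at [{start}, {end})")
```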
Supported Entity Types
The system detects 50+ entity types across these categories:
| Region | Entity Types |
|---|---|
| Global | Email, Phone, Person Name, Organization, Location, Date/Time, URL, Credit Card, Crypto Wallet, IBAN, IP Address, MAC Address, Medical License |
| USA | SSN, Bank Number, Driver License, ITIN, Passport, Medicare ID, NPI |
| UK | NHS Number, NINO |
| Europe | Spanish NIF/NIE, Italian Fiscal Code/Driver License/VAT/Passport/ID, Polish PESEL, Finnish Identity Code |
| Asia-Pacific | Singapore NRIC/UEN, Australian ABN/ACN/TFN/Medicare, Indian PAN/Aadhaar/Passport/GSTIN, Korean identifiers, Thai TNIN |
| Secrets | API Keys, Access Tokens, Passwords, Private Keys |
What Users See
Redacted Message
When a message is redacted, the sensitive parts are replaced with [REDACTED] before the AI sees them. The conversation continues normally — you may notice redacted placeholders if you review the message.
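The replacement step can be sketched as a function that swaps each detected character span for the placeholder. This is an illustrative sketch under stated assumptions: the `redact` function and its `(start, end)` span format are hypothetical, and the spans are assumed not to overlap.

```python
def redact(text, spans, placeholder="[REDACTED]"):
    """Replace each (start, end) span in text with the placeholder.

    Spans are applied right to left so that earlier offsets stay valid
    after each replacement. Spans are assumed not to overlap.
    """
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + placeholder + text[end:]
    return text


message = "My SSN is 123-45-6789 and email is john@example.com"
print(redact(message, [(10, 21), (35, 51)]))
# My SSN is [REDACTED] and email is [REDACTED]
```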
Blocked Message
When a message is blocked, you'll see a red alert with:
- A "Message Blocked" heading
- Badges showing which entity types were detected (e.g., PRIVATE_KEY)
- A Use redacted version button that resubmits your message with the sensitive content removed
Limitations
- Detection works best with English text. Other languages may have reduced entity coverage.
- The allow list uses case-insensitive substring matching — an allow list entry of "john" would also allow "johnson" to pass through.
- Only text content in messages is evaluated. Images are not scanned.
- If the detection service is temporarily unavailable, the system continues to operate with basic pattern-based detection only (covering private keys, API keys, tokens, and passwords).
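The allow list behavior described above is worth checking before you add short terms. A minimal sketch of case-insensitive substring matching follows; the `is_allowed` function is a hypothetical illustration of the described behavior, not the product's code.

```python
def is_allowed(detected_text, allow_list):
    """Return True if any allow list term occurs inside the detected text.

    Matching is case-insensitive and substring-based, so short terms
    can match more than intended. Empty terms are ignored.
    """
    candidate = detected_text.casefold()
    return any(term.casefold() in candidate for term in allow_list if term)


print(is_allowed("John", ["john"]))     # True
print(is_allowed("Johnson", ["john"]))  # True, because "john" is a substring
print(is_allowed("Jane", ["john"]))     # False
```

Under this kind of matching, longer and more specific allow list entries are less likely to let unintended values through.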