Message sanitization
Message sanitization detects and acts on sensitive information in user messages before they are sent to an AI model. It helps protect your organization from accidentally sharing PII, credentials, and other confidential data in chat.
Organization admins configure the feature per organization: which entity types to detect, confidence thresholds, actions (allow, redact, or block), and an allow list of safe terms.
Message sanitization is not enabled by default. To enable it for your organization, contact the Devs.ai team.
How it works
When you send a message in a chat, the system scans it for sensitive information. Depending on what is detected and how your organization's policy is set, one of three outcomes applies:
- Allow: No sensitive data is detected (or everything is below thresholds), so the message passes through unchanged.
- Redact: Sensitive data is detected and the policy requires redaction. The message is sent to the model with sensitive portions replaced by [REDACTED].
- Block: High-risk data is detected (for example, private keys or API keys). The message is rejected. You see an alert listing which entities were found and can choose Use redacted version to resubmit with sensitive content removed.
For messages already in a thread, sanitization runs before each model call. Previously unsanitized content is redacted automatically.
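The three outcomes can be sketched in a few lines of Python. This is an illustration only, not the product's implementation: the `Detection` and `Rule` types, the `sanitize` function, and the entity labels are hypothetical names, and the sketch assumes the detector returns non-overlapping character spans with a confidence score between 0 and 1.

```python
from dataclasses import dataclass

PLACEHOLDER = "[REDACTED]"


@dataclass
class Detection:
    entity_type: str   # hypothetical label such as "EMAIL"
    start: int         # span start offset in the message
    end: int           # span end offset (exclusive)
    confidence: float  # detector score, assumed to be in [0, 1]


@dataclass
class Rule:
    action: str        # "allow", "redact", or "block"
    threshold: float   # minimum confidence for the rule to apply


def sanitize(text, detections, rules, default_rule):
    """Return (outcome, redacted_text) for one message."""
    outcome = "allow"
    spans = []
    for d in detections:
        # Entities without a specific rule fall back to the default rule.
        rule = rules.get(d.entity_type, default_rule)
        if rule.action == "allow" or d.confidence < rule.threshold:
            continue  # explicitly allowed, or below the threshold
        spans.append((d.start, d.end))
        if rule.action == "block":
            outcome = "block"      # block wins over redact
        elif outcome == "allow":
            outcome = "redact"
    # Replace from the end of the text so earlier offsets stay valid.
    redacted = text
    for start, end in sorted(spans, reverse=True):
        redacted = redacted[:start] + PLACEHOLDER + redacted[end:]
    return outcome, redacted
```

In this sketch a single block-level detection makes the whole message a block, and the redacted text returned alongside it stands in for what a Use redacted version step could resubmit.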
Admin configuration
Organization admins manage the policy under Settings → Message Sanitization.
Available controls
- Enable or disable the feature with a master toggle
- Apply presets for quick configuration:
  - Strict: Blocks government IDs and secrets; redacts everything else. Low detection thresholds.
  - Balanced: Redacts most PII; blocks only high-risk secrets. Medium thresholds.
  - Internal-Friendly: Higher thresholds; blocks only secrets; redacts PII only at high confidence.
- Set a default action (Allow, Redact, or Block) for entities without specific rules
- Configure per-entity rules with custom confidence thresholds and actions
- Maintain an allow list of terms that bypass detection (for example, company names that might be flagged as person names)
- Browse the entity catalog of 50+ supported entity types, grouped by region
- Test the policy in the built-in test console before saving
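To make the relationship between these controls concrete, here is one way a policy could be modelled in Python. Everything in it is an assumption for illustration: the `Policy` and `EntityRule` names, the field names, the entity labels, and the numeric thresholds are placeholders, not the product's actual schema or preset values.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRule:
    action: str       # "allow", "redact", or "block"
    threshold: float  # placeholder confidence cutoff


@dataclass
class Policy:
    enabled: bool = False            # master toggle
    default_action: str = "redact"   # used for entities without a rule
    default_threshold: float = 0.5   # placeholder value
    entity_rules: dict = field(default_factory=dict)
    allow_list: list = field(default_factory=list)

    def rule_for(self, entity_type):
        """Return the per-entity rule, or the default when none is set."""
        if not self.enabled:
            # Feature switched off: nothing is redacted or blocked.
            return EntityRule("allow", 1.0)
        return self.entity_rules.get(
            entity_type,
            EntityRule(self.default_action, self.default_threshold),
        )


# Loosely modelled on the Balanced preset; the numbers are illustrative only.
balanced_like = Policy(
    enabled=True,
    default_action="redact",
    default_threshold=0.5,
    entity_rules={
        "PRIVATE_KEY": EntityRule("block", 0.5),
        "API_KEY": EntityRule("block", 0.5),
    },
    allow_list=["Acme Corp"],  # hypothetical company name
)
```

Here `balanced_like.rule_for("PRIVATE_KEY")` resolves to the block rule, while an entity with no specific rule, such as `"EMAIL"`, falls back to the default action.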
Testing your policy
You can test the policy without affecting real conversations:
- Go to Settings → Message Sanitization.
- Configure the policy you want.
- In the Test Console, enter sample text (for example, "My SSN is 123-45-6789 and email is john@example.com").
- Click Run Test to see detected entities, confidence scores, and the resulting action (allow, redact, or block).
Supported entity types
The system detects 50+ entity types across these categories:
| Region | Entity types |
|---|---|
| Global | Email, Phone, Person Name, Organization, Location, Date/Time, URL, Credit Card, Crypto Wallet, IBAN, IP Address, MAC Address, Medical License |
| USA | SSN, Bank Number, Driver License, ITIN, Passport, Medicare ID, NPI |
| UK | NHS Number, NINO |
| Europe | Spanish NIF/NIE, Italian Fiscal Code/Driver License/VAT/Passport/ID, Polish PESEL, Finnish Identity Code |
| Asia-Pacific | Singapore NRIC/UEN, Australian ABN/ACN/TFN/Medicare, Indian PAN/Aadhaar/Passport/GSTIN, Korean identifiers, Thai TNIN |
| Secrets | API Keys, Access Tokens, Passwords, Private Keys |
What users see
Redacted messages
When a message is redacted, sensitive segments are replaced with [REDACTED] before the model sees them. The conversation continues; you may see placeholders if you read the message back.
Blocked messages
When a message is blocked, you see a red alert with:
- A "Message Blocked" heading
- Badges for the entity types detected (for example, PRIVATE_KEY)
- A Use redacted version button that resubmits your message with sensitive content removed
Limitations
- Detection works best on English text. Other languages may have reduced entity coverage.
- The allow list uses case-insensitive substring matching: an entry like "john" would also match inside "johnson".
- Only text in messages is evaluated; images are not scanned.
- If the detection service is temporarily unavailable, the system falls back to basic pattern-based detection (private keys, API keys, tokens, and passwords).
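The allow-list limitation above is easier to see in code. The following Python sketch shows case-insensitive substring matching in general; the function name `is_allow_listed` is hypothetical, and it assumes entries are compared against the text of each detected entity, which the product may or may not do in exactly this way.

```python
def is_allow_listed(detected_text, allow_list):
    """True if any non-empty allow-list entry occurs inside the text, ignoring case."""
    haystack = detected_text.casefold()
    # Empty entries are skipped because "" is a substring of every string.
    return any(entry.casefold() in haystack for entry in allow_list if entry)


print(is_allow_listed("Johnson", ["john"]))     # True
print(is_allow_listed("Mary Smith", ["john"]))  # False
```

Because short entries match inside longer words, longer and more specific entries (a full company name rather than a single common word) are less likely to exempt text you did not intend to exempt.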