MCP Payload Compression
When your AI agent uses MCP (Model Context Protocol) tools that return large amounts of data — such as long lists of records, slides, or issues — the platform automatically compresses those responses to fit within the AI model's context window. This happens behind the scenes so conversations can continue smoothly even when tools return more data than the model can handle at once.
How It Works
When an MCP tool returns output, the system checks whether the response is within acceptable size limits. If the output is small enough, it passes through unchanged.
If the output is too large (typically over 120,000 tokens), the system:
- Identifies large arrays in the JSON response.
- Compresses each section using AI-based summarization, preserving the most important information.
- If the result is still too large after summarization, it trims arrays further to fit within the token budget.
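The steps above can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the helper names (`compress`, `summarize_items`, `estimate_tokens`), the ~4-characters-per-token heuristic, and the starting chunk size of 50 are all assumptions, and the real system uses AI-based summarization where this sketch simply keeps the leading items and notes how many were omitted.

```python
import json

TOKEN_BUDGET = 120_000  # illustrative threshold from the text

def estimate_tokens(payload: str) -> int:
    # Rough heuristic (assumption): ~4 characters per token.
    return len(payload) // 4

def summarize_items(items: list, keep: int) -> list:
    # Stand-in for AI-based summarization: keep the first `keep`
    # items and record how many were dropped.
    if len(items) <= keep:
        return items
    return items[:keep] + [{"_summary": f"{len(items) - keep} more items omitted"}]

def compress(payload: dict, budget: int = TOKEN_BUDGET) -> dict:
    # Small enough: pass through unchanged.
    if estimate_tokens(json.dumps(payload)) <= budget:
        return payload
    keep = 50  # hypothetical starting chunk size
    while estimate_tokens(json.dumps(payload)) > budget and keep >= 1:
        # Identify large arrays and shrink them; if the result is
        # still too large, trim further on the next pass.
        for key, value in payload.items():
            if isinstance(value, list) and len(value) > keep:
                payload[key] = summarize_items(value, keep)
        keep //= 2
    return payload
```

Note that each pass halves the number of items kept, so the loop terminates even when summarization alone cannot reach the budget.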
The AI model is informed when compression has occurred, so it can decide whether it has enough information to answer your question or whether it should request more specific data.
What You'll Notice
In most cases, compression is invisible to you. The AI handles it automatically and adjusts its approach if needed. You might notice:
- The AI asking a more specific follow-up question after receiving compressed data (e.g., "I found 200 slides. Which ones would you like me to focus on?").
- Slightly summarized details in tool results when the original data was very large.
When Compression Happens
Compression is triggered only when:
- An MCP tool returns a JSON payload that exceeds the token budget.
- The payload contains arrays that can be chunked and summarized.
Small tool responses, non-JSON responses, and responses within the token budget are never modified.
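These trigger conditions can be expressed as a single predicate. Again a sketch under assumptions: the function name `should_compress`, the ~4-characters-per-token estimate, and the exact short-circuit order are illustrative, not the platform's API.

```python
import json

def should_compress(raw: str, budget: int = 120_000) -> bool:
    # Non-JSON responses are never modified.
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return False
    # Responses within the token budget pass through unchanged
    # (rough ~4 chars/token estimate, an assumption).
    if len(raw) // 4 <= budget:
        return False
    # Only payloads containing at least one array can be chunked.
    return _has_array(payload)

def _has_array(node) -> bool:
    # Recursively look for any JSON array in the payload.
    if isinstance(node, list):
        return True
    if isinstance(node, dict):
        return any(_has_array(v) for v in node.values())
    return False
```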
Limitations
- Compression prioritizes fitting within the token budget over preserving every item in a large array. Some items may be trimmed.
- Compression is best-effort — in rare cases with very dense non-array content, the result may still be larger than ideal.
- As a mitigation, the AI model is always informed when compression has occurred, so it can request more specific data if needed.