MCP Payload Compression
When your AI agent uses MCP (Model Context Protocol) tools that return large amounts of data — such as long lists of records, slides, or issues — the platform automatically compresses those responses to fit within the AI model's context window. This happens behind the scenes so conversations can continue smoothly even when tools return more data than the model can handle at once.
How It Works
When an MCP tool returns output, the system checks whether the response is within acceptable size limits. If the output is small enough, it passes through unchanged.
If the output is too large (typically over 120,000 tokens), the system:
- Identifies large arrays in the JSON response.
- Compresses each section using AI-based summarization, preserving the most important information.
- If the result is still too large after summarization, it trims arrays further to fit within the token budget.
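The steps above can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the helper names (`compress`, `summarize_items`, `estimate_tokens`), the ~4-characters-per-token heuristic, and the starting chunk size of 50 are all assumptions, and the real system uses AI-based summarization where this sketch simply keeps the leading items and notes how many were omitted.

```python
import json

TOKEN_BUDGET = 120_000  # illustrative threshold from the text

def estimate_tokens(payload: str) -> int:
    # Rough heuristic (assumption): ~4 characters per token.
    return len(payload) // 4

def summarize_items(items: list, keep: int) -> list:
    # Stand-in for AI-based summarization: keep the first `keep`
    # items and record how many were dropped.
    if len(items) <= keep:
        return items
    return items[:keep] + [{"_summary": f"{len(items) - keep} more items omitted"}]

def compress(payload: dict, budget: int = TOKEN_BUDGET) -> dict:
    # Small enough: pass through unchanged.
    if estimate_tokens(json.dumps(payload)) <= budget:
        return payload
    keep = 50  # hypothetical starting chunk size
    while estimate_tokens(json.dumps(payload)) > budget and keep >= 1:
        # Identify large arrays and shrink them; if the result is
        # still too large, trim further on the next pass.
        for key, value in payload.items():
            if isinstance(value, list) and len(value) > keep:
                payload[key] = summarize_items(value, keep)
        keep //= 2
    return payload
```

Note that each pass halves the number of items kept, so the loop terminates even when summarization alone cannot reach the budget.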
The AI model is informed when compression has occurred, so it can decide whether it has enough information to answer your question or whether it should request more specific data.
What You'll Notice
In most cases, compression is invisible to you. The AI handles it automatically and adjusts its approach if needed. You might notice:
- The AI asking a more specific follow-up question after receiving compressed data (e.g., "I found 200 slides. Which ones would you like me to focus on?").
- Slightly summarized details in tool results when the original data was very large.
When Compression Happens
Compression is triggered only when:
- An MCP tool returns a JSON payload that exceeds the token budget.
- The payload contains arrays that can be chunked and summarized.
Small tool responses, non-JSON responses, and responses within the token budget are never modified.
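These trigger conditions can be expressed as a single predicate. Again a sketch under assumptions: the function name `should_compress`, the ~4-characters-per-token estimate, and the exact short-circuit order are illustrative, not the platform's API.

```python
import json

def should_compress(raw: str, budget: int = 120_000) -> bool:
    # Non-JSON responses are never modified.
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return False
    # Responses within the token budget pass through unchanged
    # (rough ~4 chars/token estimate, an assumption).
    if len(raw) // 4 <= budget:
        return False
    # Only payloads containing at least one array can be chunked.
    return _has_array(payload)

def _has_array(node) -> bool:
    # Recursively look for any JSON array in the payload.
    if isinstance(node, list):
        return True
    if isinstance(node, dict):
        return any(_has_array(v) for v in node.values())
    return False
```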
Limitations
- Compression prioritizes fitting within the token budget over preserving every item in a large array. Some items may be trimmed.
- Compression is best-effort — in rare cases with very dense non-array content, the result may still be larger than ideal.
- As a mitigation, the AI model is always informed when compression has occurred, so it can request more specific data if needed.