Threadless Completions
Threadless completions let you use the OpenAI-compatible chat completions endpoint (POST /api/v1/chat/completions) without creating persistent conversation threads. Each request is stateless — you send the full conversation history in the request and receive a response, just like the standard OpenAI API.
This is ideal for programmatic integrations, automation workflows, and any use case where you don't need persistent conversation threads.
How It Works
- Send a POST /api/v1/chat/completions request with a messages array and a model (model ID or AI agent ID).
- The platform processes the request using the same LLM pipeline as regular chats.
- The response is returned as either a streaming SSE response or a collected JSON response, depending on the stream flag.
No Chat object is created in the system. The entire conversation state is reconstructed from your request payload on every call.
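Because the server keeps no state, the client's only job is to assemble a complete payload on every call. A minimal Python sketch of that request construction (the base URL and token here are placeholders; only the /api/v1/chat/completions path comes from this document):

```python
import json

def build_completion_request(base_url, token, model, messages, stream=False):
    """Build a threadless completion request (no chat thread is created).

    base_url and token are deployment-specific placeholders; the
    /api/v1/chat/completions path is the documented endpoint.
    """
    url = f"{base_url}/api/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # The full conversation history travels in every request body;
    # nothing is reconstructed from server-side storage.
    body = json.dumps({"model": model, "messages": messages, "stream": stream})
    return url, headers, body

url, headers, body = build_completion_request(
    "https://example.invalid", "sk-test", "gpt-4.1",
    [{"role": "user", "content": "Hello"}],
)
```

The tuple can then be passed to any HTTP client; the key point is that each call is self-contained.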
Basic Usage
Simple Completion
curl -X POST "$BASE_URL/api/v1/chat/completions" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4.1",
"messages": [{ "role": "user", "content": "Hello" }]
}'
Streaming
Set "stream": true to receive the response as Server-Sent Events:
curl -X POST "$BASE_URL/api/v1/chat/completions" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4.1",
"messages": [{ "role": "user", "content": "Hello" }],
"stream": true
}'
Tool Calls (Multi-Turn)
Threadless completions support tool calling for multi-turn interactions.
Step 1: Send a prompt with tool definitions.
{
"model": "gpt-4.1",
"messages": [{ "role": "user", "content": "What is the weather in NYC?" }],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather",
"parameters": { "type": "object", "properties": { "city": { "type": "string" } } }
}
}]
}
Step 2: The response includes tool_calls. Execute the tool locally, then send the full history back:
{
"model": "gpt-4.1",
"messages": [
{ "role": "user", "content": "What is the weather in NYC?" },
{ "role": "assistant", "content": null, "tool_calls": [{ "id": "call_abc", "type": "function", "function": { "name": "get_weather", "arguments": "{\"city\":\"NYC\"}" } }] },
{ "role": "tool", "tool_call_id": "call_abc", "content": "72°F, sunny" }
],
"tools": [{ "type": "function", "function": { "name": "get_weather", "description": "Get current weather", "parameters": { "type": "object", "properties": { "city": { "type": "string" } } } }]
}
The AI produces a final response incorporating the tool output.
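Since no thread exists server-side, the client is responsible for splicing the assistant's tool_calls turn and the tool results into the history before the follow-up request. A sketch of that step, with a hypothetical local get_weather implementation standing in for your real tool:

```python
import json

def get_weather(city):
    # Hypothetical local tool; substitute your real implementation.
    return "72°F, sunny"

TOOLS = {"get_weather": get_weather}

def handle_tool_calls(messages, assistant_msg):
    """Append the assistant's tool_calls turn plus one tool result per call.

    The caller then re-sends the full messages list in the next request;
    the server retains nothing between calls.
    """
    messages.append(assistant_msg)
    for call in assistant_msg["tool_calls"]:
        args = json.loads(call["function"]["arguments"])
        result = TOOLS[call["function"]["name"]](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # links the result to its call
            "content": result,
        })
    return messages

history = [{"role": "user", "content": "What is the weather in NYC?"}]
assistant_turn = {  # shape taken from the Step 2 example above
    "role": "assistant", "content": None,
    "tool_calls": [{"id": "call_abc", "type": "function",
                    "function": {"name": "get_weather",
                                 "arguments": "{\"city\":\"NYC\"}"}}],
}
history = handle_tool_calls(history, assistant_turn)
```

After this, history matches the Step 2 payload and can be sent back with the same tools array.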
Request Format
{
"model": "string — model ID or AI agent ID",
"messages": [
{ "role": "user", "content": "string or content parts array" },
{ "role": "assistant", "content": "string", "tool_calls": [...] },
{ "role": "tool", "tool_call_id": "string", "content": "string" }
],
"tools": [{ "type": "function", "function": { "name": "...", "description": "...", "parameters": {} } }],
"stream": false,
"parallel_tool_calls": true
}
Authorization: Requires an API key with CHATS:WRITE permission.
When to Use Threadless Completions
- Automation pipelines — Process data through an AI model without creating chat history.
- One-off queries — Get a quick answer without cluttering the conversation list.
- Custom integrations — Build your own conversation management on top of the stateless API.
- Tool-calling workflows — Run multi-turn tool interactions where you manage state client-side.
Differences from Regular Chats
| Aspect | Threadless Completions | Regular Chats |
|---|---|---|
| Conversation persistence | No — stateless per request | Yes — persistent chat thread |
| Chat list visibility | Not shown in chat list | Visible in chat list |
| Chat features | Not available (pinning, naming, search, export) | Full feature set |
| State management | Client manages conversation history | Server manages history |
| API compatibility | OpenAI-compatible format | Platform-specific format |
Limitations
- No persistent conversation history — Messages are not grouped under a chat. They cannot be listed or retrieved as a conversation.
- No chat-level features — Features tied to conversations (pinning, naming, summarization, compression, search, export) do not apply.
- Client-managed state — You must include the full conversation history in each request. The server does not retain state between calls.
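The client-managed-state limitation is also what makes custom conversation management possible. A minimal sketch of a client-side wrapper (names are illustrative, not part of the API) that owns the transcript and emits a complete payload per turn:

```python
class StatelessConversation:
    """Client-side history for the threadless completions endpoint.

    The server retains nothing between calls, so the client owns the
    transcript and sends it whole in every request payload.
    """

    def __init__(self, model):
        self.model = model
        self.messages = []

    def user(self, text):
        self.messages.append({"role": "user", "content": text})
        return self  # allow chaining

    def assistant(self, text):
        # Record the reply from the previous response before the next turn.
        self.messages.append({"role": "assistant", "content": text})
        return self

    def payload(self, stream=False):
        # Snapshot the entire history; every call carries it all.
        return {"model": self.model, "messages": list(self.messages),
                "stream": stream}

conv = StatelessConversation("gpt-4.1").user("Hello")
first = conv.payload()
conv.assistant("Hi! How can I help?").user("Tell me a joke")
second = conv.payload()
```

Each payload is independently valid against the request format above, so features like pinning or search, if needed, must be built on top of this client-side record.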