Threadless completions
Threadless completions use the OpenAI-compatible chat completions endpoint (POST /api/v1/chat/completions) without creating persistent chat threads. Each call is stateless: you send the full message history in the request and get a response, like the standard OpenAI API.
Use this for integrations, automation, and cases where you do not need server-side conversation storage.
How it works
- Send POST /api/v1/chat/completions with a messages array and a model (model ID or AI agent ID).
- The platform runs the same LLM path as regular chats.
- The response is either streaming (SSE) or a full JSON body, depending on stream.
No Chat record is created. Conversation state comes only from your request body on each call.
Basic usage
Simple completion
```bash
curl -X POST /api/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
```
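Because the request and response follow the OpenAI chat completions format, an existing OpenAI-compatible client can usually be pointed at the endpoint as well. The sketch below makes the same call in Python; the base_url host is a placeholder for your deployment, and TOKEN is assumed to hold an API key with CHATS:WRITE.

```python
# Sketch: same request via an OpenAI-compatible Python client.
# The host below is a placeholder; the SDK appends /chat/completions to base_url.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://platform.example/api/v1",  # placeholder host
    api_key=os.environ["TOKEN"],                 # API key with CHATS:WRITE
)

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```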
Streaming
Set "stream": true for a Server-Sent Events stream:
```bash
curl -X POST /api/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{ "role": "user", "content": "Hello" }],
    "stream": true
  }'
```
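On the client side, the stream arrives as OpenAI-style chunks, each carrying an incremental content delta. A minimal consumer sketch, under the same placeholder host and token assumptions as above:

```python
# Sketch: consume the SSE stream chunk by chunk and print the content deltas.
import os
from openai import OpenAI

client = OpenAI(base_url="https://platform.example/api/v1", api_key=os.environ["TOKEN"])  # placeholder host

stream = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental piece of the assistant message.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```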
Tool calls (multi-turn)
Threadless completions support tool calling for multi-turn flows.
Step 1: Send a prompt with tool definitions.
```json
{
  "model": "gpt-4.1",
  "messages": [{ "role": "user", "content": "What is the weather in NYC?" }],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather",
      "parameters": { "type": "object", "properties": { "city": { "type": "string" } } }
    }
  }]
}
```
Step 2: The reply includes tool_calls. Run the tool on your side, then send the full history back:
```json
{
  "model": "gpt-4.1",
  "messages": [
    { "role": "user", "content": "What is the weather in NYC?" },
    { "role": "assistant", "content": null, "tool_calls": [{ "id": "call_abc", "type": "function", "function": { "name": "get_weather", "arguments": "{\"city\":\"NYC\"}" } }] },
    { "role": "tool", "tool_call_id": "call_abc", "content": "72°F, sunny" }
  ],
  "tools": [{ "type": "function", "function": { "name": "get_weather", "description": "Get current weather", "parameters": { "type": "object", "properties": { "city": { "type": "string" } } } } }]
}
```
The model then returns a final answer using the tool output.
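End to end, the client loop is: send the prompt with tool definitions, run whatever the model requests, append the assistant turn and a tool message to your locally held history, and call the endpoint again. Below is a sketch of that loop in Python, with the same placeholder host and token assumptions and a stand-in get_weather implementation:

```python
# Sketch of the multi-turn tool flow against the OpenAI-compatible endpoint.
import json
import os
from openai import OpenAI

client = OpenAI(base_url="https://platform.example/api/v1", api_key=os.environ["TOKEN"])  # placeholder host

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}]

def get_weather(city: str) -> str:
    return "72°F, sunny"  # stand-in for your real tool

# Step 1: prompt plus tool definitions.
messages = [{"role": "user", "content": "What is the weather in NYC?"}]
first = client.chat.completions.create(model="gpt-4.1", messages=messages, tools=tools)
assistant = first.choices[0].message

if assistant.tool_calls:
    # Step 2: keep the assistant turn (with its tool_calls) in the history,
    # run each requested tool locally, and append a tool message per call.
    messages.append({
        "role": "assistant",
        "content": assistant.content,
        "tool_calls": [
            {"id": tc.id, "type": "function",
             "function": {"name": tc.function.name, "arguments": tc.function.arguments}}
            for tc in assistant.tool_calls
        ],
    })
    for tc in assistant.tool_calls:
        args = json.loads(tc.function.arguments)
        messages.append({"role": "tool", "tool_call_id": tc.id, "content": get_weather(**args)})

    # Step 3: send the full history back for the final answer.
    final = client.chat.completions.create(model="gpt-4.1", messages=messages, tools=tools)
    print(final.choices[0].message.content)
```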
Request format
```
{
  "model": "string — model ID or AI agent ID",
  "messages": [
    { "role": "user", "content": "string or content parts array" },
    { "role": "assistant", "content": "string", "tool_calls": [...] },
    { "role": "tool", "tool_call_id": "string", "content": "string" }
  ],
  "tools": [{ "type": "function", "function": { "name": "...", "description": "...", "parameters": {} } }],
  "stream": false,
  "parallel_tool_calls": true
}
```
Authorization: an API key with the CHATS:WRITE permission.
When to use threadless completions
- Automation — Call the model without creating chat records.
- One-off questions — Answers without new rows in the chat list.
- Custom clients — You own history and storage on top of the stateless API.
- Tool workflows — Multi-turn tool use with client-held state.
Differences from regular chats
| Aspect | Threadless completions | Regular chats |
|---|---|---|
| Conversation persistence | No — stateless per request | Yes — persistent chat thread |
| Chat list visibility | Not shown in chat list | Visible in chat list |
| Chat features | Unavailable (pin, rename, search, export, and so on) | Full chat features |
| State management | Client manages conversation history | Server manages history |
| API compatibility | OpenAI-compatible format | Platform-specific format |
Limitations
- No persistent conversation history — Messages are not grouped under a chat. They cannot be listed or retrieved as a conversation.
- No chat-level features — Features tied to conversations (pinning, naming, summarization, compression, search, export) do not apply.
- Client-managed state — You must include the full conversation history in each request. The server does not retain state between calls.
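In practice, an ordinary multi-turn conversation therefore looks like this on the client: keep the history in your own code, append each assistant reply, and resend the whole list on the next call. A short sketch, with the same placeholder host and token assumptions as above:

```python
# Sketch of client-held conversation state across two turns.
import os
from openai import OpenAI

client = OpenAI(base_url="https://platform.example/api/v1", api_key=os.environ["TOKEN"])  # placeholder host

# The client owns the history; the server keeps nothing between calls.
history = [{"role": "user", "content": "Give me a two-sentence summary of HTTP caching."}]
first = client.chat.completions.create(model="gpt-4.1", messages=history)
history.append({"role": "assistant", "content": first.choices[0].message.content})

# The follow-up only works because the earlier turns are resent in full.
history.append({"role": "user", "content": "Now shorten that to one sentence."})
second = client.chat.completions.create(model="gpt-4.1", messages=history)
print(second.choices[0].message.content)
```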