
Threadless completions

Threadless completions use the OpenAI-compatible chat completions endpoint (POST /api/v1/chat/completions) without creating persistent chat threads. Each call is stateless: you send the full message history in the request and get a response, like the standard OpenAI API.

Use this for integrations, automation, and cases where you do not need server-side conversation storage.

How it works

  1. Send POST /api/v1/chat/completions with a messages array and a model (model ID or AI agent ID).
  2. The platform processes the request through the same LLM pipeline used for regular chats.
  3. The response is either a Server-Sent Events (SSE) stream or a full JSON body, depending on the stream flag.

No Chat record is created. Conversation state comes only from your request body on each call.
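
Because the endpoint is OpenAI-compatible, you can usually point an existing OpenAI client at it. The sketch below uses the official openai Python package; the base URL and API key are placeholders (this page does not specify a host), and it shows the statelessness directly: the second call resends the entire first exchange.

from openai import OpenAI

# Placeholder host and key; substitute your deployment's values.
client = OpenAI(base_url="https://YOUR_HOST/api/v1", api_key="YOUR_API_KEY")

history = [{"role": "user", "content": "Hello"}]
first = client.chat.completions.create(model="gpt-4.1", messages=history)
history.append({"role": "assistant", "content": first.choices[0].message.content})

# The server kept nothing, so the next turn must carry the full history.
history.append({"role": "user", "content": "Shorten that to one sentence."})
second = client.chat.completions.create(model="gpt-4.1", messages=history)
print(second.choices[0].message.content)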

Basic usage

Simple completion

curl -X POST "$BASE_URL/api/v1/chat/completions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'

Streaming

Set "stream": true for a Server-Sent Events stream:

curl -X POST "$BASE_URL/api/v1/chat/completions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{ "role": "user", "content": "Hello" }],
    "stream": true
  }'

Tool calls (multi-turn)

Threadless completions support tool calling for multi-turn flows.

Step 1: Send a prompt with tool definitions.

{
  "model": "gpt-4.1",
  "messages": [{ "role": "user", "content": "What is the weather in NYC?" }],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather",
      "parameters": { "type": "object", "properties": { "city": { "type": "string" } } }
    }
  }]
}

Step 2: The reply includes tool_calls. Run the tool on your side, then send the full history back:

{
  "model": "gpt-4.1",
  "messages": [
    { "role": "user", "content": "What is the weather in NYC?" },
    { "role": "assistant", "content": null, "tool_calls": [{ "id": "call_abc", "type": "function", "function": { "name": "get_weather", "arguments": "{\"city\":\"NYC\"}" } }] },
    { "role": "tool", "tool_call_id": "call_abc", "content": "72°F, sunny" }
  ],
  "tools": [{ "type": "function", "function": { "name": "get_weather", "description": "Get current weather", "parameters": { "type": "object", "properties": { "city": { "type": "string" } } } } }]
}

The model then returns a final answer using the tool output.
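
End to end, the loop looks roughly like the sketch below (Python, same placeholder client setup as in the first sketch). get_weather here is a stand-in for whatever tool you actually run.

import json

def get_weather(city: str) -> str:
    # Stand-in for your real tool implementation.
    return "72°F, sunny"

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}]

messages = [{"role": "user", "content": "What is the weather in NYC?"}]

# Step 1: the model responds with a tool call instead of a final answer.
first = client.chat.completions.create(model="gpt-4.1", messages=messages, tools=tools)
assistant_msg = first.choices[0].message
call = assistant_msg.tool_calls[0]

# Step 2: run the tool locally, then send the full history (including the
# assistant's tool call and the tool result) back for the final answer.
messages.append(assistant_msg)
args = json.loads(call.function.arguments)
messages.append({"role": "tool", "tool_call_id": call.id, "content": get_weather(**args)})

final = client.chat.completions.create(model="gpt-4.1", messages=messages, tools=tools)
print(final.choices[0].message.content)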

Request format

{
  "model": "string — model ID or AI agent ID",
  "messages": [
    { "role": "user", "content": "string or content parts array" },
    { "role": "assistant", "content": "string", "tool_calls": [...] },
    { "role": "tool", "tool_call_id": "string", "content": "string" }
  ],
  "tools": [{ "type": "function", "function": { "name": "...", "description": "...", "parameters": {} } }],
  "stream": false,
  "parallel_tool_calls": true
}

Authorization: requires an API key with the CHATS:WRITE permission.
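
Since the endpoint is OpenAI-compatible, a non-streaming response should follow the standard chat completion shape; roughly (field values below are illustrative, not taken from this platform):

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4.1",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Hello! How can I help?" },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 8, "completion_tokens": 9, "total_tokens": 17 }
}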

When to use threadless completions

  • Automation — Call the model without creating chat records.
  • One-off questions — Answers without new rows in the chat list.
  • Custom clients — You own history and storage on top of the stateless API.
  • Tool workflows — Multi-turn tool use with client-held state.

Differences from regular chats

Aspect | Threadless completions | Regular chats
Conversation persistence | No — stateless per request | Yes — persistent chat thread
Chat list visibility | Not shown in chat list | Visible in chat list
Chat features | Unavailable (pin, rename, search, export, and so on) | Full chat features
State management | Client manages conversation history | Server manages history
API compatibility | OpenAI-compatible format | Platform-specific format

Limitations

  • No persistent conversation history — Messages are not grouped under a chat. They cannot be listed or retrieved as a conversation.
  • No chat-level features — Features tied to conversations (pinning, naming, summarization, compression, search, export) do not apply.
  • Client-managed state — You must include the full conversation history in each request. The server does not retain state between calls.