Chat API
REST endpoints for synchronous and streaming chat.
The Chat API exposes two REST endpoints under the /chat router: one for synchronous replies and one for streaming (SSE).
POST /chat/run
Executes a single chat turn and returns the complete reply in a single response.
Authentication: JWT via Authorization: Bearer <token>.
Request body:
{
  "message": "string",
  "conversation_id": "string"
}

Behavior:
- Accepts the user message and conversation_id.
- Loads the conversation from the Redis cache (or creates a new one).
- Processes the message with the conversation/AI pipeline.
- Updates the conversation cache with both user and AI messages.
- Persists both messages to the database.
- Returns the full AI response.
Response:
{
  "response": "string",
  "conversation_id": "string"
}

Notes:
- Synchronous: the client waits for the complete reply.
- Conversation state is kept in Redis.
- Messages are stored in the database for history.
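Example: a minimal Python client sketch for this endpoint. The base URL, token placeholder, and use of the requests library are assumptions for illustration, not part of the API contract.

import requests

BASE_URL = "http://localhost:8000"  # assumed base URL; adjust for your deployment
TOKEN = "your-jwt-token"            # placeholder; obtain a real JWT from your auth flow

def chat_run(message: str, conversation_id: str) -> dict:
    """Send one synchronous chat turn via POST /chat/run."""
    resp = requests.post(
        f"{BASE_URL}/chat/run",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"message": message, "conversation_id": conversation_id},
        timeout=60,  # synchronous endpoint: the full reply is generated before returning
    )
    resp.raise_for_status()
    return resp.json()  # {"response": "...", "conversation_id": "..."}

reply = chat_run("Hello!", "conv-123")
print(reply["response"])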
POST /chat/stream
Executes a single chat turn and returns the AI reply as Server-Sent Events (SSE).
Authentication: JWT via Authorization: Bearer <token>.
Request body:
{
  "message": "string",
  "conversation_id": "string"
}

Behavior:
- Accepts the user message and conversation_id.
- Loads or creates the conversation in Redis.
- Streams the AI reply incrementally via SSE.
- On completion (is_final: true), updates the cache, saves messages to the database, and sends a final chunk (with optional metrics).
Response format (SSE):
data: {"content": "chunk of text", "is_final": false, "metrics": null}
data: {"content": "another chunk", "is_final": false, "metrics": null}
data: {"content": "last chunk", "is_final": true, "metrics": {...}}Response headers:
Content-Type: text/event-stream(or similar SSE type)Cache-Control: no-cacheConnection: keep-alive
Notes:
- Streaming improves perceived latency for long replies.
- The final event may include metrics (e.g. token count, timing).
- Conversation state and persistence behave the same as for /chat/run.
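Example: a sketch of how a client might consume this stream, parsing each data: line as JSON until is_final is true. The base URL, token placeholder, and use of the requests library are again assumptions, not mandated by the API.

import json
import requests

BASE_URL = "http://localhost:8000"  # assumed base URL; adjust for your deployment
TOKEN = "your-jwt-token"            # placeholder JWT

def chat_stream(message: str, conversation_id: str) -> None:
    """Consume POST /chat/stream, printing chunks as they arrive."""
    with requests.post(
        f"{BASE_URL}/chat/stream",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"message": message, "conversation_id": conversation_id},
        stream=True,  # read the response incrementally instead of buffering it
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines(decode_unicode=True):
            if not line or not line.startswith("data: "):
                continue  # skip blank SSE separator lines and anything that is not a data event
            event = json.loads(line[len("data: "):])
            print(event["content"], end="", flush=True)
            if event["is_final"]:
                # the final chunk may carry metrics (e.g. token count, timing)
                print("\nmetrics:", event["metrics"])
                break

chat_stream("Hello!", "conv-123")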