Electron — Chat Completions
Electron — Chat Completions
Electron — Chat Completions
Generate a chat completion with Electron. OpenAI-compatible
request/response shape — point any OpenAI SDK at
https://api.smallest.ai/waves/v1 and it just works.
Set stream: true to receive tokens via Server-Sent Events. With
stream_options: { include_usage: true }, the final SSE chunk
carries the usage block so token accounting is exact even on
client disconnects.
Tool calling follows OpenAI’s tools array convention. When you
provide a voice-agent-style system prompt, Electron emits a short
filler phrase in the assistant message content field alongside
tool_calls — see the Tool Calling guide
for the voice-agent pattern.
cURL
Python (pip install openai)
JavaScript / TypeScript (npm install openai)
Streaming with usage (Python)
/waves/v1, not /v1. The OpenAI SDK appends /chat/completions for you.stream_options.include_usage: true is required for exact token accounting on streaming calls — the final SSE chunk carries the usage block.n > 1 and prompt_logprobs are rejected. Use multiple requests if you need parallel completions.Authorization: Bearer $SMALLEST_API_KEY — get the key from the Smallest AI Console.Header authentication of the form Bearer <token>
Model ID. Currently only "electron".
Maximum output tokens. Combined input + output context ceiling is 32,768.
When true, response is text/event-stream. See the
Streaming guide.
Output shape. {type: "text"} (default) or {type: "json_object"}.
Best-effort determinism.
Opaque end-user identifier. Not interpreted by Electron.
Non-streaming: standard OpenAI chat.completion object.
Streaming (stream: true): text/event-stream SSE — each
event is a chat.completion.chunk delta, terminated by
data: [DONE].