Electron is Smallest AI’s in-house language model, optimized for voice agents and built as a drop-in replacement for the OpenAI Chat Completions API. It offers production-grade quality, sub-300 ms time-to-first-token, and a price point built for high-volume workloads.
Time-to-first-token tuned for real-time UX.
Combined input + output context.
First-class Indic support.
Drop-in replacement for /v1/chat/completions.
Same request/response shape as OpenAI Chat Completions. Use the official OpenAI SDKs by swapping base_url and api_key.
Standard Server-Sent Events. Optional final usage chunk for accurate billing on client disconnect.
Standard OpenAI tools API, with voice-agent-optimized filler-phrase behavior before tool calls.
Cached input tokens billed at $0.10 / 1M (75% off). Automatic — no flag needed.
Wide multilingual coverage, with particularly strong Indic-language performance.
response_format: {type: "json_object"} for structured output.
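As a rough illustration of the features above, the sketch below builds an OpenAI-style chat request and reads a Server-Sent Events stream using only the Python standard library. BASE_URL and MODEL are placeholders, not real values; the stream_options field and the shape of the final usage chunk follow the OpenAI convention and are assumed, not confirmed, to apply here.

```python
import json
import urllib.request

# Placeholders: substitute the base URL and model id from your own account.
BASE_URL = "https://example.invalid/v1"  # hypothetical, not the real endpoint
MODEL = "electron"                       # assumed model identifier


def build_chat_request(api_key, messages, stream=False, json_mode=False):
    """Build (but do not send) a POST to BASE_URL + /chat/completions."""
    payload = {"model": MODEL, "messages": messages, "stream": stream}
    if stream:
        # OpenAI convention for requesting a final usage chunk (assumed here).
        payload["stream_options"] = {"include_usage": True}
    if json_mode:
        payload["response_format"] = {"type": "json_object"}
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def read_sse(lines):
    """Collect streamed text and the optional usage object from SSE lines."""
    text_parts, usage = [], None
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]  # final chunk carries token counts
        for choice in chunk.get("choices") or []:
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                text_parts.append(piece)
    return "".join(text_parts), usage
```

With a real endpoint, the response returned by urllib.request.urlopen(req) can be iterated line by line and passed straight to read_sse. With the official OpenAI Python SDK, the equivalent swap is passing base_url and api_key to the OpenAI(...) client constructor.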
Electron is trained for voice-agent workloads — instruction following on system prompts, conversational style, and holding long multi-turn dialogues without drift. We benchmark it internally against frontier alternatives on these tasks. General-purpose academic benchmarks like MMLU and IFEval target a different objective and are not the right yardstick for a model whose job is to drive a phone call.
Prefix-cache pricing applies automatically — see Prefix Caching. Every response reports usage.prompt_tokens_details.cached_tokens so you can audit savings.
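To make the audit concrete, here is a minimal sketch that estimates input-side cost and savings from a usage object. The uncached rate of $0.40 / 1M is not quoted directly; it is derived from the stated $0.10 / 1M cached rate and the 75% discount. The sketch also assumes, following the OpenAI convention, that cached_tokens is a subset of prompt_tokens. Output-token pricing is not covered.

```python
CACHED_PER_M = 0.10                           # USD per 1M cached input tokens (stated)
UNCACHED_PER_M = CACHED_PER_M / (1 - 0.75)    # derived from "75% off": 0.40


def input_cost(usage):
    """Return (input_cost_usd, savings_usd) for one response's usage dict."""
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    # Assumption: cached tokens are counted inside prompt_tokens.
    uncached = usage["prompt_tokens"] - cached
    cost = (uncached * UNCACHED_PER_M + cached * CACHED_PER_M) / 1_000_000
    full_price = usage["prompt_tokens"] * UNCACHED_PER_M / 1_000_000
    return cost, full_price - cost
```

For example, a 10,000-token prompt with 8,000 cached tokens costs about $0.0016 on the input side instead of $0.0040, a saving of about $0.0024.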
Both limits are strictly enforced: exceeding either cap returns HTTP 429. See Concurrency and Limits.
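One common way for a client to handle a 429 is to retry with exponential backoff and jitter. The sketch below is a generic pattern, not a documented Electron requirement; send is a hypothetical zero-argument callable that performs the request and returns a (status, body) pair.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status, body).
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Delays grow as 0.5 s, 1 s, 2 s, ... plus up to 25% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 4))
    return status, body  # still 429 after the final attempt
```

The sleep parameter exists so the delay can be stubbed out in tests; in production the default time.sleep is used.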
Electron is multilingual, with strong out-of-the-box quality across the following 70 languages. It is particularly strong on Indic languages, including lower-resource ones.
English · Spanish · French · German · Italian · Portuguese · Dutch · Catalan
Hindi · Bengali · Tamil · Telugu · Marathi · Gujarati · Kannada · Malayalam · Punjabi · Odia · Urdu
Polish · Russian · Ukrainian · Belarusian · Czech · Slovak · Romanian · Hungarian · Bulgarian · Croatian · Serbian · Slovenian · Macedonian · Albanian
Estonian · Latvian · Lithuanian
Swedish · Norwegian · Danish · Finnish · Icelandic
Greek · Turkish
Arabic · Hebrew · Persian (Farsi) · Kurdish
Chinese (Simplified) · Chinese (Traditional) · Japanese · Korean · Mongolian
Vietnamese · Thai · Indonesian · Malay · Filipino · Burmese · Khmer · Lao
Nepali · Sinhala
Kazakh · Uzbek
Swahili · Amharic · Afrikaans · Yoruba · Hausa · Zulu
n > 1 not supported. Each request returns exactly one completion. Make multiple requests if you need multiple completions.
prompt_logprobs not supported; returns HTTP 400.
See Chat Completions for full request/response reference and Supported Parameters for the passthrough table.
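Since each request returns a single completion, one way to gather several is to issue the same request several times in parallel. This is a minimal sketch; complete_once is a hypothetical zero-argument callable that sends one request and returns its completion, and max_workers should stay below your account's concurrency cap.

```python
from concurrent.futures import ThreadPoolExecutor


def sample_completions(complete_once, n, max_workers=4):
    """Return n completions by calling complete_once() n times in parallel."""
    if n < 1:
        return []
    with ThreadPoolExecutor(max_workers=min(n, max_workers)) as pool:
        futures = [pool.submit(complete_once) for _ in range(n)]
        # result() re-raises any exception from the worker thread.
        return [f.result() for f in futures]
```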
Electron is intended for voice-agent and conversational workloads. Customers building user-facing applications should layer their own content moderation, prompt-injection defenses, and PII handling appropriate to their domain. Electron does not currently apply content moderation server-side — outputs reflect the model’s training and the prompts you provide.
For voice-agent applications handling regulated content (financial, healthcare), use the standard pattern: keep PII out of prompts where practical, apply post-processing redaction on outputs, and use Smallest AI’s Pulse PII redaction features on the transcription side.
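Post-processing redaction on outputs can start as simply as a pattern pass over the model's text. The sketch below is illustrative only: the two patterns (email addresses and runs of nine or more digits) and the placeholder labels are assumptions chosen for the example, and regulated workloads will generally need broader, domain-specific detection.

```python
import re

# Illustrative patterns only; real deployments need domain-specific rules.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
LONG_NUMBER = re.compile(r"\b\d(?:[ -]?\d){8,}\b")  # 9+ digits, e.g. phone or card


def redact(text):
    """Replace email addresses and long digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return LONG_NUMBER.sub("[NUMBER]", text)
```

Running redact over each completed assistant message before it is logged or stored keeps the raw values out of downstream systems.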