Migrate from OpenAI

Electron speaks the OpenAI Chat Completions wire format. To migrate an existing OpenAI integration, change three values:

  1. base_url → https://api.smallest.ai/waves/v1
  2. api_key → your Smallest AI key
  3. model → "electron"

That’s it. Streaming, tool calling, JSON mode, and multi-turn conversations work the same way because Electron implements the same protocol; the exceptions are listed under Differences to be aware of.

Side-by-side: Python

# Existing OpenAI integration (the starting point for the migration).
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Hello!"},
    ],
)

print(response.choices[0].message.content)
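For comparison, here is a minimal sketch of the same request pointed at Electron. It assumes the Smallest AI key lives in an environment variable named SMALLEST_API_KEY (the name used in the fallback example later on this page). The helper name electron_client_kwargs and the main wrapper are hypothetical and exist only to keep the sketch testable without the SDK installed.

```python
import os

ELECTRON_BASE_URL = "https://api.smallest.ai/waves/v1"
ELECTRON_MODEL = "electron"


def electron_client_kwargs(env=os.environ):
    """Build the constructor arguments for an OpenAI client that targets Electron."""
    return {
        "base_url": ELECTRON_BASE_URL,
        # Assumed variable name; use whichever variable holds your Smallest AI key.
        "api_key": env["SMALLEST_API_KEY"],
    }


def main():
    # Imported lazily so the helper above can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(**electron_client_kwargs())
    response = client.chat.completions.create(
        model=ELECTRON_MODEL,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)
```

Calling main() sends the request. Everything after the client is constructed is unchanged from the OpenAI version apart from the model ID.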

Side-by-side: JavaScript

// Existing OpenAI integration (the starting point for the migration).
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);

What works identically

  • Messages — same system, user, assistant, tool role conventions.
  • Streaming — same SSE format, same [DONE] marker, same delta chunks. With stream_options.include_usage: true, you get the final usage chunk just like OpenAI.
  • Tool calling — same tools array, same tool_calls in response, same multi-turn pattern. (Electron adds a voice-friendly filler in content alongside tool_calls — see Tool Calling.)
  • Error envelope — {"error": {"message", "type", "details", "request_id"}}. (Electron uses details: [{code, message, path}] for validation errors where OpenAI uses param; otherwise the envelope shape is the same.)
  • OpenAI SDKs — Python, JavaScript, Go, Java, Ruby, etc. All work without code changes beyond the base_url, api_key, and model values listed above.
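The one envelope difference noted above (details instead of param) mostly matters for logging and error display. Below is a rough sketch of a formatter that accepts either shape. The function name summarize_error is hypothetical, and the sketch assumes only the field names listed above; the exact type of path (string or list) is not stated on this page, so it is formatted as-is.

```python
import json


def summarize_error(body: str) -> str:
    """Flatten an OpenAI- or Electron-style error envelope into one log line."""
    error = json.loads(body).get("error", {})
    parts = [f"{error.get('type', 'unknown')}: {error.get('message', '')}"]
    # Electron lists validation problems under `details`.
    for detail in error.get("details") or []:
        parts.append(
            f"{detail.get('path')}: {detail.get('message')} ({detail.get('code')})"
        )
    # OpenAI reports the offending field under `param` instead.
    if error.get("param"):
        parts.append(f"param: {error['param']}")
    if error.get("request_id"):
        parts.append(f"request_id={error['request_id']}")
    return " | ".join(parts)
```

Because both branches are optional, the same handler can stay in place if you route some traffic back to OpenAI.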

Differences to be aware of

Feature | OpenAI | Electron
n > 1 | Supported | Rejected; make multiple requests if you need multiple completions
prompt_logprobs | Supported on some models | Rejected
content on tool-call turns | null | Often a short natural-language filler (see Tool Calling)
Model IDs | gpt-4o, gpt-4-turbo, etc. | electron (only public model today)
Pricing | Per model, varies | Flat $0.40 / $0.10 / $1.60 per 1M input / cached / output tokens
Prefix caching | Automatic on some models | Automatic; see usage.prompt_tokens_details.cached_tokens
Context window | Varies | 32,768 tokens combined input + output
Rate limits | Per usage tier | Standard: 10 RPM / 3 concurrent. Enterprise: 200 RPM / 20 concurrent. See Concurrency and Limits.

Common migration questions

Will my fine-tuned OpenAI model work?

No — Electron is a separate model. There is no migration path for OpenAI fine-tunes today. For most use cases, careful system-prompt engineering on Electron achieves comparable behavior; reach out to support if you want help replicating a specific fine-tune.

What about function-calling formats — does it work with my existing tool definitions?

Yes. Pass your existing tools array unchanged. Electron returns tool_calls in the same JSON shape as OpenAI. The only behavioral difference: Electron may also populate content with a short filler phrase on tool-call turns — see Tool Calling.
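Code that assumes content is null whenever tool_calls is present may need a small adjustment. The following sketch separates the filler text from the tool calls in one assistant message, passed as a plain dict in the OpenAI response shape. The function name split_tool_turn is hypothetical.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def split_tool_turn(
    message: Dict[str, Any],
) -> Tuple[Optional[str], List[Dict[str, Any]]]:
    """Return (filler_text, parsed_tool_calls) for one assistant message."""
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        calls.append({
            "id": call["id"],
            "name": function["name"],
            # Arguments arrive as a JSON-encoded string in the OpenAI shape.
            "arguments": json.loads(function["arguments"] or "{}"),
        })
    content = message.get("content")
    # On a tool-call turn, any content is filler rather than the final answer.
    filler = content if calls and content else None
    return filler, calls
```

In a voice pipeline the filler could be spoken while the tool runs; in a text-only application it can simply be ignored.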

Does response_format: {type: 'json_object'} work?

Yes. JSON mode is supported and follows OpenAI’s semantics — instruct the model to produce JSON in the system or user message, set response_format, and the output will be JSON.
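As a rough illustration, a JSON-mode request and its matching parser might look like the sketch below. The helper names and the system-prompt wording are assumptions; only response_format and the model ID come from this page.

```python
import json
from typing import Any, Dict


def json_mode_request(question: str) -> Dict[str, Any]:
    """Build Chat Completions keyword arguments that ask for a JSON object."""
    return {
        "model": "electron",
        "response_format": {"type": "json_object"},
        "messages": [
            # JSON mode expects the prompt itself to ask for JSON.
            {"role": "system", "content": "Reply with a single JSON object."},
            {"role": "user", "content": question},
        ],
    }


def parse_json_reply(content: str) -> Dict[str, Any]:
    """Decode the assistant content, failing loudly on anything but an object."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data
```

The request would be sent with client.chat.completions.create(**json_mode_request("...")) and the reply decoded with parse_json_reply(response.choices[0].message.content).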

How do I get the equivalent of OpenAI's tier-1/2/3 rate limits?

Electron uses per-plan limits documented in Concurrency and Limits. Contact sales for higher limits if your workload needs more than 200 RPM / 20 concurrent.
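To stay under a requests-per-minute limit on the client side, a simple sliding-window limiter can be placed in front of each call. This is a sketch, not an official client feature: the class name RpmLimiter is hypothetical, it covers only the RPM limit (not the concurrency limit), and it is not thread-safe.

```python
import time
from collections import deque


class RpmLimiter:
    """Block until a request fits inside a requests-per-minute budget."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.rpm = rpm
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # start times of requests in the last 60 seconds

    def _expire(self, now):
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()

    def acquire(self):
        """Wait if needed, record the request, and return its start time."""
        now = self.clock()
        self._expire(now)
        while len(self.sent) >= self.rpm:
            # Sleep until the oldest request leaves the window.
            self.sleep(60.0 - (now - self.sent[0]))
            now = self.clock()
            self._expire(now)
        self.sent.append(now)
        return now
```

On the Standard plan this would be constructed as RpmLimiter(10), with limiter.acquire() called before each request.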

Are there any prompts that work on OpenAI but don't work on Electron?

Electron is competitive with frontier models on most tasks, including in 70 languages with first-class Indic support. If you find a specific prompt that performs worse on Electron, please share it with support — we use real customer prompts to drive ongoing improvements.

Reverse compatibility

Some teams want to keep the option to fall back to OpenAI. Because Electron uses the same SDK, you can wire both clients with a simple factory and select per-request:

import os
from openai import OpenAI

# One client per provider; both use the same SDK.
CLIENTS = {
    "electron": OpenAI(
        base_url="https://api.smallest.ai/waves/v1",
        api_key=os.environ["SMALLEST_API_KEY"],
    ),
    "openai": OpenAI(
        api_key=os.environ["OPENAI_API_KEY"],
    ),
}

def chat(provider: str, **kwargs):
    # Route the request to the selected provider's client.
    return CLIENTS[provider].chat.completions.create(**kwargs)

# Usage:
chat("electron", model="electron", messages=[...])
chat("openai", model="gpt-4o", messages=[...])
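Per-request selection can be extended to automatic fallback: try Electron first and retry on OpenAI if the call raises. The sketch below is one possible shape, with chat_with_fallback as a hypothetical name. It catches every exception for brevity; real code would likely narrow that to the SDK's error types.

```python
from typing import Any, Callable, Sequence, Tuple

# Each attempt pairs a create function with the model ID to send to it.
Attempt = Tuple[Callable[..., Any], str]


def chat_with_fallback(attempts: Sequence[Attempt], **kwargs: Any) -> Any:
    """Try each provider in order and return the first successful response."""
    last_error = None
    for create, model in attempts:
        try:
            return create(model=model, **kwargs)
        except Exception as exc:  # narrow this to SDK error types in real code
            last_error = exc
    if last_error is None:
        raise ValueError("no providers configured")
    # Every provider failed; surface the most recent error.
    raise last_error
```

With the CLIENTS dictionary above, the attempts list would be [(CLIENTS["electron"].chat.completions.create, "electron"), (CLIENTS["openai"].chat.completions.create, "gpt-4o")]. Keep in mind that requests relying on OpenAI-only features such as n > 1 behave differently across the two providers.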