Migrate from OpenAI | Smallest AI Docs

Electron speaks the OpenAI Chat Completions wire format. To migrate an existing OpenAI integration, change exactly three settings:

base_url → https://api.smallest.ai/waves/v1
api_key → your Smallest AI key
model → "electron"

That’s it. Streaming, tool calling, JSON mode, multi-turn — all work identically because Electron implements the same protocol.

Side-by-side: Python

import os
from openai import OpenAI

# OpenAI version (before migration): no base_url override.
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Hello!"},
    ],
)

print(response.choices[0].message.content)
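
For comparison, the Electron side of the same request might look like the sketch below. It applies the substitutions listed at the top of this page. The `SMALLEST_API_KEY` environment variable name follows the fallback example later on this page, and the helper names `electron_client_kwargs` and `hello_electron` are illustrative, not part of any SDK. The `openai` import sits inside the function only so that the settings helper can be checked without the package installed.

```python
import os

ELECTRON_BASE_URL = "https://api.smallest.ai/waves/v1"
ELECTRON_MODEL = "electron"


def electron_client_kwargs(env=None):
    """Return the constructor arguments that point the OpenAI SDK at Electron."""
    env = os.environ if env is None else env
    return {
        "base_url": ELECTRON_BASE_URL,       # change 1: base URL
        "api_key": env["SMALLEST_API_KEY"],  # change 2: Smallest AI key
    }


def hello_electron():
    """Send the same 'Hello!' request as the OpenAI example, but to Electron."""
    from openai import OpenAI  # same SDK as before, imported lazily

    client = OpenAI(**electron_client_kwargs())
    response = client.chat.completions.create(
        model=ELECTRON_MODEL,                # change 3: model ID
        messages=[{"role": "user", "content": "Hello!"}],
    )
    return response.choices[0].message.content
```

With `SMALLEST_API_KEY` set, `print(hello_electron())` should print the model's reply.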

Side-by-side: JavaScript

import OpenAI from "openai";

// OpenAI version. For Electron, add baseURL: "https://api.smallest.ai/waves/v1",
// pass your Smallest AI key as apiKey, and set model to "electron".
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);

What works identically

Messages — same system, user, assistant, tool role conventions.
Streaming — same SSE format, same [DONE] marker, same delta chunks. With stream_options.include_usage: true, you get the final usage chunk just like OpenAI.
Tool calling — same tools array, same tool_calls in response, same multi-turn pattern. (Electron adds a voice-friendly filler in content alongside tool_calls — see Tool Calling.)
Error envelope — {"error": {"message", "type", "details", "request_id"}}. (Electron uses details: [{code, message, path}] for validation errors where OpenAI uses param; otherwise the envelope shape is the same.)
OpenAI SDKs (Python, JavaScript, Go, Java, Ruby, etc.) all work without code changes beyond the three settings above.
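
As a sketch of the streaming behavior listed above, the helper below folds a chunk stream into the full text plus the final usage object. It assumes the chunk shape that the OpenAI Python SDK exposes (`chunk.choices[0].delta.content`, and a last chunk with an empty `choices` list and a populated `usage` when `stream_options={"include_usage": True}` is set). `collect_stream` and `stream_electron` are hypothetical names.

```python
def collect_stream(chunks):
    """Accumulate streamed delta text and return (text, usage).

    `usage` stays None unless the stream was requested with
    stream_options={"include_usage": True}.
    """
    parts = []
    usage = None
    for chunk in chunks:
        # The usage-only chunk arrives last and carries an empty choices list.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
        for choice in chunk.choices:
            text = getattr(choice.delta, "content", None)
            if text:
                parts.append(text)
    return "".join(parts), usage


def stream_electron(client, messages):
    """Open a streaming request on an OpenAI-SDK client pointed at Electron."""
    stream = client.chat.completions.create(
        model="electron",
        messages=messages,
        stream=True,
        stream_options={"include_usage": True},
    )
    return collect_stream(stream)
```

Because `collect_stream` only iterates over chunks, the same helper should work unchanged against an OpenAI client.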

Differences to be aware of

| Feature | OpenAI | Electron |
| --- | --- | --- |
| `n > 1` | Supported | Rejected; make multiple requests if you need multiple completions |
| `prompt_logprobs` | Supported on some models | Rejected |
| `content` on tool-call turns | `null` | Often a short natural-language filler (see Tool Calling) |
| Model IDs | `gpt-4o`, `gpt-4-turbo`, etc. | `electron` (only public model today) |
| Pricing | Per model, varies | Single Electron rate; contact your Smallest AI account manager |
| Prefix caching | Automatic on some models | Automatic; see `usage.prompt_tokens_details.cached_tokens` |
| Context window | Varies | 32,768 tokens combined input + output |
| Rate limits | Vary by usage tier | Standard: 10 RPM / 3 concurrent. Enterprise: 200 RPM / 20 concurrent. See Concurrency and Limits. |

Common migration questions

Will my fine-tuned OpenAI model work?

No — Electron is a separate model. There is no migration path for OpenAI fine-tunes today. For most use cases, careful system-prompt engineering on Electron achieves comparable behavior; reach out to support if you want help replicating a specific fine-tune.

What about function-calling formats — does it work with my existing tool definitions?

Yes. Pass your existing tools array unchanged. Electron returns tool_calls in the same JSON shape as OpenAI. The only behavioral difference: Electron may also populate content with a short filler phrase on tool-call turns — see Tool Calling.
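
Code written against OpenAI sometimes treats a non-empty `content` as the final answer, which would misread Electron's filler. The following is a minimal sketch of a check that keys off `tool_calls` instead, assuming the message object shape of the OpenAI Python SDK. `split_tool_turn` is a hypothetical helper.

```python
import json


def split_tool_turn(message):
    """Return (text, calls) for an assistant message.

    `calls` is a list of (call_id, function_name, parsed_arguments). When it is
    non-empty, `text` is filler to speak or discard, not the final answer.
    """
    calls = []
    for call in (message.tool_calls or []):
        # Arguments arrive as a JSON-encoded string, as with OpenAI.
        args = json.loads(call.function.arguments or "{}")
        calls.append((call.id, call.function.name, args))
    return message.content, calls
```

Branching on whether `calls` is empty, rather than on whether `content` is empty, keeps the same loop working against both providers.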

Does response_format: {type: 'json_object'} work?

Yes. JSON mode is supported and follows OpenAI’s semantics — instruct the model to produce JSON in the system or user message, set response_format, and the output will be JSON.
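
The steps above can be sketched as a pair of helpers: one builds the request with both the JSON instruction and `response_format`, the other decodes the reply. The helper names and the wording of the system prompt are assumptions for illustration.

```python
import json


def json_mode_request(user_prompt):
    """Build request kwargs for JSON mode: a JSON instruction plus response_format."""
    return {
        "model": "electron",
        "messages": [
            {"role": "system", "content": "Reply with a single JSON object."},
            {"role": "user", "content": user_prompt},
        ],
        "response_format": {"type": "json_object"},
    }


def parse_json_reply(response):
    """Decode the assistant's content; raises json.JSONDecodeError if it is not valid JSON."""
    return json.loads(response.choices[0].message.content)
```

Usage would be along the lines of `parse_json_reply(client.chat.completions.create(**json_mode_request("List two colors.")))`. A reply cut off by the token limit may not parse, so it is worth handling `json.JSONDecodeError` around the call.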

How do I get the equivalent of OpenAI's tier-1/2/3 rate limits?

Electron uses per-plan limits documented in Concurrency and Limits. Contact sales for higher limits if your workload needs more than 200 RPM / 20 concurrent.
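
To stay under a plan's concurrency cap on the client side, one option is to gate calls behind a semaphore. This is a minimal sketch for threaded code, not an official client feature. `ConcurrencyLimiter` is a hypothetical name, and it limits only in-flight requests, not requests per minute.

```python
import threading


class ConcurrencyLimiter:
    """Cap the number of in-flight calls, e.g. 3 on Standard or 20 on Enterprise."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args, **kwargs):
        # Blocks until a slot is free, then releases it even if fn raises.
        with self._slots:
            return fn(*args, **kwargs)
```

For example, create `limiter = ConcurrencyLimiter(3)` once and route every request through `limiter.call(client.chat.completions.create, model="electron", messages=messages)`.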

Are there any prompts that work on OpenAI but don't work on Electron?

Electron is competitive with frontier models on most tasks, including in 70 languages with first-class Indic support. If you find a specific prompt that performs worse on Electron, please share it with support — we use real customer prompts to drive ongoing improvements.

Reverse compatibility

Some teams want to keep the option to fall back to OpenAI. Because Electron uses the same SDK, you can wire both clients with a simple factory and select per-request:

import os
from openai import OpenAI

# One client per provider; both use the same SDK.
CLIENTS = {
    "electron": OpenAI(
        base_url="https://api.smallest.ai/waves/v1",
        api_key=os.environ["SMALLEST_API_KEY"],
    ),
    "openai": OpenAI(
        api_key=os.environ["OPENAI_API_KEY"],
    ),
}

def chat(provider: str, **kwargs):
    """Send a Chat Completions request through the chosen provider's client."""
    return CLIENTS[provider].chat.completions.create(**kwargs)

# Usage: the model ID must match the provider.
messages = [{"role": "user", "content": "Hello!"}]
chat("electron", model="electron", messages=messages)
chat("openai", model="gpt-4o", messages=messages)
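
Building on that factory, an automatic fallback might look like the sketch below. The provider order, the per-provider model mapping, and the `chat_with_fallback` name are assumptions for illustration. `callers` maps a provider name to any callable with the signature of `client.chat.completions.create`.

```python
# Model ID to send to each provider (assumed defaults; adjust to your setup).
PROVIDER_MODELS = {"electron": "electron", "openai": "gpt-4o"}


def chat_with_fallback(callers, order=("electron", "openai"), **kwargs):
    """Try each provider in order; return (provider, response) from the first success.

    This sketch catches every Exception for brevity; with the OpenAI SDK you
    would likely narrow it to openai.APIError.
    """
    kwargs.pop("model", None)  # the model ID is chosen per provider below
    last_error = None
    for provider in order:
        try:
            response = callers[provider](model=PROVIDER_MODELS[provider], **kwargs)
            return provider, response
        except Exception as exc:
            last_error = exc  # remember the failure and try the next provider
    raise RuntimeError("all providers failed") from last_error
```

With the `CLIENTS` dictionary above, `callers` could be built as `{name: c.chat.completions.create for name, c in CLIENTS.items()}`. Returning the provider name alongside the response lets the caller account for provider-specific behavior, such as Electron's filler `content` on tool-call turns.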