Voice Agent (Electron + Pulse + Lightning)
Voice Agent (Electron + Pulse + Lightning)
Voice Agent (Electron + Pulse + Lightning)
This cookbook wires together all three Smallest AI products to build a working voice agent:
The same pattern underlies most production voice agents — customer support, sales calls, voice-driven UIs. Each piece is independently optimizable; this guide shows the minimum viable wiring.
If you want a full voice-agent platform with built-in telephony, campaigns, knowledge base, and call analytics, see Atoms — it’s built on top of this exact stack. Use this cookbook when you want to build the pipeline yourself.
Streaming STT. Audio chunks in → partial + final transcripts out. Supports 38 languages with auto-detection.
Chat completions + tool calling. Generates a filler phrase before tool calls so the user hears natural speech while tools run.
Streaming TTS. 44.1 kHz audio, ~200 ms TTFB, 12 TTS languages including Indic. See the model card for the full latency profile.
user message in your ongoing conversation. Stream the response.tool_calls: the filler in content is spoken via Lightning while you run the tool in parallel. When the tool returns, append the tool result and continue the conversation.This is a sketch — production code needs proper async coordination, jitter handling, and error recovery, but the wiring shape is real.
This is the part to internalize: Electron’s filler phrase + parallel tool execution.
When the user asks “What’s my account balance?”, this is what happens in milliseconds:
Without the filler-phrase pattern, the user would hear silence from 0 ms to ~1100 ms. With it, they hear natural speech start at ~600 ms — feels conversational instead of robotic.
stream: true for LLM, Lightning streaming for TTS. Any non-streaming hop adds hundreds of milliseconds.stream_options.include_usage: true on Electron so you bill accurately on disconnects.X-Request-Id from every Electron response for support traceability.language, Electron prompts, and Lightning voice language. A Hindi caller should be transcribed in Hindi, prompted in Hindi, and synthesized with a Hindi voice. Don’t translate in the middle of the pipeline — Electron and Lightning both handle Indic natively.