Tool / Function Calling
Tool / Function Calling
Tool / Function Calling
Electron implements the standard OpenAI function-calling API. Define tools in the request, and the model returns tool_calls in the response when it decides to invoke one.
What makes Electron’s tool calling distinctive: with a voice-agent-style system prompt, the model speaks a short filler phrase before invoking a tool (e.g. “Let me check that for you…”), so a downstream TTS layer can mask tool-call latency by speaking the filler while the tool runs.
The filler is prompt-driven, not magic. With no system prompt, Electron returns the standard OpenAI shape (content: null when tool_calls is present). To get the filler reliably, give the model a system prompt that asks for it — see the voice-agent example below.
When Electron decides to call a tool, the assistant message returns both a conversational content filler and the structured tool_calls:
In strict OpenAI shape, content is null when tool_calls is set. With a voice-agent-style system prompt, Electron instead emits a short natural-language sentence — designed so a downstream voice agent can speak it while the tool resolves in the background.
The filler-phrase pattern is prompt-driven: when your system prompt asks for it, Electron emits the filler reliably. Without that hint, you’ll get the standard content: null shape. Always handle content defensively — either a string or null.
finish_reason will be "tool_calls" on this turn.
For voice agents, tool calls add visible latency — the user hears silence while you call a database, hit a webhook, etc. The standard mitigation is to play a “thinking” sound or filler phrase yourself. Electron handles this for you naturally.
Recommended pipeline:
delta.content tokens arrive, send them to your TTS engine in parallel — don’t wait for the tool call to complete.delta.tool_calls, kick off the actual tool execution in parallel with TTS.tool role message and continue the conversation.The user hears “Let me check the weather in Mumbai for you…” spoken naturally while your weather API call resolves in the background. End-to-end perceived latency drops by hundreds of milliseconds.
After the model returns tool_calls, run the tools and add tool role messages to the conversation, then make a follow-up call:
Electron can chain multiple tool calls within a conversation. The pattern interleaves:
Each filler is short and natural (“Let me also check…”, “One moment…”). Handle each tool_calls turn the same way: execute, append tool message, call again.
When stream: true and tools are involved, Electron streams in this order:
delta.content chunks arrive as the filler is generated.delta.tool_calls chunks follow, building up the function name and arguments incrementally.For chained calls, the pattern repeats: filler → tool_call → (your tool runs) → next filler → next tool_call → … → final response.
Tool-call deltas use the standard OpenAI streaming shape:
Concatenate the arguments deltas to reconstruct the final argument JSON.
tool_choice: "required" if you want to force the model to call exactly one tool.tool_choice: {"type":"function","function":{"name":"…"}} to force a specific tool.tools: [] — omit the field entirely if no tools are available this turn.