Prompting voice agents
A working integration and a good-sounding voice agent are different problems. The integration is what the rest of these docs cover. This page is about the instructions string you send in session.configure.
The minimum useful prompt
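As a minimal sketch, a prompt with the properties this section describes can be as short as two sentences. The wording is an assumption, and so is the message envelope (the type and session keys); only session.configure and the instructions field come from this page.

```python
import json

# Assumed wording: voice-first framing, one style cue, a length limit.
INSTRUCTIONS = (
    "You are a warm voice assistant. "
    "Reply in one or two short sentences."
)


def build_configure_message(instructions: str) -> str:
    """Serialize a session.configure message (hypothetical envelope)."""
    # The "type" and "session" keys are illustrative, not a documented schema.
    return json.dumps({
        "type": "session.configure",
        "session": {"instructions": instructions},
    })


message = build_configure_message(INSTRUCTIONS)
```

Send the resulting string over whatever transport your integration already uses.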
A good minimal prompt gets three things right:
- Voice-first framing — “voice assistant”, not “AI”, not “chatbot”. Sets the persona toward spoken language.
- Length discipline — “one or two short sentences”. Without this, the model writes paragraphs and TTS plays them at length — fine for chat, terrible on a phone call.
- Warm tone — single-word style cue. The model carries it through prosody.
Don’t micromanage prosody in prose. Hydra adapts tone from context. Telling the model to “speak slowly and carefully and pause between thoughts” mostly produces responses that contain the words “slowly and carefully and pause”; it does little to change how they sound. Shape the content and length; let Hydra handle delivery.
Anti-patterns
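One anti-pattern follows directly from the prosody advice: delivery directives written as prose. The sketch contrasts a micromanaged prompt with a content-shaped one and adds a small check you could run over your own instructions. Both prompts and the word list are illustrative assumptions, not an exhaustive rule set.

```python
# Anti-pattern: prosody directives written as prose (hypothetical wording).
MICROMANAGED = (
    "You are a voice assistant. Speak slowly and carefully and pause "
    "between thoughts. Lower your pitch at the end of each sentence."
)

# Preferred: shape content and length, give one style cue, leave delivery alone.
CONTENT_SHAPED = (
    "You are a calm voice assistant. Reply in one or two short sentences. "
    "Give one piece of information at a time."
)

# Illustrative, non-exhaustive list of delivery words.
PROSODY_WORDS = ("slowly", "pause", "pitch", "intonation", "emphasize")


def prosody_directives(instructions: str) -> list[str]:
    """Return the delivery words that appear in an instructions string."""
    lowered = instructions.lower()
    return [word for word in PROSODY_WORDS if word in lowered]


flagged = prosody_directives(MICROMANAGED)
clean = prosody_directives(CONTENT_SHAPED)
```

A substring check like this is crude, but it is enough to catch the pattern during prompt review.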
Turn-taking discipline
Hydra handles turn detection automatically, but the prompt still shapes how the model behaves around interruption.
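A sketch of what interruption guidance in the prompt might look like. The sentences are assumed wording to tune against real calls, not text that Hydra requires.

```python
BASE = "You are a warm voice assistant. Reply in one or two short sentences."

# Assumed wording for interruption behaviour.
TURN_TAKING = (
    "If the caller interrupts, stop and respond to what they just said. "
    "Do not restart or repeat your previous answer unless asked. "
    "If you could not make out what the caller said, ask them to repeat it "
    "instead of guessing."
)

# Keep the base persona first so the turn-taking rules read as refinements.
instructions = f"{BASE} {TURN_TAKING}"
```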
Spelling out interruption behaviour in the prompt is more effective than relying on the model’s defaults, especially in noisy environments.
Tool use prompting
When you declare tools, also tell the model when to use them.
Without explicit instruction, the model sometimes answers from priors instead of calling the tool. Be direct.
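A minimal sketch of direct tool guidance. The tool name, its schema, and the instruction wording are hypothetical; the helper only checks that every declared tool is named somewhere in the prompt.

```python
# Hypothetical tool declaration; the schema shape is an assumption.
TOOLS = [
    {
        "name": "get_order_status",
        "description": "Look up the current status of an order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    }
]

# Direct wording: say when to call the tool and what not to do instead.
INSTRUCTIONS = (
    "You are a warm voice assistant. Reply in one or two short sentences. "
    "When the caller asks about an order, always call get_order_status "
    "before answering. Never state an order status from memory. "
    "If you do not have the order ID, ask for it first."
)


def unmentioned_tools(tools: list[dict], instructions: str) -> list[str]:
    """Return names of declared tools that the instructions never mention."""
    return [tool["name"] for tool in tools if tool["name"] not in instructions]


missing = unmentioned_tools(TOOLS, INSTRUCTIONS)
```

An empty result means each tool has at least one sentence telling the model when to use it.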
Greetings and generate_initial_response
Pair generate_initial_response: true with an explicit opening-line instruction:
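A sketch of the pairing, assuming generate_initial_response sits next to instructions inside the session.configure message. The envelope keys, the business name, and the greeting are placeholders.

```python
import json

config = {
    "type": "session.configure",  # assumed envelope
    "session": {
        # Ask the model to speak first (field placement is an assumption).
        "generate_initial_response": True,
        "instructions": (
            "You are a warm voice assistant for Acme Dental. "
            "Reply in one or two short sentences. "
            'Open the call by saying exactly: "Thanks for calling Acme Dental. '
            'How can I help?"'
        ),
    },
}

payload = json.dumps(config)
```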
Without a specific instruction, the model picks a generic opener. With it, you get the line you want.
Length and pacing
Voice users tolerate roughly one breath of latency between asking and hearing an answer. The model can’t make itself talk faster, but you can make it say less.
For long-form content (legal disclaimers, addresses, phone numbers), tell the model explicitly how to break it into short chunks.
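A sketch of chunking rules for long-form content, plus a helper that pre-groups a phone number before it reaches the model (for example in a tool result). The rule wording is assumed, and so is the idea that comma-separated groups are spoken with short breaks.

```python
# Assumed wording for chunking rules.
LONG_FORM_RULES = (
    "When you read a phone number, give it in groups of three or four digits. "
    "When you read an address, give the street first, then the city, then the "
    "postal code, and ask the caller to confirm. "
    "When you read a disclaimer, say it one sentence at a time."
)


def group_digits(number: str, sizes: tuple[int, ...] = (3, 3, 4)) -> str:
    """Split the digits of a number into comma-separated spoken groups."""
    digits = [ch for ch in number if ch.isdigit()]
    groups, start = [], 0
    for size in sizes:
        groups.append(" ".join(digits[start:start + size]))
        start += size
    # Keep any digits left over after the configured group sizes.
    if start < len(digits):
        groups.append(" ".join(digits[start:]))
    return ", ".join(group for group in groups if group)


spoken = group_digits("415-555-0132")  # "4 1 5, 5 5 5, 0 1 3 2"
```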
Worked example: phone-banking concierge
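As a rough sketch, the pieces on this page could combine into one configuration for a phone-banking concierge. The bank name, tool names, prompt wording, and message envelope are all hypothetical; treat it as a starting point to adapt, not a reference configuration.

```python
import json

# Hypothetical tools; names and descriptions are assumptions.
TOOLS = [
    {"name": "get_balance",
     "description": "Return the balance of one of the caller's accounts."},
    {"name": "list_recent_transactions",
     "description": "Return the caller's most recent transactions."},
]

INSTRUCTIONS = " ".join([
    # Voice-first framing, one style cue, length discipline.
    "You are a warm voice assistant for Northwind Bank's phone line.",
    "Reply in one or two short sentences.",
    # Opening line, used together with generate_initial_response.
    'Open the call by saying: "Thanks for calling Northwind Bank. How can I help?"',
    # Turn-taking.
    "If the caller interrupts, stop and respond to what they just said.",
    # Tool use.
    "For any question about a balance, always call get_balance before answering.",
    "For any question about past payments, always call list_recent_transactions.",
    "Never state an amount that did not come from a tool result.",
    # Long-form content.
    "Read at most three transactions at a time, then ask if the caller wants more.",
    "Read account numbers in groups of four digits.",
])

config = {
    "type": "session.configure",  # assumed envelope
    "session": {
        "instructions": INSTRUCTIONS,
        "tools": TOOLS,
        "generate_initial_response": True,
    },
}

payload = json.dumps(config, indent=2)
```

Each comment marks the section of this page the following lines come from, which makes it easier to trace a behaviour on a call back to the sentence responsible for it.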
Next
- Tool calling — the mechanics of declaring and executing tools
- Turn detection & barge-in — what the model handles vs what your prompt should influence

