Managing sessions
A Hydra session is the stateful interaction between the model and one connected client. One WebSocket = one session.
Lifecycle
The handshake is one-shot. After session.created, the server waits for exactly one session.configure before accepting audio. Subsequent session.configure frames are ignored — use session.update for mid-session changes.
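One way to respect the one-shot handshake on the client is a small guard that sends session.configure only once and routes later changes to session.update. The sketch below is illustrative, not part of any Hydra SDK: the HandshakeGuard class is hypothetical, and it assumes frames are JSON objects with a top-level "type" key and the settings alongside it.

```python
import json


class HandshakeGuard:
    """Tracks the one-shot handshake so session.configure is sent at most once."""

    def __init__(self):
        self.created = False      # True once session.created has been seen
        self.configured = False   # True once session.configure has been built

    def on_server_event(self, event: dict) -> None:
        # Assumption: every server frame names its event in a "type" key.
        if event.get("type") == "session.created":
            self.created = True

    def frame_for(self, fields: dict) -> str:
        """Return the JSON frame to send for the given settings."""
        if not self.created:
            raise RuntimeError("wait for session.created before configuring")
        if not self.configured:
            self.configured = True
            return json.dumps({"type": "session.configure", **fields})
        # A second session.configure would be ignored, so use session.update.
        return json.dumps({"type": "session.update", **fields})
```

Calling frame_for twice on the same guard yields a session.configure frame first and a session.update frame afterwards, so a repeated configure never reaches the wire.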
session.configure
Send this once, immediately after session.created. Every field is optional.
session.configure silently accepts unknown fields — a typo like instuctions is ignored, not rejected, and the default persona ships instead. Validate keys client-side. session.update is stricter and returns an invalid_frame error on unknown fields.
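Because the server will not flag a misspelled key, a client-side check before sending can catch it. The following is a minimal sketch: the allow-list contains only field names mentioned on this page and is an assumption, so replace it with the full schema for your deployment. The helper name check_configure_fields is hypothetical.

```python
import difflib

# Assumed allow-list: only the field names that appear on this page.
# Extend it from the real session.configure schema.
KNOWN_CONFIGURE_FIELDS = ["generate_initial_response", "instructions", "tools"]


def check_configure_fields(fields: dict) -> list:
    """Return one message per key that is not in the allow-list."""
    problems = []
    for key in fields:
        if key in KNOWN_CONFIGURE_FIELDS:
            continue
        # Suggest the closest known name, if any is reasonably similar.
        guess = difflib.get_close_matches(key, KNOWN_CONFIGURE_FIELDS, n=1)
        hint = f" (did you mean {guess[0]!r}?)" if guess else ""
        problems.append(f"unknown field {key!r}{hint}")
    return problems
```

Refusing to send the frame when the returned list is non-empty turns the silent fallback to the default persona into a visible client-side error.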
session.configured (server echo)
Mid-session updates
Use session.update to live-patch the session without reconnecting. Only the tools field is honoured today. Persona, voice, and audio formats are frozen at handshake; changes to those require a fresh connection.
The server replies with session.updated containing only the fields it actually applied. A no-op patch produces no echo.
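Since the echo lists only applied fields, comparing it with the request shows what was dropped. The sketch below assumes the patched fields sit at the top level of both the session.update frame and the session.updated echo, next to a "type" key. That layout and both helper names are assumptions for illustration.

```python
import json


def build_tools_update(tools: list) -> str:
    """Build a session.update frame that patches only the tools field."""
    return json.dumps({"type": "session.update", "tools": tools})


def unapplied_fields(requested: dict, updated_event: dict) -> set:
    """Fields that were requested but are missing from the session.updated echo."""
    applied = set(updated_event) - {"type"}
    return set(requested) - applied
```

Because a no-op patch produces no echo, a client that waits for session.updated should do so with a timeout rather than block indefinitely.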
Bot speaks first
Setting generate_initial_response: true on session.configure makes Hydra deliver an opening line before any user audio arrives. Useful for greetings and concierge openers.
Immediately after session.configured, the standard response.created → audio deltas → response.done sequence fires, with no preceding input_audio_buffer.speech_started.
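A client can tell a bot-initiated opener from a normal reply by checking whether any speech event preceded the first response. This is a rough sketch that assumes each received frame is a dict with a "type" key. The function name opened_by_bot is hypothetical.

```python
def opened_by_bot(events: list) -> bool:
    """True if the first response.created has no speech_started before it."""
    for event in events:
        kind = event.get("type")
        if kind == "input_audio_buffer.speech_started":
            return False   # the user spoke first
        if kind == "response.created":
            return True    # a response began with no user speech before it
    return False           # no response seen yet
```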
Conversation items
Most events carry a ConversationItem. The shape is intentionally flat: every field is optional, and which fields are present depends on type.
Discarded user turns — speech that VAD started but the turn detector later rejected — arrive as conversation.item.done with status: "incomplete". Silence and sub-VAD noise produce no events at all.
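Clients that keep a transcript may want to skip discarded turns. The sketch below assumes the ConversationItem is nested under an "item" key and falls back to the event itself if that key is absent. The nesting and the helper name is_discarded_turn are assumptions, so adjust them to the actual event shape.

```python
def is_discarded_turn(event: dict) -> bool:
    """True for a conversation.item.done event whose status is "incomplete"."""
    if event.get("type") != "conversation.item.done":
        return False
    # Assumption: the item may be nested under "item"; otherwise read the
    # status from the event itself.
    item = event.get("item") or event
    return item.get("status") == "incomplete"
```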
response.done
Every response ends with response.done:
Next
- Audio I/O: what to put in input_audio_buffer.append and how to play response.output_audio.delta
- Turn detection & barge-in: how speech events fire and how to handle interruption on the client
- Tool calling: declare and execute functions during a session

