Errors & reconnection
Errors arrive as JSON events with type: "error". The connection stays usable unless a close frame follows immediately.
Error frame
error.event_id (when present) is the event_id of the client frame that triggered the error — correlate it with what you sent.
Two fields to switch on: error.type and error.code
An error frame has both a type (broad category) and a code (specific reason). Match on error.code for actionable handling.
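As a rough illustration of matching on error.code, the sketch below parses an incoming frame and picks an action. The frame shape (a top-level type plus an error object with type, code, and event_id) follows the field names used on this page. The function name classify_error, the ACTIONS table, and the action labels are hypothetical. The two codes shown are only the ones mentioned on this page, not the full reference.

```python
import json
from typing import Optional

# Hypothetical policy table. The code names appear on this page; the
# actions mapped to them are assumptions made for illustration only.
ACTIONS = {
    "server_full": "backoff_and_reconnect",
    "tool_response_timeout": "report_to_caller",
}


def classify_error(raw: str) -> Optional[dict]:
    """Return a summary of an error frame, or None for any other event."""
    event = json.loads(raw)
    if event.get("type") != "error":
        return None
    err = event.get("error") or {}
    code = err.get("code")
    return {
        "category": err.get("type"),      # broad category
        "code": code,                     # specific reason: switch on this
        "event_id": err.get("event_id"),  # only present for some errors
        # Unknown codes are logged, not treated as fatal.
        "action": ACTIONS.get(code, "log_and_continue"),
    }
```

Defaulting unknown codes to a non-fatal action matches the guidance above that the connection stays usable unless a close frame follows.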
error.code reference
What’s NOT an error event
See WebSocket connection for the full close-code reference.
Reconnection strategy
Hydra sessions are stateful, but that state does not survive a disconnect: when you reconnect, you get a new session_id and must send session.configure again. There’s no resume token.
A simple, correct strategy:
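One possible shape for this, as a minimal sketch and not the official client logic: exponential backoff with a cap and full jitter, and no retry on a credential failure. The names AuthError, backoff_delay, and connect_with_retry are hypothetical, as are the base, cap, and max_attempts values. The connect callable stands in for whatever opens the WebSocket in your client, and is assumed to raise AuthError on HTTP 401 and ConnectionError on transient failures.

```python
import random
import time
from typing import Callable


class AuthError(Exception):
    """Hypothetical: raised by connect() when the handshake returns HTTP 401."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    ceiling = min(cap, base * (2 ** min(attempt, 32)))
    return random.uniform(0.0, ceiling)


def connect_with_retry(
    connect: Callable[[], object],
    max_attempts: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call connect() until it succeeds; give up after max_attempts tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            # Every success is a brand-new session: the caller still has to
            # send session.configure again before doing anything else.
            return connect()
        except AuthError:
            raise  # credential error: retrying cannot fix it
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last failure
            sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")
```

The sleep parameter exists only so the loop can be exercised without real delays. Jitter spreads out clients that were all disconnected at the same moment.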
Two principles:
- Cap the backoff. Don’t let a server_full error escalate to an unbounded retry storm.
- Don’t retry HTTP 401. It’s a credential error — looping won’t fix it. Bubble it up.
Diagnosing in production
event_id correlation is the single most useful debugging tool. Always include a client-side event_id on outbound frames (UUIDs are fine):
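A minimal sketch of that bookkeeping, assuming event_id is a top-level field on outbound frames and that errors echo it back as error.event_id. The FrameTracker class, its method names, and the max_pending limit are hypothetical helpers, not part of the API.

```python
import json
import uuid
from typing import Optional


class FrameTracker:
    """Remembers recently sent frames so an error can be traced to its cause."""

    def __init__(self, max_pending: int = 256):
        self._sent = {}          # event_id -> frame, in send order
        self._max = max_pending  # bound memory on long-lived connections

    def stamp(self, frame: dict) -> str:
        """Attach a fresh event_id, record the frame, return the JSON to send."""
        stamped = {**frame, "event_id": str(uuid.uuid4())}
        self._sent[stamped["event_id"]] = stamped
        if len(self._sent) > self._max:
            # dicts keep insertion order, so the first key is the oldest frame
            self._sent.pop(next(iter(self._sent)))
        return json.dumps(stamped)

    def culprit(self, error_event: dict) -> Optional[dict]:
        """Return the sent frame an error refers to, if it is still remembered."""
        event_id = (error_event.get("error") or {}).get("event_id")
        return self._sent.get(event_id)
```

Send the string returned by stamp() over the socket, and pass each parsed error event to culprit() when logging.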
When an error references your event_id, you know exactly which frame caused it.
Common gotchas
- Treating every error as fatal. Most aren’t. Only act on error if a close frame follows.
- Reconnecting on 1000. Idle close is normal. Reconnect only if the user is still engaged.
- No backoff cap. A burst of server_full errors with no jitter is how you turn a transient outage into your own outage.
Next
- WebSocket connection: close codes and idle timeout
- Tool calling: tool_response_timeout deep dive

