WebSocket SDK
A JavaScript library that opens a live voice conversation with a Smallest agent from a web page. It is published on npm as @smallest-ai/agent-sdk and wraps the Realtime Agent WebSocket API, adding microphone capture, PCM playback, and an event API.
When to use it
- Voice widget or “click to talk” button on a marketing or product site.
- In-app voice input in a web dashboard.
- Agent demo, playground, or sandbox pages.
- Any browser experience where the user speaks to a Smallest agent and hears a reply.
When to use something else
Prerequisites
You need an API key (from app.smallest.ai/dashboard/api-keys) and an agent ID. To get an agent ID, create an agent from the Agents dashboard or follow the Developer Quickstart.
Install
Quickstart
When loaded from a CDN, the SDK is available through the AtomsSdk global.
connect() calls navigator.mediaDevices.getUserMedia(). Browsers expose the microphone only in secure contexts: HTTPS in production, or http://localhost / http://127.0.0.1 in development. Calling connect() on a page served from a non-loopback HTTP origin fails with a permission error.
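The secure-context requirement can be checked before calling connect(), so the page can show a clear message instead of a generic failure. The sketch below uses only standard browser properties (isSecureContext and navigator.mediaDevices); the function name and the returned reason strings are illustrative and not part of the SDK.

```javascript
// Returns a reason string when microphone capture cannot work, or null when it can.
// `env` defaults to the browser globals; pass a stub object to test outside a browser.
function micUnavailableReason(env = globalThis) {
  if (env.isSecureContext === false) {
    return "insecure-context"; // e.g. plain HTTP on a non-loopback origin
  }
  const media = env.navigator && env.navigator.mediaDevices;
  if (!media || typeof media.getUserMedia !== "function") {
    return "no-media-devices"; // getUserMedia is not available in this context
  }
  return null;
}
```

Call it before connect() and, if it returns a reason, tell the user to open the page over HTTPS or from localhost.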
Configuration
Methods
Events
Subscribe with agent.on(eventName, handler).
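As a rough illustration of the subscription call, the sketch below attaches one logging handler per event. The event names are only the ones mentioned on this page and may not be the complete list; the helper name logAgentEvents is hypothetical, and agent stands for an already-constructed SDK instance (construction is not shown).

```javascript
// Event names mentioned on this page; the SDK may emit others.
const EVENT_NAMES = [
  "session_started",
  "agent_start_talking",
  "agent_stop_talking",
  "session_ended",
  "error",
];

// Subscribes one logging handler per event name.
// `agent` is any object exposing on(eventName, handler).
function logAgentEvents(agent, log = console.log) {
  for (const name of EVENT_NAMES) {
    // The payload is passed through untouched; its shape is not assumed here.
    agent.on(name, (payload) => log(name, payload));
  }
}
```

Logging every event this way is a quick check that handlers are wired before adding UI logic.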
Patterns
Push-to-talk
Connect normally, mute immediately, then toggle on button events.
Do not pass autoCaptureMic: false for push-to-talk. That mode skips microphone setup entirely and there is no public method to start the mic after connect(), so mute() and unmute() silently do nothing and isMuted always returns false. Use the default (autoCaptureMic: true) and call mute() right after connect() instead.
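A minimal sketch of the push-to-talk flow described above, assuming agent is an SDK instance created with the default autoCaptureMic: true and already connected. The helper name wirePushToTalk is hypothetical; mute() and unmute() are the SDK methods named above, and the pointer events are standard DOM events.

```javascript
// Call this after `await agent.connect()` has resolved.
// `button` is the DOM element the user holds down to talk.
function wirePushToTalk(agent, button) {
  agent.mute(); // start muted so nothing is sent until the button is held

  const talk = () => agent.unmute();
  const release = () => agent.mute();

  button.addEventListener("pointerdown", talk);
  button.addEventListener("pointerup", release);
  // Also re-mute if the pointer leaves the button or the gesture is cancelled.
  button.addEventListener("pointerleave", release);
  button.addEventListener("pointercancel", release);
}
```

Handling pointerleave and pointercancel keeps the microphone from staying open when the user drags off the button while holding it.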
Agent-speaking indicator
Reflect agent turn state in the UI.
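One way to reflect turn state is to toggle a CSS class on the agent_start_talking and agent_stop_talking events. The sketch below assumes agent is an SDK instance; the helper name and the class name agent-speaking are illustrative.

```javascript
// Adds `className` to `element` while the agent is speaking and removes it afterwards.
function bindSpeakingIndicator(agent, element, className = "agent-speaking") {
  agent.on("agent_start_talking", () => element.classList.add(className));
  agent.on("agent_stop_talking", () => element.classList.remove(className));
}
```

Style the class in CSS (for example a pulsing border) so the indicator stays purely presentational.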
Clean teardown on page unload
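A hedged sketch of teardown wiring. It does not assume the name of the SDK's disconnect method: stop is a function you supply that ends the session using whatever call the Methods reference documents. The helper name registerTeardown is hypothetical; pagehide is a standard window event.

```javascript
// Runs `stop` once when the page is being hidden or unloaded.
// `target` is normally `window`.
function registerTeardown(target, stop) {
  let done = false;
  const handler = () => {
    if (done) return; // pagehide can fire again after a back/forward cache restore
    done = true;
    stop();
  };
  target.addEventListener("pagehide", handler);
  return handler;
}
```

pagehide is used here rather than unload because browsers fire it more reliably, particularly on mobile.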
Text input
Send a text message instead of speech. The reply still returns as audio.
Error handling
Errors arrive through three different paths. Handle each.
Handshake failure (bad API key, wrong agent ID, network issue, denied microphone permission) rejects the connect() promise with a generic Error("WebSocket connection failed"). The error event does not fire for these.
Mid-session server error arrives as an error event with { code, message }. These are sent by the server during an active session, not during the handshake.
Session termination fires session_ended with a reason string. Handle this to detect both graceful ends and abnormal closes.
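The three paths above can be handled together, as in the sketch below. It assumes handlers may be registered before connect() is called and that connect() takes no arguments here; ui is a hypothetical object with showError and showEnded methods standing in for your own interface code.

```javascript
// Returns true if the handshake succeeded, false otherwise.
async function connectWithErrorHandling(agent, ui) {
  // Mid-session server errors arrive as an `error` event with { code, message }.
  agent.on("error", (err) => ui.showError(`Server error ${err.code}: ${err.message}`));

  // Session termination: pass the reason through without assuming its exact shape.
  agent.on("session_ended", (reason) => ui.showEnded(reason));

  // Handshake failures reject connect(); the `error` event does not fire for them.
  try {
    await agent.connect();
    return true;
  } catch (err) {
    ui.showError(`Could not connect: ${err.message}`);
    return false;
  }
}
```

Because the handshake error message is generic, the catch branch can only report that the connection failed, not whether the cause was the key, the agent ID, the network, or microphone permission.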
Smoke-test the integration
A minimal standalone test to confirm the SDK reaches the agent and produces audio. No build step required.
Serve the page from localhost and open it with the API key and agent ID in the query string.
Expected console output, in order: session_started with session_id and call_id, then agent_start_talking, then agent_stop_talking after the agent’s first turn completes. If the agent has a greeting, it plays through the default audio output.
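The expected sequence can also be checked in code rather than by reading the console. The sketch below is an illustrative helper, not part of the SDK: it advances through the three event names listed above and calls onDone once all have arrived in order.

```javascript
const EXPECTED_ORDER = ["session_started", "agent_start_talking", "agent_stop_talking"];

// Tracks progress through EXPECTED_ORDER and calls `onDone` once the last one arrives.
// Returns a function reporting how many expected events have been seen so far.
function trackExpectedOrder(agent, onDone) {
  let next = 0;
  for (const name of EXPECTED_ORDER) {
    agent.on(name, () => {
      // Only advance when this event is the next one expected.
      if (next < EXPECTED_ORDER.length && name === EXPECTED_ORDER[next]) {
        next += 1;
        if (next === EXPECTED_ORDER.length) onDone();
      }
    });
  }
  return () => next;
}
```

If the agent has no greeting, the last two events will not arrive until the user speaks and the agent replies, so the progress count stays at 1 until then.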
Limitations
For non-browser runtimes, connect to the Realtime Agent WebSocket API directly. A raw-protocol guide with tested reference clients for Python, React Native, Swift, and Kotlin is in progress.

