Stream Speech (WebSocket)
Real-time text-to-speech over a persistent WebSocket connection. The
model field in the request payload selects which Lightning pool serves
the synthesis.
Use /waves/v1/tts/live (SSE) when you have the full text up front but still want chunked playback. (Same URL, different protocol: an HTTP POST gets you SSE; a WSS connection gets you WebSocket.)
Use /waves/v1/tts (sync) when total latency doesn’t matter.

Pass "model": "lightning_v3.1" (default) or "model": "lightning_v3.1_pro" on each request. Concurrency and latency are identical across both. Voice catalogs differ; see the Lightning v3.1 and Lightning v3.1 Pro model cards for the per-model catalog.
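
As a rough illustration of per-request model selection, the sketch below fills in the model field before a payload is sent. The two model names and the default come from the text above; the helper name with_model and the choice to reject unknown names on the client side are assumptions made for this example, not part of the API.

```python
DEFAULT_MODEL = "lightning_v3.1"
KNOWN_MODELS = {"lightning_v3.1", "lightning_v3.1_pro"}


def with_model(payload, model=None):
    """Return a copy of payload with an explicit, validated model field.

    Hypothetical helper: precedence is the explicit argument, then any
    model already in the payload, then the documented default.
    """
    chosen = model or payload.get("model") or DEFAULT_MODEL
    if chosen not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {chosen!r}")
    return {**payload, "model": chosen}
```

Because voice catalogs differ between the two models, a client that switches models this way may also need to switch voice_id.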
Migrating from /waves/v1/lightning-v3.1/get_speech/stream
Same protocol, same payload shape; only the URL changes. Existing clients should connect to wss://api.smallest.ai/waves/v1/tts/live and, to route to the Pro pool, add "model": "lightning_v3.1_pro" to each request. Omitting model keeps the existing standard-pool behavior.
Voice IDs, sample rates, auth, and the response/streaming format are unchanged, so downstream audio handling, jitter buffers, and barge-in logic stay the same.
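
Since only the URL changes, the migration can be sketched as a path rewrite. The two paths are taken from the text above; the function name migrate_url is hypothetical, and the example assumes the legacy endpoint was reached on the same host as the new one.

```python
from urllib.parse import urlsplit, urlunsplit

LEGACY_PATH = "/waves/v1/lightning-v3.1/get_speech/stream"
NEW_PATH = "/waves/v1/tts/live"


def migrate_url(url):
    """Swap the legacy path for the new one; leave any other URL untouched."""
    parts = urlsplit(url)
    if parts.path.rstrip("/") == LEGACY_PATH:
        # Scheme, host, and query string are preserved as-is.
        parts = parts._replace(path=NEW_PATH)
    return urlunsplit(parts)
```

The request payload needs no change unless the client wants the Pro pool, in which case it adds the model field.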
Header authentication with a Bearer token. Format: Bearer YOUR_API_KEY
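
A minimal sketch of building that header value in Python follows. The Bearer YOUR_API_KEY format is from the text above; the header name Authorization is an assumption based on the usual Bearer-token convention, and the function name auth_headers is hypothetical.

```python
def auth_headers(api_key):
    """Build connection headers carrying the Bearer token.

    Assumption: the token travels in the conventional "Authorization"
    header; this page does not name the header explicitly.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {"Authorization": f"Bearer {api_key}"}
```

How these headers are attached to the WebSocket handshake depends on the client library in use.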
Send a JSON message with voice_id, text, and optional parameters (including model) to generate speech audio.
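
The message described above might be assembled as follows. The field names voice_id, text, and model come from this page; the function name build_tts_message and the voice ID in the usage note are placeholders, and any other optional parameters are passed through without being checked.

```python
import json


def build_tts_message(voice_id, text, model=None, **optional):
    """Serialize one synthesis request as a JSON text frame.

    Hypothetical helper: `optional` is forwarded verbatim, so the caller
    is responsible for using parameter names the service accepts.
    """
    payload = {"voice_id": voice_id, "text": text}
    if model is not None:
        # Omitting model leaves the server on its default.
        payload["model"] = model
    payload.update(optional)
    return json.dumps(payload)
```

For example, build_tts_message("example_voice", "Hello", model="lightning_v3.1_pro") yields a string that can be sent as a single text frame once the connection is open.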