Stream Speech (WebSocket)

WSS wss://api.smallest.ai/waves/v1/tts/live

Handshake

  • URL: wss://api.smallest.ai/waves/v1/tts/live
  • Method: GET
  • Status: 101 Switching Protocols
Real-time text-to-speech over a persistent WebSocket connection. The model field in the request payload selects which Lightning pool serves the synthesis.

When to use this

  • Use this when text arrives incrementally (LLM token streams, live captioning, conversational pipelines where playback should start as soon as the first chunk is ready).
  • POST to /waves/v1/tts/live (SSE) when you have the full text up front but still want chunked playback. (Same URL, different protocol — HTTP POST gets you SSE; WSS connect gets you WebSocket.)
  • Use /waves/v1/tts (sync) when total latency doesn’t matter.
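
The choice between the three endpoints can be sketched as a small decision function. This is only an illustration of the guidance above: the `Endpoint` type, the constant names, and `pick_endpoint` are hypothetical, and treating the sync endpoint as a POST is an assumption rather than something this page states.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    """Hypothetical container pairing a protocol with an API path."""
    protocol: str  # "WSS" or "POST"
    path: str


# Paths are taken from this page; the protocol of the sync endpoint is assumed.
WEBSOCKET = Endpoint("WSS", "/waves/v1/tts/live")
SSE = Endpoint("POST", "/waves/v1/tts/live")
SYNC = Endpoint("POST", "/waves/v1/tts")


def pick_endpoint(text_is_incremental: bool, wants_chunked_playback: bool) -> Endpoint:
    """Map the two questions from the list above onto an endpoint."""
    if text_is_incremental:
        # Text arrives piece by piece (LLM tokens, live captions).
        return WEBSOCKET
    if wants_chunked_playback:
        # Full text is known up front, but playback should start early.
        return SSE
    # Total latency does not matter.
    return SYNC
```

Note that the WebSocket and SSE variants share a path and differ only in protocol, which is why the sketch carries the protocol alongside the path.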

Selecting the model

Pass "model": "lightning_v3.1" (default) or "model": "lightning_v3.1_pro" on each request. Concurrency and latency are identical across both. Voice catalogs differ — see the Lightning v3.1 and Lightning v3.1 Pro model cards for the per-model catalog.
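
A minimal sketch of per-request model selection, assuming the payload is held as a plain dictionary before it is serialized. The two model names come from this page; the helper `with_model` and its client-side validation are hypothetical conveniences, not part of the API.

```python
from typing import Optional

# Model identifiers documented on this page.
SUPPORTED_MODELS = ("lightning_v3.1", "lightning_v3.1_pro")


def with_model(payload: dict, model: Optional[str] = None) -> dict:
    """Return a copy of the payload with the model field set.

    Passing None leaves the payload untouched, so the server-side
    default (lightning_v3.1) applies.
    """
    result = dict(payload)
    if model is None:
        return result
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    result["model"] = model
    return result
```

Because the model is chosen per request, a client can call this helper with a different model for each message without reconnecting.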

Migrating from /waves/v1/lightning-v3.1/get_speech/stream

Same protocol, same payload shape — only the URL changes. Existing clients should:

  1. Update the WebSocket URL to wss://api.smallest.ai/waves/v1/tts/live.
  2. Optionally add "model": "lightning_v3.1_pro" to route to the Pro pool. Omitting model keeps the existing standard-pool behavior.

Voice IDs, sample rates, auth, and the response/streaming format are unchanged, so downstream audio handling, jitter buffers, and barge-in logic stay the same.
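
Since only the URL changes, step 1 can be sketched as a path rewrite. The helper name `migrate_url` is hypothetical; the two paths are the ones named in this section, and the sketch assumes the host and any query string should be carried over unchanged.

```python
from urllib.parse import urlsplit, urlunsplit

LEGACY_PATH = "/waves/v1/lightning-v3.1/get_speech/stream"
LIVE_PATH = "/waves/v1/tts/live"


def migrate_url(url: str) -> str:
    """Rewrite a legacy Lightning v3.1 WebSocket URL to the live TTS URL.

    URLs that do not point at the legacy path are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.path.rstrip("/") != LEGACY_PATH:
        return url
    # Keep scheme, host, and query; swap only the path.
    return urlunsplit(parts._replace(path=LIVE_PATH))
```

Payloads need no rewriting: omitting `model` keeps the standard-pool behavior.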

Handshake

Authentication

Authorization: Bearer

Header authentication of the form Bearer <token>

Headers

Authorization (string, required)

Bearer token for authentication. Format: Bearer YOUR_API_KEY
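
The header format above could be built as follows. This is a sketch: `auth_headers` is a hypothetical helper, and how the resulting dictionary is handed to the handshake depends on the WebSocket client in use.

```python
def auth_headers(api_key: str) -> dict:
    """Build the Authorization header for the WebSocket handshake."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key must not be empty")
    # Format documented above: Bearer YOUR_API_KEY
    return {"Authorization": f"Bearer {key}"}
```

Stripping whitespace guards against a trailing newline when the key is read from a file or environment variable.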

Send

TtsRequest (object, required)

Send a JSON message with voice_id, text, and optional parameters (including model) to generate speech audio.
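
A minimal sketch of serializing such a message. Only `voice_id`, `text`, and `model` are field names taken from this page; the function `build_tts_request` is hypothetical, and any further optional parameters are passed through as-is without being checked against the TtsRequest schema.

```python
import json
from typing import Any, Optional


def build_tts_request(
    voice_id: str,
    text: str,
    model: Optional[str] = None,
    **optional: Any,
) -> str:
    """Serialize a TtsRequest as a JSON string ready to send on the socket."""
    if not voice_id:
        raise ValueError("voice_id is required")
    if not text:
        raise ValueError("text is required")
    message = {"voice_id": voice_id, "text": text}
    if model is not None:
        # Omitting model lets the server apply its default.
        message["model"] = model
    # Extra keyword arguments are forwarded unchanged (names not validated here).
    message.update(optional)
    return json.dumps(message)
```

For example, `build_tts_request("example_voice", "Hello", model="lightning_v3.1_pro")` yields one JSON message, where `"example_voice"` is a placeholder voice ID.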

Receive

TtsResponse (object, required)

Receive audio data chunks and completion status from the server.
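
The receive loop might be sketched as below. This page does not list the TtsResponse fields, so the sketch avoids guessing them: the caller supplies `extract_audio` and `is_complete`, both hypothetical callbacks that read whatever fields the schema actually defines. It also assumes each incoming message is a JSON text frame.

```python
import json
from typing import Callable, Iterable, Optional


def collect_audio(
    messages: Iterable[str],
    extract_audio: Callable[[dict], Optional[bytes]],
    is_complete: Callable[[dict], bool],
) -> bytes:
    """Accumulate audio bytes from response messages until completion.

    messages      -- raw JSON text frames, in arrival order (assumed format)
    extract_audio -- returns the decoded audio bytes of a message, or None
    is_complete   -- returns True for the message that signals completion
    """
    buffer = bytearray()
    for raw in messages:
        message = json.loads(raw)
        chunk = extract_audio(message)
        if chunk:
            buffer.extend(chunk)
        if is_complete(message):
            break
    return bytes(buffer)
```

In a streaming player, the `buffer.extend` call would be replaced by a write to the audio output so playback starts with the first chunk instead of waiting for completion.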