Streaming
Streaming TTS delivers audio chunks as they are generated, so playback can start as soon as the first chunk arrives (typically ~100 ms) instead of waiting for the whole file.
Streamed audio output:
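The core pattern is the same regardless of transport: write each chunk to the audio sink the moment it arrives. A minimal sketch, using a simulated chunk stream in place of a real Waves response (the chunk generator and sink here are stand-ins, not the actual API):

```python
import io
import time

def fake_tts_stream(text, chunk_size=4096):
    """Simulated TTS stream: yields raw PCM chunks as they are 'generated'.
    Stands in for the real Waves streaming response."""
    pcm = b"\x00\x01" * (chunk_size * 3 // 2)  # placeholder audio bytes
    for i in range(0, len(pcm), chunk_size):
        yield pcm[i:i + chunk_size]

def play_streamed(chunks, sink):
    """Write each chunk to the sink as it arrives instead of buffering
    the whole file first; playback can begin at the first write."""
    first_chunk_at = None
    for chunk in chunks:
        if first_chunk_at is None:
            first_chunk_at = time.monotonic()  # playback could start here
        sink.write(chunk)
    return first_chunk_at

sink = io.BytesIO()  # in a real app this would be an audio output device
play_streamed(fake_tts_stream("Hello from Waves"), sink)
```

In production the sink would be an audio device (e.g. a PyAudio output stream) rather than an in-memory buffer.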
WebSocket Streaming
Persistent connections for continuous, low-latency audio. Best for conversational AI and real-time apps.
Endpoint: wss://api.smallest.ai/waves/v1/lightning-v3.1/get_speech/stream
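A sketch of the WebSocket flow using the third-party `websockets` library. The request payload fields (`text`, `voice_id`, `sample_rate`) and the end-of-stream marker are assumptions from typical TTS APIs, not confirmed by this page:

```python
import asyncio
import json

WS_URL = "wss://api.smallest.ai/waves/v1/lightning-v3.1/get_speech/stream"

def build_request(text, voice_id="emily", sample_rate=24000):
    # Hypothetical payload shape - check the API reference for real fields.
    return json.dumps({"text": text, "voice_id": voice_id,
                       "sample_rate": sample_rate})

async def synthesize(text, on_chunk, api_key):
    import websockets  # third-party: pip install websockets
    headers = {"Authorization": f"Bearer {api_key}"}
    # `additional_headers` is the kwarg in websockets >= 14 (older: extra_headers)
    async with websockets.connect(WS_URL, additional_headers=headers) as ws:
        await ws.send(build_request(text))
        async for message in ws:
            event = json.loads(message)
            if event.get("status") == "complete":  # assumed end marker
                break
            on_chunk(event)

# Payload construction alone, without opening a connection:
payload = json.loads(build_request("Hello"))
```

Because the connection stays open, subsequent requests in a conversation skip the TLS/WebSocket handshake entirely.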
SSE Streaming
Server-Sent Events over HTTP — simpler to set up, no persistent connection needed.
Endpoint: POST https://api.smallest.ai/waves/v1/lightning-v3.1/stream
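A sketch of consuming the SSE endpoint with the third-party `requests` library. SSE events arrive as `data: ...` lines; the request body and event fields shown here are assumptions:

```python
import json

SSE_URL = "https://api.smallest.ai/waves/v1/lightning-v3.1/stream"

def parse_sse_line(raw: bytes):
    """Return the decoded JSON event from one 'data: ...' SSE line, else None."""
    line = raw.decode("utf-8").strip()
    if line.startswith("data: "):
        return json.loads(line[len("data: "):])
    return None

def stream_speech(text, api_key):
    import requests  # third-party: pip install requests
    resp = requests.post(
        SSE_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"text": text},  # hypothetical request body
        stream=True,          # don't buffer the whole response
    )
    resp.raise_for_status()
    for raw in resp.iter_lines():
        event = parse_sse_line(raw)
        if event is not None:
            yield event

# Parsing one example SSE line (no network needed):
event = parse_sse_line(b'data: {"audio": "UklGRg=="}')
```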
Streaming Text Input (SDK)
For real-time applications where text arrives incrementally (e.g., from an LLM), the SDK supports streaming text input:
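The SDK's streaming-text interface isn't reproduced here, but the underlying pattern is transport-independent: buffer incremental LLM tokens and flush complete sentences to the synthesizer so audio starts before the LLM finishes. A minimal sketch of that buffering step:

```python
import re

def sentences_from_tokens(tokens):
    """Yield complete sentences as soon as they can be assembled from an
    incremental token stream (e.g. LLM output); flush the remainder at the end."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            match = re.search(r"[.!?]\s+", buffer)  # sentence boundary
            if not match:
                break
            yield buffer[:match.end()].strip()
            buffer = buffer[match.end():]
    if buffer.strip():
        yield buffer.strip()

llm_tokens = ["Hel", "lo there. ", "How ", "are you", "?"]
sentences = list(sentences_from_tokens(llm_tokens))
# Each yielded sentence would be sent to TTS immediately.
```

Flushing at sentence boundaries keeps prosody natural while minimizing the delay before the first audio chunk.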
WebSocket vs SSE
Use WebSocket when sending multiple TTS requests over time (conversations, voice bots). Use SSE for simple one-shot streaming where you don’t need a persistent connection.
Response Format
Each WebSocket/SSE message is JSON:
Audio chunk:
Stream complete:
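The exact message fields aren't shown above; this sketch assumes a common shape, with base64-encoded audio in an `audio` field and a terminal status message:

```python
import base64
import json

# Hypothetical examples of the two message types:
audio_msg = json.dumps({"audio": base64.b64encode(b"\x00\x01pcm").decode()})
done_msg = json.dumps({"status": "complete"})

def handle(message, sink):
    """Decode one stream message; return False when the stream is complete."""
    event = json.loads(message)
    if event.get("status") == "complete":  # assumed end-of-stream marker
        return False
    sink.extend(base64.b64decode(event["audio"]))
    return True

sink = bytearray()
keep_going = handle(audio_msg, sink)   # appends decoded audio
keep_going = handle(done_msg, sink)    # signals completion
```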
Configuration Parameters
For concurrency limits and connection management, see Concurrency and Limits.
Full runnable source: streaming-python.py

