LiveKit
This guide walks you through integrating Smallest AI TTS and STT into a LiveKit Agents voice pipeline. LiveKit Agents is an open-source Python framework for building production-grade, real-time voice AI agents over WebRTC.
The livekit-plugins-smallestai package provides two services:
- `smallestai.STT` — real-time speech-to-text using the Pulse API, with streaming over WebSocket (~64 ms TTFT) and batch transcription over HTTP
- `smallestai.TTS` — ultra-low-latency text-to-speech using the Lightning API
Code Example
The full runnable example is in the Smallest AI cookbook:
LiveKit Voice Agent — Smallest AI TTS + STT
Setup
1. Create a Virtual Environment
Activate it:
- On Linux/Mac:
- On Windows:
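A typical sketch of the creation and activation commands (exact paths depend on your Python install):

```shell
# Create the virtual environment
python3 -m venv venv

# Activate it on Linux/macOS
source venv/bin/activate

# Activate it on Windows (PowerShell):
# venv\Scripts\Activate.ps1
```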
2. Install Dependencies
livekit-plugins-smallestai is published on PyPI and includes both the STT and TTS services. livekit-plugins-silero provides the VAD used for turn detection, and livekit-plugins-openai provides the LLM.
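A typical install command covering the packages named above (`livekit-agents` itself is assumed as the base framework):

```shell
pip install livekit-agents livekit-plugins-smallestai livekit-plugins-silero livekit-plugins-openai
```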
3. Create a LiveKit Project
Sign in to LiveKit Cloud, create a new project, and copy your project credentials.
4. Create a .env File
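Based on the credentials mentioned in this guide, the `.env` likely looks along these lines (the Smallest AI and OpenAI variable names are assumptions; check each plugin's documentation for the exact names):

```shell
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=<your-api-key>
LIVEKIT_API_SECRET=<your-api-secret>
SMALLEST_API_KEY=<your-smallest-ai-key>
OPENAI_API_KEY=<your-openai-key>
```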
Services
smallestai.STT
Real-time transcription using the Smallest AI Pulse API. Connects over WebSocket for streaming and supports batch transcription over HTTP.
The STT service connects to wss://api.smallest.ai/waves/v1/pulse/get_text for streaming and https://api.smallest.ai/waves/v1/pulse/get_text for batch. Interim and final transcripts are both supported. START_OF_SPEECH is inferred from the first non-empty transcript.
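As a self-contained illustration of the last point (this is a toy model, not the plugin's actual code), inferring a start-of-speech event from the first non-empty transcript can be sketched like this:

```python
from dataclasses import dataclass, field


@dataclass
class SpeechEventTracker:
    """Emits START_OF_SPEECH exactly once, on the first non-empty transcript."""
    started: bool = False
    events: list = field(default_factory=list)

    def on_transcript(self, text: str, *, final: bool) -> None:
        if not text.strip():
            return  # empty transcripts carry no speech evidence
        if not self.started:
            self.started = True
            self.events.append("START_OF_SPEECH")
        self.events.append("FINAL_TRANSCRIPT" if final else "INTERIM_TRANSCRIPT")


tracker = SpeechEventTracker()
tracker.on_transcript("", final=False)            # silence: ignored
tracker.on_transcript("hel", final=False)         # first speech: START_OF_SPEECH fires
tracker.on_transcript("hello there", final=True)
print(tracker.events)
# ['START_OF_SPEECH', 'INTERIM_TRANSCRIPT', 'FINAL_TRANSCRIPT']
```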
smallestai.TTS
Text-to-speech using the Smallest AI Lightning API. Because the plugin synthesizes audio per request rather than streaming tokens, wrap it in tts.StreamAdapter with a SentenceTokenizer. The adapter splits LLM output at sentence boundaries and fires synthesis for each chunk, keeping first-audio latency low.
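A simplified, self-contained sketch of why sentence-level chunking keeps first-audio latency low (this uses a naive splitter, not LiveKit's actual `SentenceTokenizer`):

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def synthesize_in_chunks(llm_output: str):
    """Fire one synthesis request per sentence, as the StreamAdapter does,
    so the first sentence's audio can play while later ones synthesize."""
    for sentence in split_sentences(llm_output):
        yield f"tts_request({sentence!r})"


reply = "Hello! I can help with that. What city are you in?"
for request in synthesize_in_chunks(reply):
    print(request)
```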
The `consistency`, `similarity`, and `enhancement` parameters apply only to `"lightning-v2"` and are ignored for `"lightning-v3.1"`.
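A hypothetical helper (the function name and the `speed` option are illustrative, not part of the plugin) that mirrors this per-model behavior:

```python
# Quality parameters that only "lightning-v2" understands.
V2_ONLY_PARAMS = {"consistency", "similarity", "enhancement"}


def effective_tts_params(model: str, **params) -> dict:
    """Drop the v2-only quality parameters when targeting lightning-v3.1,
    matching the behavior described above. Hypothetical helper."""
    if model == "lightning-v3.1":
        return {k: v for k, v in params.items() if k not in V2_ONLY_PARAMS}
    return dict(params)


print(effective_tts_params("lightning-v3.1", consistency=0.5, speed=1.0))
# {'speed': 1.0}
```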
Complete Agent Example
A minimal but production-ready voice agent using Smallest AI for both STT and TTS is available in the cookbook example linked above.
Running the Agent
The dev flag starts the agent worker in development mode. To interact with it, open the LiveKit Agents Playground and enter your LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET. The agent will greet the user automatically on session start.
The pipeline is fully interruptible — if the user speaks while the bot is talking, audio stops immediately and the bot re-engages without any custom logic.
Notes
- The `StreamAdapter` + `SentenceTokenizer` wrapper is required for TTS — the Smallest AI plugin synthesizes audio per request. Without it, the agent waits for the entire LLM response before starting synthesis.
- Set `eou_timeout_ms=0` (the default) when using LiveKit’s built-in turn detection. Setting it to a non-zero value adds server-side silence detection on top of LiveKit’s own logic, which increases end-of-turn latency.
- `"lightning-v3.1"` is the recommended TTS model — it delivers ~100ms latency with 80+ voices. Switch to `"lightning-v2"` only if you need the `consistency`/`similarity`/`enhancement` quality parameters.
- For issues or questions, open an issue in the cookbook repository or reach out on Discord.

