LiveKit
This guide walks you through integrating Smallest AI TTS and STT into a LiveKit Agents voice pipeline. LiveKit Agents is an open-source Python framework for building production-grade, real-time voice AI agents over WebRTC.
The livekit-plugins-smallestai package provides two services:
- smallestai.STT: real-time speech-to-text using the Pulse API, with streaming over WebSocket (~64ms TTFT) and batch transcription over HTTP
- smallestai.TTS: ultra-low-latency text-to-speech using the Lightning API
Code Example
The full runnable example is in the Smallest AI cookbook:
LiveKit Voice Agent — Smallest AI TTS + STT
Setup
1. Create a Virtual Environment
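Create the environment with Python's built-in venv module, for example python -m venv venv (the directory name venv is an assumption here; any name works).
Activate it:
- On Linux/Mac: source venv/bin/activate
- On Windows: venv\Scripts\activate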
2. Install Dependencies
livekit-plugins-smallestai is published on PyPI and includes both the STT and TTS services. livekit-plugins-silero provides the VAD used for turn detection, and livekit-plugins-openai provides the LLM.
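Once the packages are installed, a quick sanity check can confirm that all three plugins are visible in the active environment. The sketch below uses only the standard library; the helper name missing_packages is hypothetical, and the distribution names are the ones listed in this step.

```python
from importlib import metadata

# Distribution names as given in the install step of this guide.
REQUIRED = [
    "livekit-plugins-smallestai",
    "livekit-plugins-silero",
    "livekit-plugins-openai",
]


def missing_packages(names):
    """Return the distribution names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


absent = missing_packages(REQUIRED)
print("Not installed:", ", ".join(absent) if absent else "none")
```

If anything is reported as not installed, the virtual environment is probably not activated or the install step did not complete.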
3. Create a LiveKit Project
Sign in to LiveKit Cloud, create a new project, and copy your project credentials.
4. Create a .env File
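The .env file holds the credentials the agent reads at startup. As a rough sketch, the check below reports which expected variables are unset. The three LIVEKIT_* names are the ones this guide uses for the Playground. SMALLEST_API_KEY and OPENAI_API_KEY are assumed names, so confirm them against the plugin documentation. The helper missing_env is hypothetical and expects the variables to be in the process environment already (for example, loaded by python-dotenv).

```python
import os

REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "SMALLEST_API_KEY",  # assumed variable name for the Smallest AI key
    "OPENAI_API_KEY",    # assumed variable name for the OpenAI LLM key
)


def missing_env(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or blank in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
example = {"LIVEKIT_URL": "wss://example.livekit.cloud", "LIVEKIT_API_KEY": ""}
print("Missing:", missing_env(environ=example))
```

Calling missing_env() with no arguments checks the real process environment, which makes it a convenient first line in a worker's startup path.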
Services
smallestai.STT
Real-time transcription using the Smallest AI Pulse API. Connects over WebSocket for streaming and supports batch transcription over HTTP.
The STT service connects to wss://api.smallest.ai/waves/v1/pulse/get_text for streaming and https://api.smallest.ai/waves/v1/pulse/get_text for batch. Interim and final transcripts are both supported. START_OF_SPEECH is inferred from the first non-empty transcript.
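To illustrate what inferring START_OF_SPEECH from the first non-empty transcript can look like, here is a minimal, self-contained sketch. It is not the plugin's code. The Transcript type, the to_events function, and the INTERIM_TRANSCRIPT / FINAL_TRANSCRIPT labels are illustrative, and resetting the state after a final transcript is an assumption about where utterances end.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical shape of one result from the streaming endpoint."""
    text: str
    is_final: bool


def to_events(transcripts: Iterable[Transcript]) -> Iterator[Tuple[str, str]]:
    """Yield (event_type, text) pairs for a stream of transcripts."""
    speaking = False
    for t in transcripts:
        text = t.text.strip()
        if not text:
            continue  # empty results carry no evidence of speech
        if not speaking:
            speaking = True
            yield ("START_OF_SPEECH", "")  # inferred, not sent by the server
        if t.is_final:
            yield ("FINAL_TRANSCRIPT", text)
            speaking = False  # assumption: a final result closes the utterance
        else:
            yield ("INTERIM_TRANSCRIPT", text)


stream = [Transcript("", False), Transcript("hel", False), Transcript("hello", True)]
for event in to_events(stream):
    print(event)
```

One consequence of this approach is that the start-of-speech signal arrives no earlier than the first transcript, which is one reason a separate VAD is still used for turn detection.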
smallestai.TTS
Text-to-speech using the Smallest AI Lightning API. The plugin uses persistent WebSocket streaming backed by a connection pool for low-latency audio delivery.
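The latency benefit of a connection pool comes from reusing an already-open connection instead of paying for a new handshake on every request. The toy sketch below shows that idea in isolation. It is not the plugin's implementation: ConnectionPool, fake_connect, and demo are hypothetical names, and no network I/O takes place.

```python
import asyncio


class ConnectionPool:
    """Toy pool that reuses idle connections instead of opening one per request."""

    def __init__(self, connect, max_idle=2):
        self._connect = connect    # async factory that opens a connection
        self._idle = []            # connections ready for reuse
        self._max_idle = max_idle

    async def prewarm(self, count=1):
        """Open connections ahead of time so the first request skips the handshake."""
        while len(self._idle) < min(count, self._max_idle):
            self._idle.append(await self._connect())

    async def acquire(self):
        if self._idle:
            return self._idle.pop()     # reuse: no handshake needed
        return await self._connect()    # pool empty: open a fresh connection

    def release(self, conn):
        if len(self._idle) < self._max_idle:
            self._idle.append(conn)
        # A real pool would close the surplus connection here.


async def demo():
    opened = 0

    async def fake_connect():
        # Stand-in for a WebSocket handshake.
        nonlocal opened
        opened += 1
        return f"conn-{opened}"

    pool = ConnectionPool(fake_connect)
    await pool.prewarm()
    first = await pool.acquire()
    pool.release(first)
    second = await pool.acquire()
    return opened, first, second


print(asyncio.run(demo()))
```

In the demo, only one connection is ever opened: the pre-warmed connection serves the first request and is reused for the second.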
Complete Agent Example
A minimal but production-ready voice agent using Smallest AI for both STT and TTS:
Running the Agent
The dev subcommand starts the agent worker in development mode. To interact with it, open the LiveKit Agents Playground and enter your LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET. The agent will greet the user automatically on session start.
The pipeline is fully interruptible — if the user speaks while the bot is talking, audio stops immediately and the bot re-engages without any custom logic.
Notes
- Set eou_timeout_ms=0 (the default) when using LiveKit’s built-in turn detection. Setting it to a non-zero value adds server-side silence detection on top of LiveKit’s own logic, which increases end-of-turn latency.
- Call tts.prewarm() during worker startup to pre-warm the WebSocket connection pool and reduce first-audio latency on the initial request.
- For issues or questions, open an issue in the cookbook repository or reach out on Discord.

