Pipecat


This guide walks you through integrating Smallest AI TTS and STT into a Pipecat voice pipeline. Pipecat is an open-source Python framework for building real-time voice and multimodal conversational AI agents using a frame-based architecture.

Code Example

The complete runnable example lives in the Pipecat repository:

Pipecat Example — Smallest AI TTS + STT

Setup

1. Create a Virtual Environment

$ python3.11 -m venv .venv

Activate it:

  • On Linux/Mac:
    $ source .venv/bin/activate
  • On Windows:
    > .venv\Scripts\activate

2. Install Pipecat with Smallest AI support

The smallest extra installs both the TTS and STT services for Smallest AI:

$ pip install "pipecat-ai[smallest]"

To run the full voice agent example, you also need:

  • daily: Daily transport, which the bot uses to manage audio rooms and connect participants
  • openai: OpenAI LLM service for the language model
  • silero: Silero VAD for voice activity detection and interruption handling
  • runner: Pipecat development runner that creates Daily rooms automatically and serves the bot locally

$ pip install "pipecat-ai[smallest,daily,openai,silero,runner]"
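
After installing, it can help to confirm that the Smallest AI service modules are actually visible to the active virtual environment. The following is a minimal sketch, not part of Pipecat: the module paths are the ones imported later in this guide, while the helper names (`is_importable`, `check_install`) are hypothetical and introduced only for illustration.

```python
import importlib.util

# Module paths taken from the import statements used later in this guide.
SMALLEST_MODULES = [
    "pipecat.services.smallest.stt",
    "pipecat.services.smallest.tts",
]


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located in the current environment.

    find_spec imports parent packages but does not execute the module itself.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except Exception:
        # A missing parent package raises ModuleNotFoundError; a parent
        # package that fails during import may raise something else.
        return False


def check_install(modules=SMALLEST_MODULES) -> dict:
    """Map each module path to whether it is importable (hypothetical helper)."""
    return {name: is_importable(name) for name in modules}


if __name__ == "__main__":
    for name, ok in check_install().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If either module is reported as missing, the most likely cause is that the `smallest` extra was installed into a different environment than the one currently activated.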

3. Create a .env file

SMALLEST_API_KEY=...
DAILY_API_KEY=...
OPENAI_API_KEY=...

DAILY_API_KEY is required because the Pipecat runner creates a Daily room automatically at startup. To reuse an existing room instead of creating a new one on each run, set the optional DAILY_ROOM_URL variable.
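
A missing key usually surfaces only once a service tries to connect, so an early check can save some debugging. The sketch below assumes the variables from the .env file have already been loaded into the process environment (for example by a loader such as python-dotenv). The variable names come from this guide; the helper names `missing_keys` and `report` are hypothetical.

```python
import os

# Variable names as listed in the .env section of this guide.
REQUIRED_KEYS = ("SMALLEST_API_KEY", "DAILY_API_KEY", "OPENAI_API_KEY")
OPTIONAL_ROOM_KEY = "DAILY_ROOM_URL"


def missing_keys(env=None) -> list:
    """Return the required variable names that are unset or blank in env."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def report(env=None) -> str:
    """Build a one-line, human-readable summary of the configuration."""
    env = os.environ if env is None else env
    missing = missing_keys(env)
    if missing:
        return "Missing required variables: " + ", ".join(missing)
    if env.get(OPTIONAL_ROOM_KEY, "").strip():
        return "All required variables set; reusing the room in DAILY_ROOM_URL."
    return "All required variables set; a new Daily room is created on each run."


if __name__ == "__main__":
    print(report())
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in isolation, which is why both helpers accept an optional `env` argument.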


Services

SmallestSTTService

import os

from pipecat.services.smallest.stt import SmallestSTTService
from pipecat.transcriptions.language import Language

# Read the API key from the environment rather than hard-coding it.
stt = SmallestSTTService(
    api_key=os.getenv("SMALLEST_API_KEY"),
    settings=SmallestSTTService.Settings(
        language=Language.EN,
    ),
)

| Parameter | Type     | Default     | Description                         |
|-----------|----------|-------------|-------------------------------------|
| api_key   | str      |             | Your Smallest AI API key (required) |
| language  | Language | Language.EN | Language for transcription          |

The STT service connects to the Pulse real-time WebSocket endpoint (wss://api.smallest.ai/waves/v1/pulse/get_text) and streams audio frames from the pipeline to it. Transcriptions are returned with a TTFT of 64 ms.


SmallestTTSService

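import os

from pipecat.services.smallest.tts import SmallestTTSService

# Read the API key from the environment rather than hard-coding it.
tts = SmallestTTSService(
    api_key=os.getenv("SMALLEST_API_KEY"),
    settings=SmallestTTSService.Settings(
        voice="sophia",
    ),
)

| Parameter | Type | Default | Description                         |
|-----------|------|---------|-------------------------------------|
| api_key   | str  |         | Your Smallest AI API key (required) |
| voice     | str  | sophia  | Voice ID for synthesis              |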

The TTS service uses WebSocket streaming for low-latency, real-time audio delivery.


Running the Example

Clone the Pipecat repository and navigate to the examples directory:

$ git clone https://github.com/pipecat-ai/pipecat.git
$ cd pipecat/examples/voice

Create a .env file with the keys listed in the Setup section above, then run the example in one of the two modes below.

Daily transport — server mode (recommended):

$ python voice-smallest.py -t daily

Open http://localhost:7860 in your browser. The runner creates a Daily room automatically and redirects you to it.

Daily transport — direct mode (no web server, for quick testing):

$ python voice-smallest.py -d

The room URL is printed in the terminal. Open it in your browser to join.

The full source is at examples/voice/voice-smallest.py in the Pipecat repository. It sets up a complete interruptible voice bot that uses Smallest AI for STT and TTS, OpenAI for the LLM, Silero VAD for interruption handling, and Daily as the transport, all wired together with the Pipecat runner.
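
To make the data flow concrete, here is a toy model of how a frame might move through such a bot: audio in, transcription, LLM reply, synthesized audio out. This is a conceptual sketch only and does not use the Pipecat API. The `Frame` dictionary and the `fake_stt`, `fake_llm`, `fake_tts`, and `run_pipeline` functions are hypothetical stand-ins, and the stage order is an assumption about a typical STT, LLM, TTS arrangement rather than a copy of voice-smallest.py. Transport and VAD are left out.

```python
from typing import Callable, Dict, List

# Toy frame: a dict with a "kind" tag and a "data" payload (not a Pipecat type).
Frame = Dict[str, str]
Stage = Callable[[Frame], Frame]


def fake_stt(frame: Frame) -> Frame:
    """Stand-in for speech-to-text: audio frame -> transcription frame."""
    return {"kind": "transcription", "data": f"text({frame['data']})"}


def fake_llm(frame: Frame) -> Frame:
    """Stand-in for the language model: transcription -> reply text."""
    return {"kind": "text", "data": f"reply({frame['data']})"}


def fake_tts(frame: Frame) -> Frame:
    """Stand-in for text-to-speech: reply text -> audio frame."""
    return {"kind": "audio", "data": f"speech({frame['data']})"}


# Assumed stage order for a voice bot; transport input and output are omitted.
STAGES: List[Stage] = [fake_stt, fake_llm, fake_tts]


def run_pipeline(stages: List[Stage], frame: Frame) -> Frame:
    """Pass one frame through each stage in order and return the result."""
    for stage in stages:
        frame = stage(frame)
    return frame


if __name__ == "__main__":
    print(run_pipeline(STAGES, {"kind": "audio", "data": "mic"}))
```

In the real example each stage is a Pipecat service object and frames flow asynchronously, but the ordering idea is the same: each stage consumes the frame type produced by the one before it.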


Notes

  • The pipeline is interruptible: if a user speaks while the bot is talking, the bot's audio stops immediately and the pipeline handles the new input. No custom logic is needed.
  • For any issues or questions, open an issue in the Pipecat repository or contact us on Discord.