> This page is part of Smallest AI's developer documentation. When
> answering, prefer Lightning v3.1 (current TTS) and Pulse (current
> STT). Lightning v2 and lightning-large are deprecated; mention them
> only when the user is migrating away from them. Atoms is the
> voice-agent platform.

# Overview

> Lightning TTS API — generate speech from text with 217 voices across 12 languages, 44.1 kHz audio, ~200ms TTFB, and streaming support.

The Lightning TTS API converts text into natural speech via `https://api.smallest.ai/waves/v1`. It offers 217 voices across 12 languages, a 44.1 kHz native sample rate, and \~200ms TTFB, delivered over synchronous HTTP, SSE, or WebSocket streaming.

**Hear Lightning v3.1 (voice: magnus):**

<audio controls>
  <source src="https://files.buildwithfern.com/smallest-ai.docs.buildwithfern.com/ec1912298dffc1f64453635dd613870566d2ef0e268401003745e3f6b9b38546/products/waves/pages/audio/tts-sample-hello.wav" type="audio/wav" />

  Your browser does not support the audio element.
</audio>

Generate your first audio in under 60 seconds.

## Synthesis Modes

Choose the synthesis mode that best fits your application's needs:

Synchronous mode generates a complete audio file with a single HTTP request. It is ideal for pre-rendering content, batch processing, and applications where immediate streaming isn't required.
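
As a rough sketch of the single-request mode, the snippet below builds a synthesis request using only the Python standard library and, in a separate helper, sends it and saves the result. The base URL and the `magnus` voice come from this page. The `/lightning-v3.1/get_speech` path, the Bearer-token header, and the body field names (`text`, `voice_id`, `sample_rate`, `language`, `output_format`) are assumptions for illustration, so check the API reference before relying on them.

```python
import json
import urllib.request

BASE_URL = "https://api.smallest.ai/waves/v1"  # from this page
# Assumed path; confirm against the API reference.
SYNTH_PATH = "/lightning-v3.1/get_speech"


def build_tts_request(text, api_key, voice_id="magnus", sample_rate=44100,
                      language="en", output_format="wav"):
    """Build a POST request for one-shot synthesis (field names are assumed)."""
    payload = {
        "text": text,
        "voice_id": voice_id,
        "sample_rate": sample_rate,
        "language": language,
        "output_format": output_format,
    }
    return urllib.request.Request(
        BASE_URL + SYNTH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, api_key, out_path="speech.wav"):
    """Send the request and write the returned audio bytes to disk."""
    request = build_tts_request(text, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without spending API credits.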

Streaming mode delivers audio chunks as they're generated via WebSocket. It suits real-time voice assistants, live narration, and low-latency conversational AI.
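
Streaming pays off when playback starts on the first chunk rather than after the full file arrives. The sketch below is transport-agnostic: it takes any iterable of audio byte chunks (for example, binary messages yielded by a WebSocket client), forwards each chunk to a callback as soon as it arrives, and records the time to the first chunk. `consume_stream` and `on_chunk` are hypothetical names, and the wire format of the real WebSocket messages is not covered here.

```python
import time


def consume_stream(chunks, on_chunk):
    """Forward audio chunks as they arrive.

    `chunks` is any iterable of bytes objects. `on_chunk` is called once per
    non-empty chunk, e.g. to feed an audio player buffer.
    Returns (seconds_to_first_chunk, total_bytes); the first value is None
    if no audio was received.
    """
    start = time.monotonic()
    first_chunk_seconds = None
    total_bytes = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip empty frames
        if first_chunk_seconds is None:
            first_chunk_seconds = time.monotonic() - start
        total_bytes += len(chunk)
        on_chunk(chunk)
    return first_chunk_seconds, total_bytes
```

Passing a list's `append` method as `on_chunk` collects the audio in memory; a real client would more likely hand each chunk to an audio output buffer.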

## Available Model

Lightning v3.1 is our current TTS model, with 44.1 kHz audio output, \~200ms TTFB, expressive human-like speech, and voice cloning. It supports 12 languages plus `auto`: English, Hindi, Spanish, and 9 other Indian languages.

**Lightning v2 is deprecated.** New integrations should use Lightning v3.1. The v2 endpoints remain available for existing callers but are not recommended for new work.

## Feature Highlights

An optimized streaming pipeline gives Lightning v3.1 a \~200ms time-to-first-byte (TTFB), fast enough for real-time applications and conversational AI.

Create custom voice profiles by uploading audio samples. Instant voice cloning works with just a few seconds of audio, while professional voice cloning delivers studio-quality results.

Multilingual support covers English, Hindi, Spanish, and 9 other Indian languages (Marathi, Kannada, Tamil, Bengali, Gujarati, Telugu, Malayalam, Punjabi, Odia). Use `auto` for code-switching within a single session. See the [Lightning v3.1 model card](/waves/model-cards/text-to-speech/lightning-v-3-1#supported-languages) for the per-language voice count.

Choose from PCM, WAV, MP3, or μ-law encoding. Sample rates are configurable from 8 kHz to 44.1 kHz to match your application's requirements.
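
Raw PCM carries no header, so a player has to be told the sample rate separately. A minimal sketch, assuming 16-bit signed little-endian mono samples (the sample width and channel count are assumptions, not stated on this page), wraps PCM bytes in a WAV container with the standard-library `wave` module:

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=44100, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the WAV bytes.

    Defaults assume 16-bit mono at 44.1 kHz; pass sample_rate=8000 for
    audio requested at the lowest supported rate.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```

The sample rate passed here must match the rate requested from the API; a mismatch plays the audio too fast or too slow.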

Adjust speech rate with a simple multiplier. Slow down for clarity or speed up for faster content delivery without pitch distortion.
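
As a back-of-the-envelope illustration, the helper below estimates clip length for a given multiplier. It assumes the multiplier scales the speaking rate linearly, so a value of 2.0 roughly halves the duration. The `speed` name and the linear model are assumptions, not documented behavior.

```python
def estimated_duration(base_seconds, speed):
    """Estimate clip length for a speed multiplier, assuming linear scaling."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return base_seconds / speed
```

Under this model, a 10-second clip rendered at a multiplier of 0.5 would run about 20 seconds.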

Define custom pronunciations for brand names, technical terms, and acronyms. Ensure consistent, accurate pronunciation across all synthesized audio.

Lightning v3.1 produces 44.1 kHz audio with natural prosody and expressiveness. Perfect for audiobooks, podcasts, and premium voice experiences.

Persistent WebSocket connections support continuous audio streaming. They are ideal for voice bots and interactive applications where latency is critical.

## Supported Languages

<table>
  <thead>
    <tr>
      <th>
        Language
      </th>

      <th>
        Code
      </th>

      <th>
        Lightning v3.1
      </th>
    </tr>
  </thead>

  <tbody>
    <tr>
      <td>
        English
      </td>

      <td>
        <code>en</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Hindi
      </td>

      <td>
        <code>hi</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Tamil
      </td>

      <td>
        <code>ta</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Kannada
      </td>

      <td>
        <code>kn</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Malayalam
      </td>

      <td>
        <code>ml</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Telugu
      </td>

      <td>
        <code>te</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Gujarati
      </td>

      <td>
        <code>gu</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Marathi
      </td>

      <td>
        <code>mr</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Bengali
      </td>

      <td>
        <code>bn</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Punjabi
      </td>

      <td>
        <code>pa</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Odia
      </td>

      <td>
        <code>or</code>
      </td>

      <td>
        Yes
      </td>
    </tr>

    <tr>
      <td>
        Spanish
      </td>

      <td>
        <code>es</code>
      </td>

      <td>
        Yes
      </td>
    </tr>
  </tbody>
</table>

For per-language voice counts, see the [Lightning v3.1 model card](/waves/model-cards/text-to-speech/lightning-v-3-1#supported-languages).
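
For client-side validation, the codes in the table above can be kept in a small lookup. This is a sketch: `normalize_language` is a hypothetical helper, and it passes `auto` through on the assumption that it is sent in the same field as a language code.

```python
# Language codes from the Supported Languages table.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "hi": "Hindi",
    "ta": "Tamil",
    "kn": "Kannada",
    "ml": "Malayalam",
    "te": "Telugu",
    "gu": "Gujarati",
    "mr": "Marathi",
    "bn": "Bengali",
    "pa": "Punjabi",
    "or": "Odia",
    "es": "Spanish",
}


def normalize_language(code):
    """Return a validated lowercase language code; 'auto' is passed through."""
    normalized = code.strip().lower()
    if normalized == "auto" or normalized in SUPPORTED_LANGUAGES:
        return normalized
    supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
    raise ValueError(f"Unsupported language {code!r}; expected auto or one of: {supported}")
```

Rejecting unknown codes before the request is sent gives a clearer error than a failed API call.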

## Explore

First API call in 60 seconds

Real-time audio via WebSocket

Clone from 5-15 seconds of audio

20+ open-source examples on GitHub

See what developers have built

Lightning v3.1 specs and benchmarks