---
title: Overview
description: >-
  Convert text to natural-sounding speech with the Waves TTS API - ultra-low
  latency synthesis with voice cloning and streaming support
icon: waveform-lines
---

The Waves Text-to-Speech (TTS) API converts text into natural, expressive audio via `https://waves-api.smallest.ai/api/v1`. With low latency and support for 16+ languages, it's built for real-time applications like voice assistants, interactive bots, and live content generation.

Get started in minutes. Learn how to get your API key and generate your first audio.

## Synthesis Modes

Choose the synthesis mode that best fits your application's needs:

* **HTTP:** Generate complete audio files with a single HTTP request. Ideal for pre-rendering content, batch processing, and applications where immediate streaming isn't required.
* **WebSocket:** Receive audio chunks as they're generated via WebSocket. Perfect for real-time voice assistants, live narration, and low-latency conversational AI.

## Available Models

* **Lightning v2:** High-quality multilingual TTS with 100 ms TTFB. Supports 16+ languages including English, Hindi, and European languages. Includes voice cloning support.
* **Lightning v3.1:** Our most natural-sounding model with 44 kHz audio output. Ultra-low latency with expressive, human-like speech. Supports English, Hindi, Tamil, and Spanish with voice cloning.

## Feature Highlights

* Optimized streaming pipeline delivers sub-100 ms time-to-first-byte (TTFB) for real-time applications. Lightning v3.1 achieves even faster response times for conversational AI.
* Create custom voice profiles by uploading audio samples. Instant voice cloning works with just a few seconds of audio, while professional voice cloning delivers studio-quality results.
* Comprehensive language support including English, Hindi, Tamil, Kannada, Malayalam, Telugu, Gujarati, Bengali, Marathi, German, French, Spanish, Italian, Polish, Dutch, and Russian.
* Choose from PCM, WAV, MP3, or μ-law encoding. Configurable sample rates from 8 kHz to 44 kHz to match your application's requirements.
* Adjust speech rate with a simple multiplier. Slow down for clarity or speed up for faster content delivery without pitch distortion.
* Define custom pronunciations for brand names, technical terms, and acronyms. Ensure consistent, accurate pronunciation across all synthesized audio.
* Lightning v3.1 produces 44 kHz audio with natural prosody and expressiveness. Perfect for audiobooks, podcasts, and premium voice experiences.
* Persistent connections for continuous audio streaming. Ideal for voice bots and interactive applications where latency is critical.

## Supported Languages

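| Language  | Code | Lightning v2 | Lightning v3.1 |
| --------- | ---- | ------------ | -------------- |
| English   | en   | Yes          | Yes            |
| Hindi     | hi   | Yes          | Yes            |
| Tamil     | ta   | Yes          | Yes            |
| Kannada   | kn   | Yes          | No             |
| Malayalam | ml   | Yes          | No             |
| Telugu    | te   | Yes          | No             |
| Gujarati  | gu   | Yes          | No             |
| Bengali   | bn   | Yes          | No             |
| Marathi   | mr   | Yes          | No             |
| German    | de   | Yes          | No             |
| French    | fr   | Yes          | No             |
| Spanish   | es   | Yes          | Yes            |
| Italian   | it   | Yes          | No             |
| Polish    | pl   | Yes          | No             |
| Dutch     | nl   | Yes          | No             |
| Russian   | ru   | Yes          | No             |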
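
If you want to choose a model per language on the client side, the language table can be encoded as a small lookup. The following is a minimal sketch, not part of the Waves API: the model identifier strings and the helper names `models_for_language` and `pick_model` are hypothetical, and the snippet makes no network calls.

```python
# Hypothetical identifiers for the two models; check the API reference
# for the exact strings the service expects.
V2 = "lightning-v2"
V3_1 = "lightning-v3.1"

# Language code -> models that support it, transcribed from the table.
SUPPORTED = {
    "en": (V2, V3_1),
    "hi": (V2, V3_1),
    "ta": (V2, V3_1),
    "kn": (V2,),
    "ml": (V2,),
    "te": (V2,),
    "gu": (V2,),
    "bn": (V2,),
    "mr": (V2,),
    "de": (V2,),
    "fr": (V2,),
    "es": (V2, V3_1),
    "it": (V2,),
    "pl": (V2,),
    "nl": (V2,),
    "ru": (V2,),
}


def models_for_language(code: str) -> tuple:
    """Return the models that list the given language code, or () if none do."""
    return SUPPORTED.get(code.lower(), ())


def pick_model(code: str, prefer: str = V3_1) -> str:
    """Return the preferred model if it supports the language, else the first that does."""
    models = models_for_language(code)
    if not models:
        raise ValueError(f"No model lists language code {code!r}")
    return prefer if prefer in models else models[0]
```

With this mapping, `pick_model("es")` selects Lightning v3.1, while `pick_model("de")` falls back to Lightning v2 because German is listed only for that model.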

## Next Steps

* Generate your first audio in the [TTS Quickstart](/waves/documentation/text-to-speech/quickstart)
* Learn synchronous and async synthesis in [How to use TTS](/waves/documentation/text-to-speech/how-to-tts)
* Set up real-time streaming with [WebSocket TTS](/waves/documentation/text-to-speech/stream-tts)
* Clone a voice with our [Voice Cloning guide](/waves/documentation/voice-cloning/how-to-voice-clone)
* Add custom pronunciations with [Pronunciation Dictionaries](/waves/documentation/text-to-speech/pronunciation-dictionaries)