For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
DocumentationAPI ReferenceSelf HostModel CardsClient LibrariesIntegrationsDeveloper ToolsChangelog
DocumentationAPI ReferenceSelf HostModel CardsClient LibrariesIntegrationsDeveloper ToolsChangelog
  • Getting Started
    • Introduction
    • Models
    • Authentication
  • Text to Speech (Lightning)
    • Quickstart
    • Overview
    • Sync & Async
    • Streaming
    • Pronunciation Dictionaries
    • Voices & Languages
    • HTTP vs Streaming vs WebSockets
  • Speech to Text (Pulse)
    • Quickstart
    • Overview
  • LLM (Electron)
    • Quickstart
    • Overview
    • Chat Completions
    • Streaming
    • Tool / Function Calling
    • Prefix Caching
    • Supported Parameters
    • Migrate from OpenAI
    • Best Practices
  • Cookbooks
    • Speech to Text
    • Text to Speech
    • Voice Agent (Electron + Pulse + Lightning)
  • Voice Cloning
    • Instant Clone (UI)
    • Instant Clone (API)
    • Instant Clone (Python SDK)
    • Delete Cloned Voice
  • Best Practices
    • Voice Cloning Best Practices
    • TTS Best Practices
  • Troubleshooting
    • Error reference
LogoLogo
Voice AgentsModels
Voice AgentsModels
On this page
  • Text to Speech (TTS) Models
  • Speech to Text (STT) Models
  • LLM Models
  • Geo-location Based Routing
  • Model Overview (TTS)
  • Model Overview (LLM)
  • Model Overview (STT)
  • Pricing
Getting Started

Models

||View as Markdown|
Was this page helpful?
Previous

Introduction

Next

Authentication

Built with

Text to Speech (TTS) Models

Lightning v3.1 Pro

Latest Release Premium 44 kHz pool with improved naturalness and a curated voice catalog across American, British, and Indian accents. English + Hindi with code-switching. Same latency profile as standard Lightning v3.1; select via "model": "lightning_v3.1_pro" on the unified TTS routes.

Lightning v3.1

A 44 kHz model delivering natural, expressive, and realistic speech. Supports voice cloning with ultra-low latency. 12 languages plus auto-detect with mid-sentence code-switching.

Lightning v2 is deprecated. New integrations should use Lightning v3.1 or Lightning v3.1 Pro. The v2 endpoints remain available for existing callers but are not recommended for new work.

Speech to Text (STT) Models

Pulse STT

Low-latency speech recognition for real-time and pre-recorded transcription. Automatic language detection across 38 languages.

LLM Models

Electron

Smallest AI’s in-house language model. OpenAI-compatible chat completions, <300 ms TTFT, 70 languages with first-class Indic support, voice-agent-optimized tool calling, and automatic prefix caching with a 75% discount on cached input. Select via "model": "electron" on POST /waves/v1/chat/completions.

Click on a model name to view its detailed model card.

Geo-location Based Routing

Waves intelligently routes every request to the nearest server cluster to ensure the lowest possible latency for your applications. We currently operate server clusters in:

  • India (Mumbai)
  • USA (Oregon)

Our routing system automatically detects the client’s geographical location and connects them to the optimal server based on network proximity and latency. This process is fully automated, no manual configuration is required on your side.

Model Overview (TTS)

Model IDDescriptionLanguages Supported
lightning-v3.1-pro Latest Release44 kHz premium pool, improved naturalness, curated American / British / Indian voice catalog. Selected via "model": "lightning_v3.1_pro" on the unified TTS routes.English
Hindi
auto
lightning-v3.144 kHz model, natural expressive speech, ultra-low latency, supports voice cloning.English
Hindi
Marathi
Kannada
Tamil
Bengali
Gujarati
Telugu
Malayalam
Punjabi
Odia
Spanish
auto

Model Overview (LLM)

Model IDDescriptionLanguages Supported
electronOpenAI-compatible chat completions. Sub-300 ms TTFT, 32K context, automatic prefix caching, voice-agent-optimized tool calling. Endpoint: POST /waves/v1/chat/completions.70 languages across Western Europe, Indic, Central/Eastern Europe, Baltic, Nordic, Middle East, East Asia, Southeast Asia, South Asia, Central Asia, and Africa. See the Electron model card for the full list.

Model Overview (STT)

Model IDDescriptionLanguages Supported
pulseLow-latency speech-to-text model supporting automatic language detection and real-time transcription.Italian
Spanish
English
Portuguese
Hindi
German
French
Ukrainian
Russian
Kannada
Malayalam
Polish
Marathi
Gujarati
Czech
Slovak
Telugu
Oriya (Odia)
Dutch
Bengali
Latvian
Estonian
Romanian
Punjabi
Finnish
Swedish
Bulgarian
Tamil
Hungarian
Danish
Lithuanian
Maltese
Japanese
Korean
Chinese
Malay
Indonesian
Tagalog

Note: The API uses ISO 639-1 language codes - Set 1 (2-letter codes) to specify supported languages.

Pricing

Our pricing model is designed to be flexible and scalable, catering to different usage needs. For detailed pricing information, please visit our pricing page or contact our sales team at support@smallest.ai.