Lightning v3.1

June 15, 2026

June 5, 2026

June 1, 2026

May 19, 2026

May 19, 2026

May 15, 2026

May 14, 2026

May 8, 2026

May 2, 2026

May 2, 2026

Lightning TTS WebSocket — documented ?timeout=N connection-timeout knob

The Lightning TTS WebSocket (Stream Speech (WebSocket)) API reference now documents the ?timeout=N query parameter. The parameter has always been available on the endpoint but was missing from the developer-facing docs.

What changed

This is purely a docs change: nothing changed in the protocol or on the wire. The new section in the API reference clarifies:

  • The default idle timeout on WSS /waves/v1/tts/live is 60 seconds, not the 20 seconds the legacy “WebSocket Support for TTS” page used to claim.
  • Override the value with ?timeout=N on the connection URL (positive integer seconds, e.g. wss://api.smallest.ai/waves/v1/tts/live?timeout=120).
  • Custom values are honored verbatim, including small ones (?timeout=5 closes after 5 seconds of silence) and large ones (verified up to ?timeout=999).
  • The timeout resets on every message you send (binary audio in, JSON control in), so keep-alive traffic restarts the clock.
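
As a minimal sketch of the override described above, the helper below builds the connection URL with an optional timeout. The function name buildTtsLiveUrl is hypothetical, and authentication is left out because this entry does not cover it; only the host, path, and ?timeout=N parameter come from the notes above.

```javascript
// Hypothetical helper: builds the Lightning TTS WebSocket URL with an
// optional idle timeout. Host and path are taken from the example above.
const TTS_LIVE_URL = "wss://api.smallest.ai/waves/v1/tts/live";

function buildTtsLiveUrl(timeoutSec) {
  const url = new URL(TTS_LIVE_URL);
  if (timeoutSec !== undefined) {
    // The docs describe N as positive integer seconds, so reject anything else.
    if (!Number.isInteger(timeoutSec) || timeoutSec <= 0) {
      throw new RangeError("timeout must be a positive integer number of seconds");
    }
    url.searchParams.set("timeout", String(timeoutSec));
  }
  return url.toString();
}

console.log(buildTtsLiveUrl(120));
// prints "wss://api.smallest.ai/waves/v1/tts/live?timeout=120"
```

Calling buildTtsLiveUrl() with no argument returns the bare URL, which leaves the server's 60-second default in effect.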

Why this matters

Voice agents with long human-thinking windows, agentic pipelines that round-trip to an LLM between TTS bursts, and any workflow with extended natural pauses now have a documented way to keep the WebSocket open past 60 seconds without resorting to dummy keep-alive frames.

Cleanup

The standalone /waves/api-reference/api-references/web-socket page (which previously held this info, with a stale 20-second default and a reference to the deprecated /waves/v1/lightning-v3.1/get_speech/stream URL) has been removed. A redirect from the old URL points at the new home.

Lightning v3.1 — auto language removed from docs and spec enums

The language: "auto" value is no longer documented or listed in the Lightning v3.1 or Lightning v3.1 Pro spec enums. Pass an explicit language code that matches the voice instead.

Why: code-switching guidance lives on the voice, not the request. Each voice in the catalog has a tags.language set returned by GET /waves/v1/lightning-v3.1/get_voices; pass a language the voice was trained on to get the pronunciation you expect. The auto value never actually drove language detection at the model level — it was a permissive enum value that resolved to the voice’s default behavior — so removing it from the contract is the honest move.

What changed:

  • Lightning v3.1 + Pro OpenAPI/AsyncAPI language enum no longer lists auto.
  • Default value flipped from "auto" to "en" across the four specs (tts-openapi, lightning-v3.1-openapi, tts-ws, lightning-v3.1-ws).
  • “Automatic Language Detection & Code-Switching” Tip removed from the v3.1 model card; the Auto-detect table row removed from Supported Languages.
  • Pro model card: Languages row simplified to English (en), Hindi (hi); the code-switching cell now points to tags.language rather than auto.
  • All guides (quickstart, how-to-tts, stream-tts, overview, eval script) drop auto from their language parameter rows.
  • Voice-cloning spec prose updated: the same recommendation (match the reference audio’s language to the TTS request’s language) stands on its own.

Migration: if you were sending language: "auto", replace it with the language code that matches your voice (en, hi, ta, etc. — see tags.language on the voice via GET /waves/v1/lightning-v3.1/get_voices). Sending auto was not driving language detection in the first place; switching to an explicit code makes the output predictable.
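
One way to apply this migration client-side is sketched below. It assumes a voice object shaped like { voiceId, tags: { language: [...] } } with language codes such as "en" or "hi" in tags.language; only the tags.language field is named in this changelog, so the rest of the shape and the function name resolveLanguage are illustrative. If the catalog returns language names instead of codes, a mapping step would be needed.

```javascript
// Assumed voice shape: { voiceId: string, tags: { language: string[] } }.
// Only tags.language is named in the changelog; everything else is illustrative.
function resolveLanguage(requested, voice, fallback = "en") {
  const supported = (voice.tags && voice.tags.language) || [];
  if (requested && requested !== "auto") {
    return requested; // already an explicit code, leave it untouched
  }
  // Replace the retired "auto" (or a missing value) with a language the voice lists.
  return supported.length > 0 ? supported[0] : fallback;
}

const exampleVoice = { voiceId: "example-voice", tags: { language: ["hi", "en"] } };
console.log(resolveLanguage("auto", exampleVoice)); // prints "hi"
```

The fallback of "en" mirrors the new spec default; picking the first listed language is a simplification, and an application may prefer to choose based on the text being spoken.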

Lightning v3.1 — per-word timestamps on WebSocket streaming

Lightning v3.1 now exposes per-word timing events to WebSocket clients. Opt in with one flag — useful for captioning UIs, karaoke-style word highlighting, avatar lip-sync, and word-level analytics.

What changed

Two changes to a WebSocket request: add word_timestamps: true and handle the new status: "word_timestamp" frame.

ws.send(JSON.stringify({
  text: "I bought 3 cats for $100 on Dec 25th",
  voice_id: "meher",
  model: "lightning_v3.1_pro",
  sample_rate: 44100,
  output_format: "pcm",
  word_timestamps: true, // ← ADDED
}));

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  switch (msg.status) {
    case "chunk":
      // Audio arrives base64-encoded; decode before playback.
      audioPlayer.push(Buffer.from(msg.data.audio, "base64"));
      break;
    case "word_timestamp": { // ← NEW CASE
      const { id, word, start, end } = msg.data;
      captionTrack.push({ id, word, startSec: start, endSec: end });
      break;
    }
    case "complete":
      audioPlayer.end();
      break;
  }
};

word is the exact substring from the input text — un-normalized. "$100" stays "$100", "25th" stays "25th", "3" stays "3". Non-Latin scripts come back verbatim (e.g., Devanagari for Hindi).

start and end are floats in seconds, relative to the start of the audio stream. Frames interleave with chunk in audio-time order, then a single complete terminates the session.
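
For captioning or karaoke-style highlighting, the collected entries can be queried by playback position. The sketch below is one possible approach, not part of the API: activeWordAt is a hypothetical helper that binary-searches a captionTrack array of { id, word, startSec, endSec } entries, assuming they are sorted by start time and do not overlap. The timing values in the sample track are made up for illustration.

```javascript
// Returns the caption entry being spoken at playback time `t` (seconds),
// or null during silence. Assumes entries are sorted and non-overlapping.
function activeWordAt(captionTrack, t) {
  let lo = 0;
  let hi = captionTrack.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const entry = captionTrack[mid];
    if (t < entry.startSec) hi = mid - 1;       // word starts later: search left half
    else if (t >= entry.endSec) lo = mid + 1;   // word already ended: search right half
    else return entry;                          // startSec <= t < endSec
  }
  return null;
}

// Illustrative timings only; real values come from word_timestamp frames.
const sampleTrack = [
  { id: 0, word: "I", startSec: 0.0, endSec: 0.2 },
  { id: 1, word: "bought", startSec: 0.25, endSec: 0.6 },
  { id: 2, word: "3", startSec: 0.6, endSec: 0.9 },
];
console.log(activeWordAt(sampleTrack, 0.3).word); // prints "bought"
```

Because frames arrive in audio-time order, appending them as they come keeps the array sorted without an extra sort step.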

Where it works

| Surface | Word timestamps |
| --- | --- |
| WSS /waves/v1/tts/live (unified) | ✅ |
| WSS /waves/v1/lightning-v3.1/get_speech/stream (legacy, retiring 2026-07-14) | ✅ |
| POST /waves/v1/tts (sync HTTP) | ❌ (flag accepted, silently ignored) |
| POST /waves/v1/tts/live (HTTP SSE) | ❌ (same) |

Voice + language support

| Language | Voice family | Word events |
| --- | --- | --- |
| English (en) | Base-queue voices: meher, devansh, kartik, maithili, liam, avery | ✅ |
| Hindi (hi) | Base-queue voices (same list) | ✅ |
| Marathi / Bengali / Gujarati / Punjabi / Odia | north-Indic family | ❌ |
| Tamil / Telugu / Kannada / Malayalam | south-Indic family | ❌ |

For unsupported voice families the flag is accepted — audio works normally, but no word_timestamp frames are emitted. Detect this client-side by counting received word events after complete arrives.
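
The counting approach described above could look roughly like this. createWordEventMonitor is a hypothetical helper: it takes already-parsed JSON frames and reports once complete arrives. A count of zero is only a hint, since it would also occur if the flag was never set on the request.

```javascript
// Tracks one synthesis request at a time. Feed it each parsed JSON frame;
// on "complete" it reports how many word_timestamp frames arrived.
function createWordEventMonitor(onResult) {
  let wordEvents = 0;
  return function handleFrame(msg) {
    if (msg.status === "word_timestamp") {
      wordEvents += 1;
    } else if (msg.status === "complete") {
      onResult({ wordEvents, timestampsSupported: wordEvents > 0 });
      wordEvents = 0; // reset so the monitor can be reused for the next request
    }
    // "chunk" frames carry audio only and do not affect the count.
  };
}

const handleFrame = createWordEventMonitor((result) => {
  if (!result.timestampsSupported) {
    console.warn("No word events received; hide the caption track for this voice.");
  }
});
handleFrame({ status: "chunk", data: { audio: "" } });
handleFrame({ status: "complete" });
```

In a real client, handleFrame would be called from ws.onmessage after JSON.parse(event.data).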

Backward compatibility

word_timestamps defaults to false. Clients that don’t set the flag see no behavior change — same audio chunks, same completion frame, no new event type to handle. Purely opt-in.

Migration: none — pure addition. Existing integrations keep working untouched.

→ Word-level timestamps on the Lightning v3.1 model card — full wire spec, JS example, support matrix.

Lightning v2 and Lightning Large endpoints retired — 410 Gone with v3.1 migration pointer

The underlying inference pools for Lightning v2 and Lightning Large have been retired. Calls to these endpoints now return a fast 410 Gone with a migration pointer to Lightning v3.1.

| Endpoint | Before | After |
| --- | --- | --- |
| POST /waves/v1/lightning-v2/* | 5xx after timeout | 410 MODEL_DEPRECATED → use lightning-v3.1 |
| POST /waves/v1/lightning-large/* | 5xx after timeout | 410 MODEL_DEPRECATED → use lightning-v3.1 |
| POST /waves/v1/prof-voice-cloning/* | 5xx after timeout | 410 MODEL_DEPRECATED → use /waves/v1/voice-cloning (v3.1) |
| POST /waves/v1/voice-cloning (model=v2) | served via lightning-large | 400: Voice cloning for lightning-v2 is deprecated; use lightning-v3.1 |
| POST /waves/v1/voice-cloning (no model) | defaulted to v2 → dead queue | defaults to v3.1, served |
| POST /waves/v1/voice-cloning (model=v3.1) | served | served (no change) |

Response shape on the deprecated endpoints:

{
  "status": "error",
  "error_code": "MODEL_DEPRECATED",
  "message": "This model is retired. Please migrate to lightning-v3.1 via /waves/v1/lightning-v3.1/get_speech.",
  "recommended_endpoint": "/waves/v1/lightning-v3.1/get_speech"
}

Migration: replace any lightning-v2 or lightning-large calls with the equivalent lightning-v3.1 endpoint. Voice cloning with no model parameter now routes to v3.1 automatically — no client change needed for that case.

Not affected (so v3.1 voice cloning keeps working): lightning-v3.1/get_speech, voice-cloning clone-creation with v3.1, and the unified /tts and /tts/live routes.
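
Clients that may still hit a retired endpoint can detect the new response and surface the migration pointer. The sketch below assumes the HTTP status and the parsed JSON body are already in hand; deprecationTarget is a hypothetical helper, and the field names are taken from the response shape shown above.

```javascript
// Inspects an HTTP status and parsed JSON body. Returns the recommended
// endpoint when the server reports a retired model, otherwise null.
function deprecationTarget(httpStatus, body) {
  if (httpStatus !== 410 || !body || body.error_code !== "MODEL_DEPRECATED") {
    return null;
  }
  return body.recommended_endpoint || null;
}

const retiredBody = {
  status: "error",
  error_code: "MODEL_DEPRECATED",
  message: "This model is retired. Please migrate to lightning-v3.1 via /waves/v1/lightning-v3.1/get_speech.",
  recommended_endpoint: "/waves/v1/lightning-v3.1/get_speech",
};
console.log(deprecationTarget(410, retiredBody));
// prints "/waves/v1/lightning-v3.1/get_speech"
```

Logging the returned path rather than retrying against it automatically is probably the safer choice, since request parameters may differ between models.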

Lightning v3.1 Pro — 35 new voices added to the catalog

35 new voices have been added to the Lightning v3.1 Pro voice catalog. They’re available immediately to any org with Lightning v3.1 Pro access.

curl -X POST "https://api.smallest.ai/waves/v1/tts" \
  -H "Authorization: Bearer $SMALLEST_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: audio/wav" \
  -d '{"text": "Hello.", "voice_id": "<new-voice-id>", "model": "lightning_v3.1_pro", "sample_rate": 24000}' \
  --output hello.wav

What’s in this batch:

  • All 35 voices are routed through the standard Lightning v3.1 pipeline.
  • They’re also auto-promoted to the multimodel registry, so they remain available if you switch your model parameter later.
  • The full updated voice list is available via the Get Voices endpoint — filter to model=lightning_v3.1_pro to see only the Pro catalog.

Migration: no action — additive change, existing voices unchanged.

Lightning v3.1 Pro — premium voice catalog across American, British, and Indian accents

Lightning v3.1 now has a Pro tier with 39 curated voices across American, British, and Indian accents (both Male and Female). The Pro pool runs on dedicated inference capacity, delivering the same TTFB as standard Lightning v3.1.

What’s in the catalog:

  • Indian — Female (8): Rhea, Zariya, Kareena, Mishka, Inaaya, Saira, Meher, Aarini
  • Indian — Male (5): Aviraj, Vyom, Zoravar, Reyansh, Ahan
  • British — Female (6): Cressida, Elowen, Ottilie, Seraphina, Tabitha, Arabella
  • British — Male (7): Benedict, Cormac, Everett, Finley, Rupert, Winston, Caspian
  • American — Female (7): Willow, Autumn, Skylar, Savannah, Kennedy, Reagan, Sierra
  • American — Male (6): Maverick, Brooks, Hunter, Colton, Wesley, Asher

Languages supported: Indian voices speak English and Hindi (with native code-switching when language="auto"). British and American voices speak English. See per-voice tags.language via GET /waves/v1/lightning-v3.1/get_voices.

How to use it:

  • Atoms voice agents: open the agent’s voice picker and select the new Pro filter chip, then pick any Pro voice. Atoms transparently routes to the Pro pool — no other configuration needed.
  • API (direct): use the unified POST /waves/v1/tts (sync), POST /waves/v1/tts/live (SSE), or WSS /waves/v1/tts/live (WebSocket) endpoints and pass "model": "lightning_v3.1_pro" in the request body alongside the chosen voice_id. The legacy /waves/v1/lightning-v3.1/* routes also accept the model field for backwards-compatible Pro opt-in.
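
For the direct API route, a request against the unified sync endpoint might be assembled as in the sketch below. buildProTtsRequest is a hypothetical helper; the body fields and Bearer header mirror examples elsewhere in this changelog, the Content-Type header is an assumption for a JSON body, and the voice_id value is only a placeholder taken from the word-timestamps example.

```javascript
// Builds the URL and fetch() options for a Lightning v3.1 Pro request.
// Nothing is sent here; pass the result to fetch(url, init) to make the call.
function buildProTtsRequest(text, voiceId, apiKey) {
  return {
    url: "https://api.smallest.ai/waves/v1/tts",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json", // assumed; standard for JSON bodies
        Accept: "audio/wav",
      },
      body: JSON.stringify({
        text,
        voice_id: voiceId,
        model: "lightning_v3.1_pro", // opts the request into the Pro pool
        sample_rate: 24000,
      }),
    },
  };
}

const { url, init } = buildProTtsRequest("Hello.", "meher", "YOUR_API_KEY");
console.log(url, JSON.parse(init.body).model);
```

Keeping the builder separate from the network call makes it easy to reuse the same body for the SSE and WebSocket variants.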

Voice cloning: not available on Lightning v3.1 Pro. Voice clones continue to use Lightning v3.1 (standard) and the existing voice-cloning flow. There is no migration required.

For the full catalog, integration examples, and a Python WebSocket sample, see the Lightning v3.1 Pro model card.

Indic voices now produce clean audio regardless of the language field

Lightning v3.1 used to pick an inference pool from the request’s language field, which meant Indic voices (Aadya, Yuvan, Samarth, Nilesh, Arnab, Niharika, Gargi, and other voices whose latents are trained on north_indic / south_indic encoders) could be served from the wrong pool when called with language=en or language=hi — producing distorted or unintelligible audio.

Routing now derives from the voice itself, not the request language:

  • Voices tagged odia / bengali / punjabi / gujarati / marathi route to the north_indic inference pool
  • Voices tagged kannada / malayalam / telugu / tamil route to the south_indic pool
  • All other voices continue to route by language as before
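
The rule in the list above can be expressed as a small lookup. This is an illustrative sketch of the described behavior, not the platform's implementation: pickPool is a hypothetical name, the tag strings are the ones listed above, and the "by-language:" return value is a placeholder for the pre-existing language-based routing, whose pool names are not given here.

```javascript
const NORTH_INDIC = new Set(["odia", "bengali", "punjabi", "gujarati", "marathi"]);
const SOUTH_INDIC = new Set(["kannada", "malayalam", "telugu", "tamil"]);

// voiceTags: language tags on the voice; requestLanguage: the request's language field.
function pickPool(voiceTags, requestLanguage) {
  // The voice's own tags win over whatever language the request carries.
  if (voiceTags.some((tag) => NORTH_INDIC.has(tag))) return "north_indic";
  if (voiceTags.some((tag) => SOUTH_INDIC.has(tag))) return "south_indic";
  // Placeholder for the unchanged language-based routing.
  return `by-language:${requestLanguage}`;
}

console.log(pickPool(["marathi"], "en")); // prints "north_indic"
```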

No code change needed. If you had previously worked around this by hard-coding language to match the voice family, you can remove that workaround — the platform now picks the correct pool automatically.

Voice clones are unaffected — clones bypass this lookup since they aren’t in the public voice catalog.

Lightning v3.1 — language list corrected to 12 (voice catalog source of truth)

Correction. Earlier in the week we expanded the Lightning v3.1 documented language list to 22 codes plus auto, sourced from the server-side lightningV3_1Schema enum in waves-platform. Live testing showed those 22 codes are accepted by the schema but only 12 of them have voices in the catalog — the other 10 (de, fr, it, pl, nl, ru, sv, pt, ar, he) silently fall back to the voice’s default language when called. We were lying to users.

Source of truth is now the voice catalog, not the schema enum. The actually-supported set is:

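| Code | Language | Voices |
| --- | --- | --- |
| en | English | 176 |
| hi | Hindi | 115 |
| ta | Tamil | 13 |
| es | Spanish | 11 |
| kn | Kannada | 10 |
| mr | Marathi | 9 |
| te | Telugu | 8 |
| or | Odia | 8 |
| pa | Punjabi | 8 |
| ml | Malayalam | 6 |
| gu | Gujarati | 5 |
| bn | Bengali | 4 |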

Plus auto for automatic language detection and code-switching across the above set. 217 voices total.

What changed:

  • Lightning v3.1 OpenAPI + AsyncAPI language enum narrowed to 12 codes + auto.
  • Model card, getting-started/models, text-to-speech overview, api-references/lightning-v3.1, integrations (LiveKit, Vercel AI SDK, JellyPod) all updated.
  • Removed the bogus Beta rows for German, French, Italian, Polish, Dutch, Russian, Swedish, Portuguese, Arabic, Hebrew.
  • Voice count corrected from “169 voices total” to 217.

Pulse STT (separate model) is unaffected — Pulse genuinely supports its full European + Indic + Asian language set via the multi-eu, multi-indic, multi-asian regional aggregators.

Reproducible verification. A live probe (scripts/spec-live-tests/spec_enum_vs_voice_catalog.py) now compares the spec’s language enum against the live GET /lightning-v3.1/get_voices response and fails CI if they drift. This catches the schema-vs-reality gap on every spec PR going forward.
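
The core of such a drift check is a set comparison. The actual probe is a Python script whose contents are not shown here, so the JavaScript below is only an illustrative sketch of the idea: findEnumDrift is a hypothetical name, and both inputs are assumed to be plain arrays of language codes already extracted from the spec and the catalog response.

```javascript
// Compares the spec's language enum with the languages present in the voice
// catalog. "auto" is ignored by default because it is not a catalog language.
function findEnumDrift(specEnum, catalogLanguages, ignore = ["auto"]) {
  const spec = new Set(specEnum.filter((code) => !ignore.includes(code)));
  const catalog = new Set(catalogLanguages);
  return {
    inSpecOnly: [...spec].filter((code) => !catalog.has(code)).sort(),
    inCatalogOnly: [...catalog].filter((code) => !spec.has(code)).sort(),
  };
}

const supported = ["en", "hi", "ta", "es", "kn", "mr", "te", "or", "pa", "ml", "gu", "bn"];
const drift = findEnumDrift([...supported, "de", "fr", "auto"], supported);
console.log(drift.inSpecOnly); // prints [ 'de', 'fr' ]
```

A CI job could fail whenever either array is non-empty.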

Legacy Lightning STT/TTS API reference orphans removed from docs

Several legacy API reference MDX files that had already been unlinked from the v4 API reference navigation have been removed from the docs.

STT (Speech-to-Text):

  • The legacy Lightning (Pre-Recorded) HTTP reference (POST /waves/v1/lightning/get_text) and its OpenAPI spec (fern/apis/waves/openapi/asr-openapi.yaml). The current STT pre-recorded surface is Pulse (POST /waves/v1/pulse/get_text) — see Pulse pre-recorded reference. The legacy MDX page was a verbatim copy of the Pulse one with the URL substituted, so no functionality is lost.
  • (PR #110 separately removes the matching Lightning ASR WebSocket reference and its AsyncAPI spec.)

TTS (Text-to-Speech):

  • The legacy lightning-large HTTP TTS, SSE, and WebSocket reference pages (lightning-large.mdx, lightning-large-stream.mdx, lightning-large-ws.mdx). These were already unlinked from v4 nav. The current TTS surface is lightning-v3.1 — the migration prose under Voice Cloning already documents the cutover.

Voice cloning impact: none. The current voice-cloning flow (POST /waves/v1/voice-cloning) is unchanged. The deprecated lightning-large endpoints that are still in use (add_voice, get_cloned_voices, DELETE /waves/v1/lightning-large) remain in the API reference under Voice Cloning with their existing (Deprecated) labels.

If your code calls https://api.smallest.ai/waves/v1/lightning/get_text for STT or https://api.smallest.ai/waves/v1/lightning-large/get_speech for TTS, switch to Pulse and Lightning v3.1 respectively.
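
For codebases with the old URLs scattered across configuration, a small rewrite helper can make the switch mechanical. The sketch below is a hypothetical helper covering only the two paths named above, with replacement paths taken from elsewhere in this changelog. It rewrites the URL only, on the assumption that request bodies are reviewed separately against the current references.

```javascript
// Maps retired endpoint paths to their current equivalents.
const LEGACY_TO_CURRENT = new Map([
  ["/waves/v1/lightning/get_text", "/waves/v1/pulse/get_text"],
  ["/waves/v1/lightning-large/get_speech", "/waves/v1/lightning-v3.1/get_speech"],
]);

// Returns the URL with its path swapped if it is a known legacy endpoint;
// any other URL is returned unchanged.
function migrateEndpoint(urlString) {
  const url = new URL(urlString);
  const replacement = LEGACY_TO_CURRENT.get(url.pathname);
  if (replacement) url.pathname = replacement;
  return url.toString();
}

console.log(migrateEndpoint("https://api.smallest.ai/waves/v1/lightning/get_text"));
// prints "https://api.smallest.ai/waves/v1/pulse/get_text"
```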

→ Pulse pre-recorded reference → Lightning v3.1 reference

Lightning v2 marked as deprecated across the docs

Lightning v2 is a legacy model. New integrations should use Lightning v3.1. The Lightning v2 endpoints remain available for existing callers but are not recommended for new work, and the docs have been updated to reflect that:

  • API reference nav — the three Lightning v2 entries (POST /waves/v1/lightning-v2/get_speech, POST /waves/v1/lightning-v2/stream, WSS /waves/v1/lightning-v2/get_speech/stream) now carry a (Deprecated) suffix in their nav titles.
  • API reference pages — lightning-v2.mdx, lightning-v2-stream.mdx, and lightning-v2-ws.mdx (and their versions/v4.0.0 mirrors) now lead with a yellow Deprecated badge.
  • TTS overview (text-to-speech/overview.mdx) — the “Available Models” CardGroup is now Lightning v3.1 only, with a deprecation notice for v2. The “Supported Languages” comparison table is now v3.1-only.
  • Models index (getting-started/models.mdx) — Lightning v2 card removed from the TTS section. Model overview table reduced to Lightning v3.1.
  • Integrations — the LiveKit and Vercel AI SDK pages no longer recommend lightning-v2 as a fallback or list it alongside lightning-v3.1. The LiveKit page also drops the consistency, similarity, and enhancement parameter rows that were v2-only.

Unchanged (intentional):

  • The Voice Cloning page (voice-cloning/how-to-vc-api.mdx) still references lightning-v2 in its deprecation-error rows — that’s a factual API behavior callers will see if they pass model=lightning-v2 to the cloning endpoint, and is useful for the migration audience.
  • The on-prem Docker pages still mention a lightning-v2 container — that’s the on-prem service name, separate from public API guidance.
  • The historical “Introducing Lightning v2” announcement in the changelog stays intact as part of the project history.

If you’re calling any lightning-v2 endpoint, plan a migration to lightning-v3.1. The voice catalog is different — use GET /waves/v1/{model}/get_voices to enumerate.

→ Lightning v3.1 model card → Lightning TTS overview