Lightning v3.1 — language list corrected to 12 (voice catalog source of truth)

Correction. Earlier in the week we expanded the Lightning v3.1 documented language list to 22 codes plus auto, sourced from the server-side lightningV3_1Schema enum in waves-platform. Live testing showed that the schema accepts all 22 codes, but only 12 of them have voices in the catalog. Requests using the other 10 (de, fr, it, pl, nl, ru, sv, pt, ar, he) are not rejected; they silently fall back to the voice’s default language. The docs were advertising support that did not exist.

Source of truth is now the voice catalog, not the schema enum. The actually-supported set is:

Code  Language   Voices
en    English    176
hi    Hindi      115
ta    Tamil      13
es    Spanish    11
kn    Kannada    10
mr    Marathi    9
te    Telugu     8
or    Odia       8
pa    Punjabi    8
ml    Malayalam  6
gu    Gujarati   5
bn    Bengali    4

Plus auto for automatic language detection and code-switching across the above set. 217 voices total.
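
Because unsupported codes fall back silently rather than returning an error, callers may want to reject them before sending a request. The following is a minimal client-side sketch, not part of any official SDK: the names SUPPORTED_LANGUAGES and validate_language are hypothetical, and the set is copied by hand from the list above, so it would need updating whenever the voice catalog changes.

```python
# Hypothetical client-side guard; not part of any official SDK.
# The codes below are copied from the supported-language list above.
SUPPORTED_LANGUAGES = frozenset(
    {"en", "hi", "ta", "es", "kn", "mr", "te", "or", "pa", "ml", "gu", "bn"}
)
AUTO = "auto"  # automatic detection and code-switching


def validate_language(code: str) -> str:
    """Return the normalized code, or raise ValueError if it has no voices."""
    normalized = code.strip().lower()
    if normalized == AUTO or normalized in SUPPORTED_LANGUAGES:
        return normalized
    raise ValueError(
        f"Language {code!r} has no voices in the catalog; "
        f"use one of {sorted(SUPPORTED_LANGUAGES)} or {AUTO!r}."
    )
```

With this guard, validate_language("de") raises ValueError up front instead of letting the request go out and come back in the voice’s default language.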

What changed:

  • Lightning v3.1 OpenAPI + AsyncAPI language enum narrowed to 12 codes + auto.
  • Model card, getting-started/models, text-to-speech overview, api-references/lightning-v3.1, integrations (LiveKit, Vercel AI SDK, JellyPod) all updated.
  • Removed the bogus Beta rows for German, French, Italian, Polish, Dutch, Russian, Swedish, Portuguese, Arabic, Hebrew.
  • Voice count corrected from “169 voices total” to 217.

Pulse STT (separate model) is unaffected — Pulse genuinely supports its full European + Indic + Asian language set via the multi-eu, multi-indic, multi-asian regional aggregators.

Reproducible verification. A live probe (scripts/spec-live-tests/spec_enum_vs_voice_catalog.py) now compares the spec’s language enum against the live GET /lightning-v3.1/get_voices response and fails CI if they drift. This catches the schema-vs-reality gap on every spec PR going forward.
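
The core of such a drift check can be sketched as a set comparison. This is an illustration only, not the contents of the actual script: the function names are hypothetical, and the shape of each voice record (a "languages" list of codes) is an assumption about the get_voices response that should be checked against the real payload.

```python
from typing import Dict, Iterable, List, Mapping, Set


def catalog_languages(voices: Iterable[Mapping]) -> Set[str]:
    """Collect every language code that at least one voice offers.

    Assumes each voice record carries a "languages" list (hypothetical field).
    """
    found: Set[str] = set()
    for voice in voices:
        for code in voice.get("languages", []):
            found.add(code.lower())
    return found


def enum_drift(
    spec_enum: Iterable[str],
    voices: Iterable[Mapping],
    pseudo_codes: Iterable[str] = ("auto",),
) -> Dict[str, List[str]]:
    """Compare the spec's language enum with the languages in the catalog.

    pseudo_codes are enum values such as "auto" that never map to a voice
    and are therefore excluded from the comparison.
    """
    documented = {code.lower() for code in spec_enum} - set(pseudo_codes)
    available = catalog_languages(voices)
    return {
        # In the spec, but no voice can speak it (the bug described above).
        "missing_voices": sorted(documented - available),
        # Voices exist, but the spec does not list the language.
        "undocumented": sorted(available - documented),
    }


if __name__ == "__main__":
    sample_voices = [
        {"languages": ["en"]},
        {"languages": ["hi", "en"]},
        {"languages": ["ta"]},
    ]
    print(enum_drift(["en", "hi", "de", "auto"], sample_voices))
    # prints {'missing_voices': ['de'], 'undocumented': ['ta']}
```

A CI job built on this would fetch the live catalog, call enum_drift, and exit non-zero when either list is non-empty.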