Lightning v3.1 (endpoint will be deprecated)

POST https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech
import requests

url = "https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech"

payload = {
    "text": "Hey i am your a text to speech model",
    "voice_id": "daniel",
    "sample_rate": 44100,
    "speed": 1,
    "output_format": "mp3",
}
headers = {
    "Accept": "audio/wav",
    "Authorization": "Bearer <BearerAuth>",
    "Content-Type": "application/json",
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # surface 400/401/500 responses as exceptions

# The body is binary audio, not JSON, so write the raw bytes to a file.
with open("speech.mp3", "wb") as f:
    f.write(response.content)
Endpoint scheduled for retirement. This URL will stop accepting requests 60 days from the Lightning v3.1 Pro launch (2026-05-15) — i.e. on 2026-07-14. The Lightning v3.1 model itself is current and stays. Migrate to POST /waves/v1/tts and select Lightning v3.1 via the model body field (default).
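
The retirement date in the notice above follows from simple date arithmetic. A minimal sketch that reproduces it; `days_until_retirement` is a hypothetical helper added here for illustration, and it treats the stated date as the first day on which requests are rejected, which is an assumption:

```python
from datetime import date, timedelta

# Launch date and grace period as stated in the retirement notice.
PRO_LAUNCH = date(2026, 5, 15)
GRACE_PERIOD = timedelta(days=60)

RETIREMENT_DATE = PRO_LAUNCH + GRACE_PERIOD


def days_until_retirement(today: date) -> int:
    """Days left before the retirement date (negative once it has passed)."""
    return (RETIREMENT_DATE - today).days


print(RETIREMENT_DATE.isoformat())  # 2026-07-14
```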

Synthesize speech from text in a single request. The simplest way to get audio when you have the full text up front — pass text + voice_id, get back binary audio.

When to use this

  • Use this for short utterances you can render before playback (notifications, prompts, batch jobs, audio file generation).
  • Use the SSE streaming endpoint when you want playback to start before the full audio is ready (long passages, latency-sensitive apps).
  • Use the WebSocket endpoint when text arrives incrementally (LLM token streams, live captioning).

Key features

  • 44.1 kHz natural, expressive synthesis
  • Cloned voice IDs (voice_*) work — same param as catalog voices
  • 12 documented languages — see the model card for the full list
  • Output formats: pcm, mp3, wav, ulaw, alaw
  • Sample rates: 8 kHz – 44.1 kHz
  • Speed: 0.5× – 2×
  • Per-call pronunciation dictionaries via pronunciation_dicts
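
The limits in the list above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any SDK: `build_payload` is a hypothetical helper, its defaults are chosen for illustration, and the sample-rate check is only a range test because the exact allowed values are not listed here.

```python
# Limits copied from the feature list above.
OUTPUT_FORMATS = {"pcm", "mp3", "wav", "ulaw", "alaw"}
MIN_SAMPLE_RATE, MAX_SAMPLE_RATE = 8000, 44100
MIN_SPEED, MAX_SPEED = 0.5, 2.0


def build_payload(text, voice_id, sample_rate=24000, speed=1.0,
                  output_format="wav", pronunciation_dicts=None):
    """Return a JSON-serializable request body, raising ValueError on bad input.

    Hypothetical helper: the server remains the source of truth for validation.
    """
    if not text:
        raise ValueError("text must be a non-empty string")
    if not voice_id:
        raise ValueError("voice_id must be a non-empty string")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    if not MIN_SAMPLE_RATE <= sample_rate <= MAX_SAMPLE_RATE:
        raise ValueError(f"sample_rate out of range: {sample_rate}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed out of range: {speed}")

    payload = {
        "text": text,
        "voice_id": voice_id,
        "sample_rate": sample_rate,
        "speed": speed,
        "output_format": output_format,
    }
    # Only send pronunciation_dicts when the caller supplied some IDs.
    if pronunciation_dicts:
        payload["pronunciation_dicts"] = list(pronunciation_dicts)
    return payload
```

Catalog and cloned voice IDs go through the same `voice_id` argument, so `build_payload("Hello", "voice_FlPKRWI7DX")` is built the same way as a request for a catalog voice.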

Examples

cURL

curl -X POST "https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech" \
  -H "Authorization: Bearer $SMALLEST_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: audio/wav" \
  -d '{
    "text": "Hello from Lightning v3.1.",
    "voice_id": "magnus",
    "sample_rate": 24000,
    "output_format": "wav"
  }' --output speech.wav

Python (pip install "smallestai>=4.4.0")

from smallestai import SmallestAI

client = SmallestAI(token="YOUR_API_KEY")

with open("speech.wav", "wb") as f:
    for chunk in client.waves.synthesize_lightning_v3_1(
        text="Hello from Lightning v3.1.",
        voice_id="magnus",
        sample_rate=24000,
        output_format="wav",
        # Optional: cloned voice support
        # voice_id="voice_FlPKRWI7DX",
        # Optional: pin pronunciations for specific words
        # pronunciation_dicts=["<your dict id>"],
    ):
        f.write(chunk)

JavaScript / TypeScript (using fetch)

import { writeFileSync } from "node:fs";

const res = await fetch("https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.SMALLEST_API_KEY}`,
    "Content-Type": "application/json",
    Accept: "audio/wav",
  },
  body: JSON.stringify({
    text: "Hello from Lightning v3.1.",
    voice_id: "magnus",
    sample_rate: 24000,
    output_format: "wav",
  }),
});
// Fail loudly on 400/401/500 instead of writing an error body to disk.
if (!res.ok) throw new Error(`Request failed with status ${res.status}`);

// Top-level await needs an ES module, so use import rather than require.
const audio = Buffer.from(await res.arrayBuffer());
writeFileSync("speech.wav", audio);

Common gotchas

  • Set Accept: audio/wav. Omitting it can return an empty or unplayable response.
  • Cloned voices (voice_* from add_voice) work on this endpoint and support pronunciation_dicts.
  • pronunciation_dicts validates IDs at request time. Passing an unknown ID returns Invalid input data — create the dict first via the pronunciation-dicts endpoint and save the returned id.
  • Pronunciation matching is case-sensitive. Add both Synopsis and synopsis if your text uses both casings.
  • 44.1 kHz output is supported but most playback environments are happy with 24 kHz — drop the sample rate if bandwidth matters.
  • JavaScript / TypeScript: the official smallestai npm package predates Lightning v3.1, so call this endpoint with fetch or axios as shown above.
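
Since a missing `Accept` header can yield an empty or unplayable body, it may help to sanity-check the bytes before saving them. A minimal sketch, assuming the request asked for `output_format: "wav"`; `looks_like_wav` is a hypothetical helper that only inspects the standard RIFF/WAVE container header and does not decode the audio:

```python
def looks_like_wav(body: bytes) -> bool:
    """True if body is long enough and starts with a RIFF....WAVE header.

    A WAV file begins with b"RIFF", a 4-byte size, then b"WAVE". An empty
    body or a JSON error message fails this check.
    """
    return len(body) > 12 and body[:4] == b"RIFF" and body[8:12] == b"WAVE"
```

A caller could run this on the response bytes and log or retry instead of writing a file that no player can open.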

Authentication

Authorization: Bearer

Header authentication of the form Bearer <token>

Headers

Accept (enum, Required, defaults to audio/wav)

Must be audio/wav to receive binary audio. Required for proper playback.

Allowed values: audio/wav

Request

This endpoint expects an object.
text (string, Required, defaults to "Hey i am your a text to speech model")
The text to convert to speech.
voice_id (string, Required, defaults to daniel)
The voice identifier to use for speech generation.
model (enum, Optional, defaults to lightning_v3.1)

TTS model to route the request to.

  • lightning_v3.1 (default) — standard Lightning v3.1 pool.
  • lightning_v3.1_pro — Lightning v3.1 Pro pool with a curated voice catalog. See the Pro model card.

New integrations should use the unified /waves/v1/tts route instead of this endpoint, but the model field is supported here for backwards-compatible Pro opt-in.
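
Moving an existing call to the unified route could look roughly like the sketch below. It assumes the unified `/waves/v1/tts` route accepts the same body fields as this endpoint, which this page does not confirm beyond the `model` field; `migrate_request` is a hypothetical helper.

```python
LEGACY_PATH = "/waves/v1/lightning-v3.1/get_speech"
UNIFIED_PATH = "/waves/v1/tts"


def migrate_request(url: str, payload: dict) -> tuple[str, dict]:
    """Rewrite a legacy request to target the unified route.

    Assumption: the body schema carries over unchanged apart from `model`.
    """
    if not url.endswith(LEGACY_PATH):
        raise ValueError(f"not a legacy Lightning v3.1 URL: {url}")
    new_url = url[: -len(LEGACY_PATH)] + UNIFIED_PATH
    new_payload = dict(payload)  # copy so the caller's dict is not mutated
    # Pin the model explicitly rather than relying on the server default.
    new_payload.setdefault("model", "lightning_v3.1")
    return new_url, new_payload
```

Setting `model` explicitly keeps the request on the same model even if the default of the unified route changes later; a payload that already names `lightning_v3.1_pro` is left as it is.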

Allowed values: lightning_v3.1, lightning_v3.1_pro
sample_rate (enum, Optional, defaults to 44100)
The sample rate for the generated audio, in Hz.
Allowed range: 8000 to 44100
speed (double, Optional, 0.5-2, defaults to 1)
The speed of the generated speech.
language (enum, Optional, defaults to en)

Language code for synthesis. Influences pronunciation, number/date normalization, and phoneme selection.

  • Indian: en, hi, mr (Marathi), kn (Kannada), ta (Tamil), bn (Bengali), gu (Gujarati), te (Telugu), ml (Malayalam), pa (Punjabi), or (Odia)
  • European: es (Spanish)
  • auto — auto-detect from input text (recommended for code-switching)
output_format (enum, Optional, defaults to pcm)

Format of the returned audio. pcm is the lowest-latency option but requires a decoder to play; mp3 and wav are directly playable in browsers and most media players. The server default is pcm when the field is omitted — the API playground uses mp3 so the generated audio is directly playable.
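
When the default `pcm` output is used, the raw samples can be wrapped in a WAV container on the client so ordinary players can open them. A minimal sketch using Python's standard `wave` module. It assumes the PCM stream is 16-bit signed little-endian mono, which this page does not state, so check the sample width and channel count against the model card before relying on it:

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file bytes.

    Assumption: 16-bit (2-byte) mono samples unless told otherwise.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)  # must match the requested sample_rate
        wav.writeframes(pcm)
    return buf.getvalue()
```

The `sample_rate` passed here has to equal the one sent in the request; a mismatch plays the audio too fast or too slow.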

Allowed values: pcm, mp3, wav, ulaw, alaw
pronunciation_dicts (list of strings, Optional)
The IDs of the pronunciation dictionaries to use for speech generation.
session_id (string, Optional, format: "^[a-zA-Z0-9_\-.]+$", <=128 characters)

Optional client-provided session identifier for correlation. Only alphanumeric characters, hyphens, underscores, and dots are allowed. Max 128 characters. Echoed back in response headers as X-External-Session-Id.

request_id (string, Optional, format: "^[a-zA-Z0-9_\-.]+$", <=128 characters)

Optional client-provided request identifier for correlation. Only alphanumeric characters, hyphens, underscores, and dots are allowed. Max 128 characters. Echoed back in response headers as X-External-Request-Id.
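
The format rule shared by `session_id` and `request_id` can be checked locally before sending. A minimal sketch that applies the documented pattern and length limit; `is_valid_correlation_id` is a hypothetical helper name:

```python
import re

# Pattern and length limit as documented for session_id and request_id.
_ID_PATTERN = re.compile(r"^[a-zA-Z0-9_\-.]+$")
MAX_ID_LENGTH = 128


def is_valid_correlation_id(value: str) -> bool:
    """True if value uses only letters, digits, '_', '-', '.' and is 1-128 chars."""
    return len(value) <= MAX_ID_LENGTH and _ID_PATTERN.fullmatch(value) is not None
```

Using `fullmatch` rather than `match` also rejects a trailing newline, which `$` alone would tolerate in Python.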

Response headers

X-Session-Id (string)

Internal session identifier (system-generated UUID).

X-Request-Id (string)

Internal request identifier (system-generated UUID).

X-External-Session-Id (string)

Echoed client-provided session_id (empty if not provided).

X-External-Request-Id (string)

Echoed client-provided request_id (empty if not provided).
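
For logging, the four correlation headers can be collected into one record. A minimal sketch that works on any mapping of header names to values; `correlation_ids` and its key names are hypothetical, and empty echoed values are turned into `None`:

```python
# Response header names as documented above, keyed by a short label.
_HEADER_NAMES = {
    "session_id": "X-Session-Id",
    "request_id": "X-Request-Id",
    "external_session_id": "X-External-Session-Id",
    "external_request_id": "X-External-Request-Id",
}


def correlation_ids(headers) -> dict:
    """Return the correlation headers as a dict; missing or empty ones map to None."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        label: (lowered.get(name.lower()) or None)
        for label, name in _HEADER_NAMES.items()
    }
```

With `requests`, `correlation_ids(response.headers)` would work as written, since that object supports `.items()`.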

Response

Synthesized speech retrieved successfully.

Errors

400 Bad Request Error
401 Unauthorized Error
500 Internal Server Error
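
The status codes above can be turned into exceptions that tell the caller whether a retry is worthwhile. A minimal sketch; `SpeechRequestError` and `raise_for_speech_status` are hypothetical names, and treating only 5xx responses as retryable is a common convention assumed here, not something this page specifies:

```python
class SpeechRequestError(Exception):
    """Raised for non-2xx responses from the speech endpoint."""

    def __init__(self, status: int, message: str, retryable: bool):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.retryable = retryable


# Documented errors, with an assumed retry policy for each.
_ERRORS = {
    400: ("Bad Request Error", False),    # fix the payload before resending
    401: ("Unauthorized Error", False),   # check the Bearer token
    500: ("Internal Server Error", True),  # assumed transient
}


def raise_for_speech_status(status: int) -> None:
    """Do nothing for 2xx; otherwise raise SpeechRequestError."""
    if 200 <= status < 300:
        return
    message, retryable = _ERRORS.get(status, ("Unexpected status", status >= 500))
    raise SpeechRequestError(status, message, retryable)
```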