<Warning>**Endpoint scheduled for retirement.** This URL will stop accepting requests **60 days from the Lightning v3.1 Pro launch (2026-05-15)** — i.e. on **2026-07-14**. The Lightning v3.1 model itself is current and stays. Migrate to [`POST /waves/v1/tts`](/waves/api-reference/api-reference/text-to-speech/synthesize-speech) and select Lightning v3.1 via the `model` body field (default).</Warning>
Synthesize speech from text in a single request. The simplest way to get audio when you have the full text up front — pass `text` + `voice_id`, get back binary audio.
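For the migration named in the retirement warning, a minimal sketch of the replacement request might look like the following. It assumes the unified route is served from the same host as this endpoint and accepts the same `text` and `voice_id` body fields; neither is confirmed on this page, so check the `/waves/v1/tts` reference first. `build_tts_request` is a hypothetical helper that only builds the request and does not send it. The date arithmetic double-checks the stated retirement date.

```python
import json
import urllib.request
from datetime import date, timedelta

# Assumption: the unified route lives on the same host as this endpoint.
UNIFIED_TTS_URL = "https://api.smallest.ai/waves/v1/tts"

# 60 days after the Lightning v3.1 Pro launch date given in the warning.
RETIREMENT_DATE = date(2026, 5, 15) + timedelta(days=60)


def build_tts_request(api_key: str, text: str, voice_id: str) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a request to the unified route."""
    body = {
        "text": text,
        "voice_id": voice_id,
        # Lightning v3.1 is described as the default; set here only for clarity.
        "model": "lightning_v3.1",
    }
    return urllib.request.Request(
        UNIFIED_TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so the payload can be inspected first.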
## When to use this
- **Use this** for short utterances you can render before playback (notifications, prompts, batch jobs, audio file generation).
- **Use the SSE streaming endpoint** when you want playback to start before the full audio is ready (long passages, latency-sensitive apps).
- **Use the WebSocket endpoint** when text arrives incrementally (LLM token streams, live captioning).
## Key features
- 44.1 kHz natural, expressive synthesis
- Cloned voice IDs (`voice_*`) work — same param as catalog voices
- 12 documented languages — see the model card for the full list
- Output formats: `pcm`, `mp3`, `wav`, `ulaw`, `alaw`
- Sample rates: 8 kHz – 44.1 kHz
- Speed: 0.5× – 2×
- Per-call pronunciation dictionaries via `pronunciation_dicts`
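The limits in the list above can be checked on the client before a request is sent. The sketch below is illustrative only: `check_options` is a hypothetical helper, and it treats the sample rate as a simple range even though the API may accept only specific values inside that range.

```python
OUTPUT_FORMATS = {"pcm", "mp3", "wav", "ulaw", "alaw"}
MIN_SAMPLE_RATE, MAX_SAMPLE_RATE = 8_000, 44_100  # Hz
MIN_SPEED, MAX_SPEED = 0.5, 2.0


def check_options(
    sample_rate: int = 44_100,
    speed: float = 1.0,
    output_format: str = "pcm",
) -> list[str]:
    """Return a list of problems; an empty list means all options are in range."""
    problems = []
    if not MIN_SAMPLE_RATE <= sample_rate <= MAX_SAMPLE_RATE:
        problems.append(f"sample_rate {sample_rate} is outside 8000-44100 Hz")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        problems.append(f"speed {speed} is outside 0.5-2")
    if output_format not in OUTPUT_FORMATS:
        problems.append(f"output_format {output_format!r} is not a listed format")
    return problems
```

Calling `check_options(sample_rate=24000, output_format="wav")` returns an empty list, while an out-of-range value yields one message per problem.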
## Examples
**cURL**
```bash
curl -X POST "https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech" \
  -H "Authorization: Bearer $SMALLEST_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: audio/wav" \
  -d '{
    "text": "Hello from Lightning v3.1.",
    "voice_id": "magnus",
    "sample_rate": 24000,
    "output_format": "wav"
  }' --output speech.wav
```
**Python** (`pip install "smallestai>=4.4.0"`)
```python
from smallestai import SmallestAI

client = SmallestAI(token="YOUR_API_KEY")

# The call yields audio chunks; write each one to disk as it arrives.
with open("speech.wav", "wb") as f:
    for chunk in client.waves.synthesize_lightning_v3_1(
        text="Hello from Lightning v3.1.",
        voice_id="magnus",
        sample_rate=24000,
        output_format="wav",
        # Optional: use a cloned voice instead (replace voice_id above)
        # voice_id="voice_FlPKRWI7DX",
        # Optional: pin pronunciations for specific words
        # pronunciation_dicts=["<your dict id>"],
    ):
        f.write(chunk)
```
**JavaScript / TypeScript** (using `fetch`)
```typescript
import { writeFileSync } from "node:fs";

const res = await fetch("https://api.smallest.ai/waves/v1/lightning-v3.1/get_speech", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.SMALLEST_API_KEY}`,
    "Content-Type": "application/json",
    Accept: "audio/wav",
  },
  body: JSON.stringify({
    text: "Hello from Lightning v3.1.",
    voice_id: "magnus",
    sample_rate: 24000,
    output_format: "wav",
  }),
});

// Fail loudly on 4xx/5xx instead of writing an error body to disk.
if (!res.ok) {
  throw new Error(`TTS request failed: ${res.status} ${res.statusText}`);
}

const audio = Buffer.from(await res.arrayBuffer());
writeFileSync("speech.wav", audio);
```
## Common gotchas
- **Set `Accept: audio/wav`.** Omitting it can return an empty or unplayable response.
- **Cloned voices** (`voice_*` from `add_voice`) work on this endpoint and support `pronunciation_dicts`.
- **`pronunciation_dicts` validates IDs at request time.** Passing an unknown ID returns `Invalid input data` — create the dict first via the pronunciation-dicts endpoint and save the returned `id`.
- **Pronunciation matching is case-sensitive.** Add both `Synopsis` and `synopsis` if your text uses both casings.
- **44.1 kHz output** is supported but most playback environments are happy with 24 kHz — drop the sample rate if bandwidth matters.
- **JavaScript / TypeScript**: the official `smallestai` npm package predates Lightning v3.1, so call this endpoint with `fetch` or `axios` as shown above.
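Because pronunciation matching is case-sensitive, it can help to find which casings of a word actually occur in the text before creating dictionary entries. The sketch below does only that; `casings_in_text` is a hypothetical helper, and the dictionary entry format itself is not covered here.

```python
import re


def casings_in_text(text: str, word: str) -> list[str]:
    """Return every distinct casing of `word` that appears in `text` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return sorted(set(pattern.findall(text)))


print(casings_in_text("Synopsis: the synopsis is short.", "synopsis"))
# prints ['Synopsis', 'synopsis']
```

Each casing returned would need its own dictionary entry for the pronunciation to apply everywhere in the text.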
## Authentication
- `Authorization` (Bearer): header authentication of the form `Bearer <token>`.
## Headers
- `Accept` (enum, required; defaults to `audio/wav`): must be `audio/wav` to receive binary audio. Required for proper playback.
## Request
This endpoint expects an object.
- `text` (string, required; defaults to `Hey i am your a text to speech model`): the text to convert to speech.
- `voice_id` (string, required; defaults to `daniel`): the voice identifier to use for speech generation.
- `model` (enum, optional; defaults to `lightning_v3.1`): TTS model to route the request to. New integrations should use the unified `/waves/v1/tts` route instead of this endpoint, but the `model` field is supported here for backwards-compatible Pro opt-in. Allowed values:
  - `lightning_v3.1` (default): standard Lightning v3.1 pool.
  - `lightning_v3.1_pro`: Lightning v3.1 Pro pool with a curated voice catalog. See the Pro model card.
- `sample_rate` (enum, optional; defaults to `44100`): the sample rate for the generated audio.
- `speed` (double, optional; range 0.5 to 2, defaults to `1`): the speed of the generated speech.
- `language` (enum, optional; defaults to `en`): language code for synthesis. Influences pronunciation, number/date normalization, and phoneme selection.
  - Indian: `en`, `hi`, `mr` (Marathi), `kn` (Kannada), `ta` (Tamil), `bn` (Bengali), `gu` (Gujarati), `te` (Telugu), `ml` (Malayalam), `pa` (Punjabi), `or` (Odia)
  - European: `es` (Spanish)
  - `auto`: auto-detect from input text (recommended for code-switching)
- `output_format` (enum, optional; defaults to `pcm`): format of the returned audio. `pcm` is the lowest-latency option but requires a decoder to play; `mp3` and `wav` are directly playable in browsers and most media players. The server default is `pcm` when the field is omitted; the API playground uses `mp3` so the generated audio is directly playable.
- `pronunciation_dicts` (list of strings, optional): the IDs of the pronunciation dictionaries to use for speech generation.
- `session_id` (optional): client-provided session identifier for correlation. Only alphanumeric characters, hyphens, underscores, and dots are allowed. Max 128 characters. Echoed back in response headers as `X-External-Session-Id`.
- `request_id` (optional): client-provided request identifier for correlation. Only alphanumeric characters, hyphens, underscores, and dots are allowed. Max 128 characters. Echoed back in response headers as `X-External-Request-Id`.
## Response headers
- `X-External-Session-Id` (string): echoed client-provided `session_id` (empty if not provided).
- `X-External-Request-Id` (string): echoed client-provided `request_id` (empty if not provided).
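The character and length rules for the session and request identifiers can be checked before sending. The sketch below assumes "alphanumeric" means ASCII letters and digits and that an identifier, when supplied, must be at least one character long; `is_valid_correlation_id` is a hypothetical helper.

```python
import re

# ASCII letters and digits, dots, underscores, hyphens; 1 to 128 characters.
# Assumption: non-ASCII letters are not accepted.
_CORRELATION_ID = re.compile(r"[A-Za-z0-9._-]{1,128}")


def is_valid_correlation_id(value: str) -> bool:
    """Return True if `value` satisfies the documented identifier rules."""
    return _CORRELATION_ID.fullmatch(value) is not None
```

For example, `is_valid_correlation_id("job-42_a.b")` is accepted, while a value containing a space or longer than 128 characters is rejected.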
## Response
Synthesized speech retrieved successfully.
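When `output_format` is left at its `pcm` default, the response body is raw samples that most players cannot open directly. One way to make it playable is to wrap it in a WAV container with Python's standard `wave` module. The sketch below assumes 16-bit mono PCM, which this page does not state, so confirm the sample layout before relying on it; `pcm_to_wav` is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 44_100,
    channels: int = 1,       # assumption: mono
    sample_width: int = 2,   # assumption: 2 bytes per sample (16-bit)
) -> bytes:
    """Wrap raw PCM samples in a WAV container so ordinary players can open them."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

The `sample_rate` passed here should match the `sample_rate` sent in the request; otherwise playback speed and pitch will be wrong.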
## Errors
- `400`: Bad Request Error
- `401`: Unauthorized Error
- `500`: Internal Server Error
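A client might map these status codes to a next step along the following lines. This is a sketch, not documented behavior: `next_step` and its return labels are hypothetical, and treating `500` as safe to retry is an assumption.

```python
def next_step(status: int) -> str:
    """Map an HTTP status code from this endpoint to a suggested client action."""
    if 200 <= status < 300:
        return "ok"
    if status == 400:
        # Bad request: the payload needs changing; resending it as-is will not help.
        return "fix-request"
    if status == 401:
        # Unauthorized: the bearer token is missing or invalid.
        return "check-api-key"
    if status == 500:
        # Server-side failure; assumed to be worth retrying after a delay.
        return "retry-later"
    return "unknown"
```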