Instant Voice Clone (REST API)
Create an instant voice clone with a single POST call. The request uploads the audio, runs preprocessing, and returns pre-generated sample clips of the cloned voice, all in one round trip.
This is the recommended way to clone voices programmatically. It replaces the older two-step flow and the model-specific /lightning-large/add_voice endpoint, which are both now deprecated.
Requirements
- A Smallest AI API key — grab one from the API Keys page.
- A clean audio sample, 5–15 seconds, under 5 MB. Supported types: .mp3, .wav, .mp4, .webm.
Create the clone
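A minimal Python sketch of the request, using only the standard library. It assumes the multipart field names displayName and file mentioned in the migration notes, plus the optional fields listed there. The host in BASE_URL is a placeholder, and the helper names build_clone_request and create_clone are illustrative, not part of an official SDK. The size and extension checks mirror the requirements above.

```python
import json
import os
import urllib.request
import uuid

# Assumption: placeholder host; take the real one from the API reference.
BASE_URL = "https://api.example.invalid"
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".mp4", ".webm"}
MAX_BYTES = 5 * 1000 * 1000  # "under 5 MB", read here as decimal megabytes (assumption)


def build_clone_request(api_key, display_name, audio_bytes, filename,
                        base_url=BASE_URL, **optional_fields):
    """Build (without sending) a multipart POST to /waves/v1/voice-cloning.

    optional_fields may carry language, description, accent, or tags.
    Values are sent as plain strings; the exact format of tags is an assumption.
    """
    if os.path.splitext(filename)[1].lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio type: {filename}")
    if len(audio_bytes) >= MAX_BYTES:
        raise ValueError("audio sample must be under 5 MB")

    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields.
    for name, value in {"displayName": display_name, **optional_fields}.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The audio sample goes in the "file" part.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + audio_bytes + b"\r\n"
    )
    parts.append(f'--{boundary}--\r\n'.encode("utf-8"))

    return urllib.request.Request(
        url=f"{base_url}/waves/v1/voice-cloning",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def create_clone(request):
    """Send the prepared request and return the decoded JSON body."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Building the request separately from sending it keeps the validation and encoding easy to inspect before any network call is made.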
Request fields
Response shape
Use data.voiceId directly in any TTS call:
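As a rough illustration, the helper below pulls data.voiceId out of a decoded create response and places it in a TTS request body as voice_id. The text field name and the helper name tts_payload_from_clone are assumptions made for this sketch; check the TTS reference for the actual request schema and endpoint path.

```python
def tts_payload_from_clone(create_response, text):
    """Build a TTS request body from a voice-cloning response.

    create_response is the decoded JSON returned by the clone call.
    The "text" key is an assumed field name, used here for illustration.
    """
    voice_id = create_response["data"]["voiceId"]
    return {"voice_id": voice_id, "text": text}
```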
List your clones
Returns every clone in your organization along with modelIds (compatible models), status, language, and creation time.
Check model compatibility before using a clone
The modelIds array in the list response tells you which TTS models a clone works with. Check it before passing voice_id to a TTS call:
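The check can be sketched in a few lines of Python. This assumes each clone in the decoded list response is a dictionary with a modelIds array, and that the identifier key is voiceId as in the create response; the function names are illustrative.

```python
def supports_model(clone, model_id):
    """Return True if the clone lists model_id among its compatible models."""
    return model_id in clone.get("modelIds", [])


def usable_clones(clones, model_id="lightning-v3.1"):
    """Keep only the clones that can be used with the given TTS model."""
    return [clone for clone in clones if supports_model(clone, model_id)]
```

Calling supports_model before each TTS request turns a silent empty-audio failure into an explicit, early check.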
Clones created via the current POST /waves/v1/voice-cloning endpoint always produce lightning-v3.1-compatible voices. The older POST /waves/v1/lightning-large/add_voice endpoint (deprecated) only produces lightning-large-compatible voices — those will not work if you try to use them with lightning-v3.1.
If you see a short (~100–200 byte) WAV response from a TTS request with a cloned voice, the most common cause is using a lightning-large-only clone on the lightning-v3.1 endpoint. Check modelIds on the clone.
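A small guard can flag this case in client code. The sketch below treats any WAV payload at or below a size threshold as suspicious; the 256-byte default is an assumption chosen to sit just above the ~100–200 byte range described above, and the function name is illustrative.

```python
def looks_like_empty_wav(audio, max_bytes=256):
    """Return True if audio is a WAV container too small to hold real speech.

    Checks the RIFF/WAVE magic bytes, then compares the total payload
    size against max_bytes (an assumed threshold, not an API constant).
    """
    is_wav = len(audio) >= 12 and audio[:4] == b"RIFF" and audio[8:12] == b"WAVE"
    return is_wav and len(audio) <= max_bytes
```

When the guard fires, inspecting modelIds on the clone is the first thing to check.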
Delete a clone
The current public delete endpoint lives under /waves/v1/lightning-large. Despite the model-specific path, it deletes any voice clone in your organization, regardless of the underlying model.
Errors
Migrating from the legacy endpoint
If you’re currently using POST /waves/v1/lightning-large/add_voice, switch to POST /waves/v1/voice-cloning:
- Same auth: Authorization: Bearer $SMALLEST_API_KEY.
- Same multipart shape for displayName + file.
- New optional fields: language, description, accent, tags.
- Response includes samples: pre-generated audio clips of the cloned voice, which the legacy endpoint did not provide.
- Default model is lightning-v3.1, the new unified TTS model. Voices cloned via the legacy endpoint only worked on lightning-large and returned empty audio on lightning-v3.1.
The legacy endpoint still works but is marked deprecated in the API reference.
Need help? Email support@smallest.ai or ask on Discord.

