Create a Voice Clone

Create an instant voice clone in a single call. The model defaults to lightning-v3.1.

Authentication

Authorization (Bearer)

Header authentication of the form Bearer <token>

Request

This endpoint expects a multipart form containing a file.

displayName (string, required, 1-500 characters)

Human-readable name for the voice clone.

file (file, required)

Audio file to clone from. Supported MIME types: audio/mpeg, audio/mpeg-3, audio/wav, audio/wave, audio/webm, video/webm, audio/mp4, video/mp4. Maximum size: 5 MB.
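A client-side pre-check can catch an unsupported type or an oversized file before the upload is attempted. The sketch below is an illustration only, not part of the API: the function name `upload_problems` is hypothetical, and it assumes "5 MB" means 5 × 1024 × 1024 bytes, which this page does not specify.

```python
from typing import List

# MIME types listed as supported for the `file` field.
SUPPORTED_MIME_TYPES = frozenset({
    "audio/mpeg", "audio/mpeg-3", "audio/wav", "audio/wave",
    "audio/webm", "video/webm", "audio/mp4", "video/mp4",
})

# Assumption: "5 MB" is read as 5 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 5 * 1024 * 1024


def upload_problems(mime_type: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    if mime_type.lower() not in SUPPORTED_MIME_TYPES:
        problems.append(f"unsupported MIME type: {mime_type}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return problems
```

The MIME type is passed in explicitly rather than guessed from the file extension, because extension-based guesses vary between platforms.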

description (string, optional)

Optional longer description for the voice clone.

accent (string, optional)

Optional accent tag (e.g. “general”, “indian”).

tags (string, optional)

Optional comma-separated list of tags. The server splits on commas and trims whitespace, so "en, tone-test" becomes ["en", "tone-test"].
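The splitting rule can be mirrored locally to preview how a tag string will be interpreted. This is a minimal sketch of the documented behavior (split on commas, trim whitespace), not the server's implementation; `split_tags` is a hypothetical name, and how the server treats empty entries such as those in "a,,b" is not described here, so the sketch leaves them in place.

```python
from typing import List


def split_tags(raw: str) -> List[str]:
    """Split on commas and trim surrounding whitespace from each piece."""
    return [piece.strip() for piece in raw.split(",")]


print(split_tags("en, tone-test"))  # ['en', 'tone-test']
```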

language (string, optional)

Primary language the clone will be used for. Optional, but **strongly recommended**: set it to the language of your reference audio. When a TTS request later uses `language: "auto"`, the server falls back to this value, so setting it now avoids silent language mismatches at inference time. Must be one of the languages supported by `lightning-v3.1` (e.g. `en`, `hi`, `multi`). The server validates the code and rejects unsupported values with a 400.
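The fallback described above can be pictured as a small function. This is an illustrative model of the documented behavior, not the server's code: the name `effective_language` is hypothetical, and returning `None` when the clone has no stored language is an assumption, since this page does not say what happens in that case.

```python
from typing import Optional


def effective_language(request_language: str,
                       clone_language: Optional[str]) -> Optional[str]:
    """Model of the documented fallback: "auto" defers to the clone's language."""
    if request_language == "auto":
        # Assumption: with no language stored on the clone, the result is undefined (None).
        return clone_language
    return request_language
```

Under this model, a clone created with `language` set to `hi` resolves an `auto` request to `hi`, while an explicit request language is used as given.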
model (enum, optional, defaults to lightning-v3.1)

Voice cloning model. Defaults to lightning-v3.1. lightning-v2 is accepted by the schema for historical reasons but is deprecated — the server returns 400 with "Voice cloning for lightning-v2 is deprecated. Please use lightning-v3.1".

Allowed values: lightning-v3.1, lightning-v2 (deprecated)
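Putting the request fields together, a client might assemble the header and form fields as sketched below. This is an assumption-laden illustration, not an official client: `build_clone_request` is a hypothetical helper, the endpoint URL is not given in this section and is therefore left out, and the local checks only mirror the constraints stated above.

```python
from typing import Dict, Optional, Tuple


def build_clone_request(
    token: str,
    display_name: str,
    *,
    description: Optional[str] = None,
    accent: Optional[str] = None,
    tags: Optional[str] = None,
    language: Optional[str] = None,
    model: Optional[str] = None,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers, form_fields) for the multipart request, without the file part."""
    if not 1 <= len(display_name) <= 500:
        raise ValueError("displayName must be 1-500 characters")
    if model == "lightning-v2":
        # The server is documented to reject this value with a 400.
        raise ValueError("lightning-v2 is deprecated for voice cloning; use lightning-v3.1")

    headers = {"Authorization": f"Bearer {token}"}
    fields = {"displayName": display_name}
    optional = {
        "description": description,
        "accent": accent,
        "tags": tags,
        "language": language,
        "model": model,
    }
    # Omit unset optional fields so server-side defaults apply.
    fields.update({k: v for k, v in optional.items() if v is not None})
    return headers, fields


# Sending is left to the caller. With the third-party `requests` library it could look like:
#   with open(path, "rb") as fh:
#       requests.post(ENDPOINT_URL, headers=headers, data=fields, files={"file": fh})
# ENDPOINT_URL is a placeholder; the endpoint URL is not given in this section.
```

Leaving `model` unset relies on the documented default of lightning-v3.1.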

Response

Voice clone created. Includes pre-generated sample clips of the new voice.

message (string)
data (object)
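Reading the two documented fields might look like the sketch below. It assumes the response body is JSON, which the field types suggest but this page does not state outright; `parse_clone_response` is a hypothetical name, and the contents of `data` are not described here, so the sketch treats it as an opaque dictionary.

```python
import json
from typing import Any, Dict, Tuple


def parse_clone_response(body: str) -> Tuple[str, Dict[str, Any]]:
    """Extract `message` and `data` from a response body assumed to be JSON."""
    payload = json.loads(body)
    message = payload.get("message", "")
    # `data` is kept opaque; its fields are not described in this section.
    data = payload.get("data") or {}
    return message, data
```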

Errors

400 Bad Request Error
401 Unauthorized Error
500 Internal Server Error
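A client can translate these status codes into a single exception type. The sketch below is one possible approach under stated assumptions: `VoiceCloneError` and `raise_for_clone_status` are hypothetical names, any 2xx status is treated as success (the exact success code is not stated here), and statuses outside the documented three are labeled as undocumented.

```python
# Status codes listed in the Errors section.
DOCUMENTED_ERRORS = {
    400: "Bad Request Error",
    401: "Unauthorized Error",
    500: "Internal Server Error",
}


class VoiceCloneError(Exception):
    """Raised when the clone request returns a non-success status."""

    def __init__(self, status_code: int, name: str, detail: str = "") -> None:
        text = f"{status_code} {name}"
        if detail:
            text = f"{text}: {detail}"
        super().__init__(text)
        self.status_code = status_code
        self.name = name
        self.detail = detail


def raise_for_clone_status(status_code: int, detail: str = "") -> None:
    """Do nothing for 2xx (assumed success); raise VoiceCloneError otherwise."""
    if 200 <= status_code < 300:
        return
    name = DOCUMENTED_ERRORS.get(status_code, "Undocumented Error")
    raise VoiceCloneError(status_code, name, detail)
```

Passing the server's error text as `detail` keeps messages such as the lightning-v2 deprecation notice visible to the caller.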