# Pulse (Pre-Recorded)

Transcribe an audio file to text using the Pulse model. The fastest way to get a transcript when you already have a recording — pass either the raw bytes or a URL.

## When to use this

Use this endpoint when you have a complete audio file (call recording, voicemail, podcast episode) and want the transcript back in one response. For live transcription as audio arrives, use the realtime WebSocket endpoint (`WSS /waves/v1/pulse/get_text`) instead.

## Input methods

Send the audio in one of two ways:

1. **Raw bytes** — `Content-Type: application/octet-stream` with the audio in the body. All knobs (`language`, `word_timestamps`, etc.) are query parameters.
2. **URL** — `Content-Type: application/json` with `{"url": "..."}` in the body. Useful when the audio already lives in object storage. The same query parameters apply.

Pulse autodetects the language across 30+ supported locales. Pass `language` explicitly when you already know it — detection is fast, but skipping it is faster.

## Examples

**cURL** (raw bytes)

```bash
curl -X POST "https://api.smallest.ai/waves/v1/pulse/get_text?language=en&word_timestamps=true" \
  -H "Authorization: Bearer $SMALLEST_API_KEY" \
  -H "Content-Type: application/octet-stream" \
  --data-binary "@./call.wav"
```

**cURL** (URL)

```bash
curl -X POST "https://api.smallest.ai/waves/v1/pulse/get_text?language=en" \
  -H "Authorization: Bearer $SMALLEST_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-bucket.s3.amazonaws.com/call.wav"}'
```

**Python** (`pip install smallestai>=4.4.0`)

```python
from smallestai import SmallestAI

client = SmallestAI(token="YOUR_API_KEY")

with open("./call.wav", "rb") as f:
    result = client.waves.transcribe_pulse(
        request=f.read(),
        language="en",
        word_timestamps=True,
        diarize=True,
    )

print(result.status)         # "success"
print(result.transcription)  # the transcript string
```

**JavaScript / TypeScript** (using `fetch`)

```typescript
import { readFileSync } from "node:fs";

const audio = readFileSync("./call.wav");
const params = new URLSearchParams({ language: "en", word_timestamps: "true", diarize: "true" });

const res = await fetch(`https://api.smallest.ai/waves/v1/pulse/get_text?${params}`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.SMALLEST_API_KEY}`,
    "Content-Type": "application/octet-stream",
  },
  body: audio,
});

const result = await res.json();
console.log(result.transcription);
```

## Common gotchas

- **Max file size is 25 MB.** Larger files return HTTP `413`. Compress to mono 16 kHz PCM if you're close to the limit; quality is unaffected.
- **Formatting flags (`format`, `punctuate`, `capitalize`)** are accepted at the wire level and exposed in the Python SDK as of `smallestai>=4.4.0`. They currently return the same transcript regardless of value; pass them in your integration now so it picks up the intended behavior when it lands.
- **Webhook-driven flow**: pass `webhook_url` to receive the transcript asynchronously. The endpoint returns immediately; the transcript hits your webhook when ready. Useful for long files where you don't want to hold an HTTP connection open (see the sketch after this list).
- **Speaker diarization** (`diarize=true`) adds latency. Skip it if you only need the words.
- **JavaScript / TypeScript**: the official `smallestai` npm package predates the Pulse model, so call this endpoint with `fetch` or `axios` as shown above.
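For the webhook flow mentioned in the gotchas, the request is the same as the raw-bytes example; you only add `webhook_url` (and optionally `webhook_extra`) as query parameters and stop waiting for the transcript in the HTTP response. A minimal sketch using `requests`; the callback URL and the `webhook_extra` value are placeholders, and the exact payload delivered to your webhook should be checked against the Response schema below.

```python
import os

import requests

# Async transcription: the transcript is POSTed to webhook_url when ready,
# so the immediate HTTP response here does not contain it.
params = {
    "language": "en",
    "diarize": "true",
    "webhook_url": "https://example.com/pulse-callback",  # hypothetical receiver endpoint
    "webhook_extra": "ticket-4821",                       # assumed to be passed through to the webhook
}

with open("./call.wav", "rb") as f:
    res = requests.post(
        "https://api.smallest.ai/waves/v1/pulse/get_text",
        params=params,
        headers={
            "Authorization": f"Bearer {os.environ['SMALLEST_API_KEY']}",
            "Content-Type": "application/octet-stream",
        },
        data=f.read(),
    )

res.raise_for_status()
print(res.json())  # returns immediately; the transcript arrives at the webhook later
```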

## Authentication

`Authorization` (Bearer)

Header authentication of the form `Bearer <token>`

## Query parameters

`language` (enum, optional, defaults to `multi-eu`)

Language of the audio file. Set explicitly to the known language for best accuracy. Auto-detection scopes:

- `multi-eu` (default) — European set: de, en, fr, it, nl, pt, ru, es.
- `multi-indic` — Indic set: en, hi, mr, pa, gu, or, ka, ta, te, ml, bn.
- `multi-asian` — East Asian set: en, ja, ko, zh, yue.
- `multi` — full multilingual auto-detection across all supported languages.

Omitting `language` routes to `multi-eu`, which can mis-detect on non-European audio. Always pass `language` explicitly when the source language is known, or pick the regional `multi-*` scope that matches your audio (see the sketch after this parameter list).
`encoding` (enum, optional)

Audio encoding of the bytes you upload. Mirrors the `encoding` parameter on the realtime WS endpoint.

- `linear16`, `linear32` — raw PCM (16-bit and 32-bit)
- `alaw`, `mulaw` — 8 kHz telephony codecs
- `opus`, `ogg_opus` — Opus compressed audio (raw and Ogg container)

When omitted, the server detects the format from the file's container header (works for `.wav`, `.mp3`, `.flac`, `.ogg`, `.m4a`, `.webm`).
`webhook_url` (string, optional, format: `uri`)

URL that receives the transcript asynchronously when it is ready; the endpoint itself returns immediately.
`webhook_extra` (string, optional)
`word_timestamps` (boolean, optional, defaults to `false`)
Whether to include word and utterance level timestamps in the response
`diarize` (boolean, optional, defaults to `false`)
Whether to perform speaker diarization
`gender_detection` (enum, optional, defaults to `false`)
Whether to predict the gender of the speaker
`emotion_detection` (enum, optional, defaults to `false`)
Whether to predict speaker emotions
`format` (enum, optional, defaults to `true`)
Master formatting switch for the transcript. When `false`, forces `punctuate=false`, `capitalize=false`, and also disables Inverse Text Normalization (ITN) so it cannot silently reintroduce punctuation or casing. When `true`, the `punctuate` and `capitalize` params take effect independently. Leave `format=true` and use those two to fine-tune.
`punctuate` (enum, optional, defaults to `true`)

When `false`, strips punctuation marks (`.`, `,`, `?`, `!`) from the transcript, `words[].word`, and `utterances[].transcript`. Does not affect casing — use `capitalize` for that. Overridden to `false` when `format=false`.

`capitalize` (enum, optional, defaults to `true`)

When `false`, lowercases the entire transcript output (the transcript, `words[].word`, and `utterances[].transcript`). Does not affect punctuation — use `punctuate` for that. Overridden to `false` when `format=false`.

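Since all of the above are plain query parameters, they compose in a single request. A minimal sketch, assuming an Indic-language recording captured as raw 8 kHz mu-law telephony audio; the file path, scope choice, and flag values are illustrative, not requirements of the API.

```python
import os

import requests

# Detection scope, encoding, and formatting controls are all query parameters.
params = {
    "language": "multi-indic",   # scope auto-detection to the Indic language set
    "encoding": "mulaw",         # raw 8 kHz mu-law bytes, no container header to sniff
    "word_timestamps": "true",
    "format": "true",            # leave the master switch on...
    "punctuate": "true",
    "capitalize": "false",       # ...and fine-tune: keep punctuation, lowercase the text
}

with open("./call.raw", "rb") as f:  # hypothetical raw telephony capture
    res = requests.post(
        "https://api.smallest.ai/waves/v1/pulse/get_text",
        params=params,
        headers={
            "Authorization": f"Bearer {os.environ['SMALLEST_API_KEY']}",
            "Content-Type": "application/octet-stream",
        },
        data=f.read(),
    )

res.raise_for_status()
print(res.json()["transcription"])
```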

## Request

This endpoint expects binary data of type `application/octet-stream`; for the JSON `{"url": "..."}` alternative, see Input methods above.

## Response

Speech transcribed successfully
`status` (string)
Status of the transcription request
`transcription` (string)
The transcribed text from the audio file
`audio_length` (double)
Duration of the audio file in seconds
`words` (list of objects)

Word-level timestamps in seconds (see the parsing sketch after this list).

`utterances` (list of objects)
List of utterances with start and end times
`gender` (enum)
Predicted gender of the speaker if requested
`emotions` (object)
Predicted emotions of the speaker if requested
`metadata` (object)
Metadata about the transcription
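A small sketch of reading these fields from a decoded response body. The top-level names match the list above; the nested keys inside `words` and `utterances` (`word`, `start`, `end`, `transcript`) are inferred from the descriptions and should be verified against a real response.

```python
def summarize(result: dict) -> None:
    """Print the documented fields of a Pulse response body (already-decoded JSON)."""
    print(result["status"])                   # status of the transcription request
    print(result["audio_length"], "seconds")  # duration of the audio file
    print(result["transcription"])            # the transcript string

    # Word-level timestamps, present when word_timestamps=true.
    # Nested keys here are assumptions, not confirmed field names.
    for w in result.get("words", []):
        print(w.get("word"), w.get("start"), w.get("end"))

    # Utterances with start and end times.
    for u in result.get("utterances", []):
        print(u.get("start"), u.get("end"), u.get("transcript"))

    # Optional fields, returned only when the corresponding flag was set.
    print(result.get("gender"))     # gender_detection
    print(result.get("emotions"))   # emotion_detection
    print(result.get("metadata"))
```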

## Errors

- `400` Bad Request Error
- `401` Unauthorized Error
- `413` Content Too Large Error
- `429` Too Many Requests Error
- `500` Internal Server Error
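Of these, `429` is the one worth retrying with backoff and `413` the one worth catching explicitly so you can compress or split the file before trying again. A defensive sketch around the raw-bytes request; the retry count and sleep times are arbitrary choices, not documented limits.

```python
import os
import time

import requests

URL = "https://api.smallest.ai/waves/v1/pulse/get_text"
HEADERS = {
    "Authorization": f"Bearer {os.environ['SMALLEST_API_KEY']}",
    "Content-Type": "application/octet-stream",
}


def transcribe(audio: bytes, **params) -> dict:
    """POST audio bytes, retrying on 429 and surfacing the other documented errors."""
    for attempt in range(5):
        res = requests.post(URL, params=params, headers=HEADERS, data=audio)
        if res.status_code == 429:   # Too Many Requests: back off and retry
            time.sleep(2 ** attempt)
            continue
        if res.status_code == 413:   # Content Too Large: the 25 MB limit was exceeded
            raise ValueError("Audio exceeds the 25 MB limit; compress or split it first")
        res.raise_for_status()       # raises for 400, 401, 500
        return res.json()
    raise RuntimeError("Still rate-limited after 5 attempts")
```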