The pre-recorded REST endpoint (POST /waves/v1/pulse/get_text) now documents the encoding query parameter, matching the streaming WebSocket API. It accepts the same six-value enum: linear16, linear32, alaw, mulaw, opus, ogg_opus. The default is linear16.

When omitted, the server falls back to detecting the format from the file’s container header (works for .wav, .mp3, .flac, .ogg, .m4a, .webm).
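As a rough illustration of the parameter described above, the sketch below builds the request URL and adds encoding only when the caller sets it, so container formats can still rely on header detection. The https://api.smallest.ai host is an assumption inferred from the documented WebSocket URL, and build_pre_recorded_url is a hypothetical helper, not part of any SDK.

```python
from urllib.parse import urlencode

# Enum values as listed in this changelog entry.
ENCODINGS = {"linear16", "linear32", "alaw", "mulaw", "opus", "ogg_opus"}

# Assumption: the HTTP endpoint lives on the same host as the WebSocket URL.
BASE_URL = "https://api.smallest.ai/waves/v1/pulse/get_text"


def build_pre_recorded_url(encoding=None, **params):
    """Return the POST URL, adding `encoding` only when the caller sets it."""
    if encoding is not None:
        if encoding not in ENCODINGS:
            raise ValueError(f"unsupported encoding: {encoding!r}")
        params["encoding"] = encoding
    # Sort the parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"{BASE_URL}?{query}" if query else BASE_URL
```

Calling build_pre_recorded_url(encoding="mulaw", language="en") would suit headerless audio, while build_pre_recorded_url(language="en") leaves format detection to the server.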

Pulse STT — `age_detection` removed from the pre-recorded HTTP API

The age_detection query parameter and the corresponding top-level age field in the response have been removed from the Pulse STT pre-recorded HTTP API (POST /waves/v1/pulse/get_text). Gender detection (gender_detection / gender) and emotion detection (emotion_detection / emotions) are unaffected.

Specs and reference docs updated:

fern/apis/waves/openapi/pulse-stt-openapi.yaml — age_detection query param dropped; age response field and example value removed.
fern/products/waves/pages/v4.0.0/api-references/pulse-stt.mdx (+ versions mirror) — cURL/Python/JavaScript samples for both raw-bytes and audio-URL methods no longer pass age_detection.
fern/products/waves/pages/v4.0.0/speech-to-text/pre-recorded/code-examples.mdx (+ versions mirror) — Python end-to-end sample no longer requests or prints age.
fern/products/waves/pages/v4.0.0/speech-to-text/features/age-and-gender-detection.mdx (+ versions mirror) — page retitled to Gender detection and trimmed to gender-only content. The file path is unchanged so existing /features/age-and-gender-detection links keep resolving.
fern/products/waves/pages/v4.0.0/integrations/n8n.mdx, speech-to-text/overview.mdx, speech-to-text/pre-recorded/features.mdx, speech-to-text/model-cards/pulse.mdx, and the STT benchmarks metrics-overview.mdx — surrounding tables, accordions, and feature cards updated to drop age references.
fern/products/waves/versions/v4.0.0.yml — sidebar entry retitled to Gender Detection.

If your code passes age_detection=true or reads response.age, drop both — the parameter is now ignored and the field will not be returned. No other Pulse STT request shape or response field changes.
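One possible shape for that cleanup, as a minimal sketch: strip the flag before sending and stop reading the age field from the response. Both helper names are hypothetical, and the response is assumed to be an already-parsed JSON object with the top-level gender and emotions fields named above.

```python
def strip_age(params):
    """Return a copy of the query parameters without the removed flag."""
    return {k: v for k, v in params.items() if k != "age_detection"}


def read_speaker_fields(response):
    """Read gender and emotions defensively; `age` is no longer returned."""
    return {
        "gender": response.get("gender"),
        "emotions": response.get("emotions"),
    }
```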

→ Gender detection

May 6, 2026

Pulse STT — recommend `itn_normalize` over `numerals` for new integrations

The numerals query parameter on the Pulse STT WebSocket API still works and continues to behave as documented. For new integrations we now recommend itn_normalize=true instead — it covers digits as well as dates, currencies, phone numbers, and other spoken-form entities, and gives more consistent results across languages.

Existing code that uses numerals does not need to change.
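For a new integration, the recommendation could look like the sketch below, which builds the WebSocket URL with itn_normalize=true. build_stream_url is a hypothetical helper, and passing language as a query parameter alongside it is an assumption based on the other entries in this changelog.

```python
from urllib.parse import urlencode

WS_URL = "wss://api.smallest.ai/waves/v1/pulse/get_text"


def build_stream_url(language, itn_normalize=True):
    """Build the streaming URL; new integrations enable itn_normalize."""
    params = {"language": language}
    if itn_normalize:
        params["itn_normalize"] = "true"
    return f"{WS_URL}?{urlencode(params)}"
```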

→ Inverse Text Normalization

May 4, 2026

Pre-rename Pulse STT WebSocket spec removed from SDK generation

The pre-rename AsyncAPI spec for wss://api.smallest.ai/waves/v1/asr has been removed. This is the final piece of the legacy STT WebSocket cleanup. Pulse STT (wss://api.smallest.ai/waves/v1/pulse/get_text) has been the supported real-time STT surface throughout; the older channel was already unlinked from the v4 API reference navigation and was kept only for historical SDK generation.

Files removed:

The legacy AsyncAPI spec and its overrides under fern/apis/waves/asyncapi/.
The corresponding generator entries in fern/apis/waves/generators.yml and fern/apis/unified/generators.yml.

SDK impact: the next SDK release will no longer expose the legacy streaming-STT methods generated from the old spec. Existing SDK releases keep working until upgrade. Migrate to Pulse STT — see the Pulse STT WebSocket reference and the Pulse quickstart.

→ Pulse STT quickstart

May 3, 2026

STT Performance page rebuilt against the FLEURS, ESB, and WildASR sources; speaker-cap language scrubbed from the Pulse model card

The Pulse STT Performance page (/waves/documentation/speech-to-text-pulse/benchmarks/performance) now reflects the canonical benchmark comparison across the following benchmark sets:

FLEURS — pre-recorded (29 languages)
FLEURS — streaming (8 languages)
ESB — streaming (9 English domains)
WildASR — streaming (8 robustness conditions)
Internal English perturbation suite (12 categories)

Each dataset section includes the source description and the full Smallest Pulse vs Deepgram Nova 2 vs Nova 3 comparison. Several previously published sections that did not come from any source benchmark — “Performance by Language”, “High-Performance Languages”, “Regional Variations”, “Performance by Audio Format”, and “Feature Impact on Performance” — have been removed. The latency-per-language pairing (e.g. “Italian: 4.2% WER, ~64ms latency”) was incorrect; latency is reported once at the top of the page and is not language-specific.

The Pulse model card (/waves/model-cards/speech-to-text/pulse) is also updated:

Speaker-diarization phrasing such as “capped at 4 speakers; known issues” has been removed from the Key Capabilities card, the Features — Non-streaming table, and the Limitations & Safety bullets.
The ESB, WildASR, and Internal English perturbation tables are now mirrored into the model card so the per-feature page and the model card stay in sync.
The single hyperlink on Keyword boosting in the Features — Streaming table has been removed for consistency with the rest of the table (none of the other features link out from the cell).

May 1, 2026

Legacy Pulse STT WebSocket reference removed from docs

The legacy WebSocket reference at wss://api.smallest.ai/waves/v1/lightning/get_text has been removed from the docs. Pulse STT (wss://api.smallest.ai/waves/v1/pulse/get_text) is the supported real-time STT surface; the legacy endpoint was already unlinked from the v4 API reference navigation and was only kept for historical reference.

Files removed:

The legacy WebSocket API reference page (and its versions/v4.0.0 mirror).
The legacy AsyncAPI spec and its overrides under fern/apis/waves/asyncapi/.
The corresponding generator entries in fern/apis/waves/generators.yml and fern/apis/unified/generators.yml.
The orphan-page allow-list entry in scripts/.nav-ignore.

If your code still calls wss://api.smallest.ai/waves/v1/lightning/get_text, migrate to Pulse STT — the request shape is similar; see the Pulse STT quickstart and the Pulse WebSocket reference.

→ Pulse STT quickstart

May 1, 2026

Pulse STT — `multi-indic` and `multi-asian` clarified as pre-recorded HTTP only (correction)

The multi-indic and multi-asian regional auto-detection scopes announced earlier today apply only to the Pulse STT pre-recorded HTTP endpoint (POST /waves/v1/pulse/get_text). They are not supported on the WebSocket streaming endpoint (wss://api.smallest.ai/waves/v1/pulse/get_text).

Specs and reference docs corrected accordingly:

fern/apis/waves/asyncapi/pulse-stt-ws.yaml — multi-indic / multi-asian removed from the WebSocket language enum.
fern/apis/waves/asyncapi/pulse-stt-ws-overrides.yml — same.
fern/apis/waves-v4/overrides/pulse-stt-ws-overrides.yml — same.
fern/products/waves/pages/v4.0.0/api-references/pulse-stt-ws.mdx (+ versions mirror) — WebSocket reference table cell trimmed to multi-eu / multi only.
fern/products/waves/pages/v4.0.0/speech-to-text/pre-recorded/quickstart.mdx (+ versions mirror) — language Note now lists all four scopes (multi-eu, multi-indic, multi-asian, multi).

The pre-recorded OpenAPI spec (fern/apis/waves/openapi/pulse-stt-openapi.yaml) and the rendered API reference page (pulse-stt.mdx) continue to advertise all four scopes — that surface is unchanged.

The Pulse STT WebSocket probe (scripts/probes/pulse_stt.py) was also stripped of the two invalid test cases that were probing the WS endpoint with these scopes; the probe baseline (scripts/probes/baseline-pulse-stt.json) was regenerated.

If your code already passes language=multi-indic or language=multi-asian to the pre-recorded HTTP endpoint, no action needed — that path is correct. If you wired either scope into a WebSocket streaming call, switch back to multi-eu or multi until streaming-side coverage ships.
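A client-side guard can catch the mismatch before a request goes out. The sketch below is illustrative only: check_scope and the "http" / "ws" transport labels are hypothetical names, and the scope sets are copied from this entry.

```python
# Auto-detection scopes per surface, as stated in this changelog entry.
HTTP_SCOPES = {"multi-eu", "multi-indic", "multi-asian", "multi"}
WS_SCOPES = {"multi-eu", "multi"}


def check_scope(language, transport):
    """Validate an auto-detection scope for 'http' or 'ws' (hypothetical labels)."""
    if transport not in ("http", "ws"):
        raise ValueError(f"unknown transport: {transport!r}")
    if not language.startswith("multi"):
        return language  # explicit language code; not covered by this check
    allowed = HTTP_SCOPES if transport == "http" else WS_SCOPES
    if language not in allowed:
        raise ValueError(f"{language!r} is not supported on {transport}")
    return language
```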

→ Pulse STT pre-recorded

May 1, 2026

Pulse STT pre-recorded — `multi-indic` and `multi-asian` regional auto-detection modes

The language parameter on the Pulse STT pre-recorded HTTP endpoint now accepts two new regional auto-detection scopes alongside the existing multi-eu and multi:

multi-indic — auto-detects across the Indic set: en, hi, mr, pa, gu, or, kn, ta, te, ml, bn. Use for Indian-language audio with optional English code-switching.
multi-asian — auto-detects across the East Asian set: en, ja, ko, zh, yue. Use for Japanese / Korean / Mandarin / Cantonese audio with optional English code-switching.

The existing scopes are unchanged:

multi-eu (default) — de, en, fr, it, nl, pt, ru, es.
multi — full multilingual auto-detection across all supported languages.

Pick the narrowest scope that matches your audio for the best accuracy. Omitting language still routes to multi-eu and can mis-detect on non-European audio (e.g., returning Russian for English input).

The new multi-indic and multi-asian scopes are only available on the pre-recorded HTTP endpoint for now. The realtime WebSocket streaming endpoint continues to support only multi-eu and multi for auto-detection until streaming-side coverage ships.

Specs updated:

OpenAPI (pre-recorded): fern/apis/waves/openapi/pulse-stt-openapi.yaml

→ Pulse STT pre-recorded

April 30, 2026

Pulse STT — `multi-eu` is now the documented default; `full_transcript` removed

The language parameter now defaults to multi-eu in the Pulse STT WebSocket and HTTP specs. Set language explicitly to the known language for best accuracy. Use multi-eu when the source is unknown European-language audio; use multi for full multilingual auto-detection. Omitting language routes to multi-eu and can mis-detect on non-European audio (e.g., returning Russian for English input).

The full_transcript query parameter and response field were removed from the spec. The server no longer returns the field. Reconstruct a session-level transcript on the client by concatenating each is_final=true transcript value — every WS code sample on the realtime surface (Python, Node.js, browser JS, mic) was updated to show the pattern.
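The concatenation pattern can be sketched as follows, assuming each server message has already been parsed into a dict with the is_final and transcript fields named above. Joining segments with a single space is an assumption, and session_transcript is a hypothetical helper.

```python
def session_transcript(messages):
    """Join the transcript of every is_final=true message, in arrival order."""
    parts = []
    for msg in messages:
        text = (msg.get("transcript") or "").strip()
        # Interim results (is_final false or missing) are skipped.
        if msg.get("is_final") and text:
            parts.append(text)
    return " ".join(parts)
```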

Keyword boosting docs now spell out that the value is a single comma-separated string, not a JSON array. keywords=["I:20,smiling:26"] and keywords=['I:20,smiling:26'] silently pass and produce garbled transcripts because the API parses the brackets as keyword characters; the correct shape is keywords=I:20,smiling:26.
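To avoid the bracketed shapes, the string can be built from a mapping rather than serialized from a list. This is a minimal sketch; format_keywords is a hypothetical helper, and rejecting separator characters inside a keyword is a defensive assumption rather than documented API behavior.

```python
# Characters that would corrupt the comma-separated word:boost format.
FORBIDDEN = set(',:[]"\'')


def format_keywords(boosts):
    """Turn {'I': 20, 'smiling': 26} into 'I:20,smiling:26'."""
    for word in boosts:
        if FORBIDDEN & set(word):
            raise ValueError(f"keyword contains a reserved character: {word!r}")
    return ",".join(f"{word}:{boost}" for word, boost in boosts.items())
```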

→ Pulse STT WebSocket reference → Keyword boosting

April 21, 2026

Pulse STT `close_stream` signal

Send {"type": "close_stream"} to end a Pulse STT session. The server flushes remaining audio, emits final transcripts, responds with is_last=true, then closes the WebSocket.

{"type": "finalize"} is now documented as a mid-session flush: it returns an is_final=true transcript for buffered audio while keeping the session open for more input. Prior docs conflated the two.

Code examples in Python, Node.js, and browser JavaScript were updated to use close_stream when ending the stream.
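The difference between the two signals can be sketched without tying it to a particular WebSocket library. In the snippet below, send is any callable that writes a text frame and incoming is any iterable of raw JSON messages; both are stand-ins for a real client, and drain_after_close is a hypothetical helper.

```python
import json

# Mid-session flush: returns a final transcript, session stays open.
FINALIZE = json.dumps({"type": "finalize"})
# End of session: server flushes, sends is_last=true, then closes.
CLOSE_STREAM = json.dumps({"type": "close_stream"})


def drain_after_close(send, incoming):
    """Send close_stream, then collect final transcripts until is_last=true."""
    send(CLOSE_STREAM)
    finals = []
    for raw in incoming:
        msg = json.loads(raw)
        if msg.get("is_final") and msg.get("transcript"):
            finals.append(msg["transcript"])
        if msg.get("is_last"):
            break
    return finals
```

Sending FINALIZE instead would request a flush of buffered audio while leaving the connection open for more input.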

→ Pulse STT realtime quickstart