Pulse STT — streaming now supports ja / yue / zh / ko + multi-asian aggregator (US region)
The Pulse streaming Speech-to-Text API now supports four Asian languages: Japanese (ja), Cantonese (yue), Mandarin (zh), and Korean (ko). The multi-asian aggregator is also available for audio whose East Asian language is not known in advance; it auto-detects across the same four-language set.
These five language values (the four codes above plus the multi-asian aggregator) are served from the US region only. Connect to wss://api.us.smallest.ai/waves/v1/stt/live?model=pulse (or the legacy wss://api.us.smallest.ai/waves/v1/pulse/get_text) instead of the default wss://api.smallest.ai/... host. Requesting any of them on the default (ap-south-1) host closes the connection without a transcription frame.
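As a rough client-side guard for the routing rule above, the sketch below builds the US-region streaming URL for one of the five values and rejects anything else. It is an illustration under stated assumptions, not the official client: the query parameter name `language` and the literal value `multi-asian` are assumed, and the helper names are hypothetical. Only the host, path, and `model=pulse` come from this note.

```python
from urllib.parse import urlencode

# Host and path taken from this changelog entry.
US_HOST = "wss://api.us.smallest.ai"
STT_PATH = "/waves/v1/stt/live"

# Values served from the US region only. The literal "multi-asian" is an
# assumption about how the aggregator is spelled on the wire.
US_ONLY_LANGUAGES = {"ja", "yue", "zh", "ko", "multi-asian"}


def requires_us_region(language: str) -> bool:
    """Return True if this language value must be sent to the US host."""
    return language in US_ONLY_LANGUAGES


def build_us_stream_url(language: str) -> str:
    """Build the US-region streaming URL for a US-only language value.

    The "language" query parameter name is an assumption, not confirmed
    by this changelog entry.
    """
    if not requires_us_region(language):
        raise ValueError(f"{language!r} is not one of the US-only language values")
    query = urlencode({"model": "pulse", "language": language})
    return f"{US_HOST}{STT_PATH}?{query}"
```

Checking the language before opening the socket avoids the silent close described above, where the default host ends the connection without sending a transcription frame.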
All other Pulse streaming parameters work as before: punctuate, capitalize, numerals, word_timestamps, diarize, redact_pii, redact_pci. Supported sample rates are still 8000/16000/22050/24000/44100/48000, and supported encodings are still linear16 (default) / linear32 / alaw / mulaw / opus / ogg_opus.
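The supported values listed above can be checked locally before connecting. The following is a minimal sketch, assuming the options are passed by the names used in this note; the function name `validate_stream_options` and the idea of returning a list of problems are illustrative choices, not part of the Pulse API.

```python
from typing import Iterable, List

# Values copied from this changelog entry.
SUPPORTED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
SUPPORTED_ENCODINGS = {"linear16", "linear32", "alaw", "mulaw", "opus", "ogg_opus"}
KNOWN_FLAGS = {
    "punctuate", "capitalize", "numerals", "word_timestamps",
    "diarize", "redact_pii", "redact_pci",
}


def validate_stream_options(
    sample_rate: int,
    encoding: str = "linear16",  # linear16 is the documented default
    flags: Iterable[str] = (),
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {sample_rate}")
    if encoding not in SUPPORTED_ENCODINGS:
        problems.append(f"unsupported encoding: {encoding}")
    for flag in flags:
        if flag not in KNOWN_FLAGS:
            problems.append(f"unknown option: {flag}")
    return problems
```

For example, `validate_stream_options(16000, "mulaw", ["diarize"])` returns an empty list, while an unlisted rate such as 11025 produces one problem entry.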
Pre-recorded (batch) is unchanged. Cantonese (yue) is not enabled on the batch endpoint at all. Japanese, Mandarin, and Korean behave differently on batch than on streaming and are not formally supported there. Use the streaming endpoint for all four languages.
Example response frame:
→ Pulse model card: Supported Languages
→ Speech-to-Text overview

