Code Examples
Two end-to-end Python examples: a simple Pulse Pro transcription, and an advanced Pulse transcription that adds gender detection, emotion detection, speaker diarization, and sentence-level timestamps. Both call the unified /waves/v1/stt/ endpoint with the requests library.
For plain English transcription where leaderboard accuracy matters most, use Pulse Pro (?model=pulse-pro). For multilingual audio or advanced features (gender, emotion, diarization with per-utterance speaker labels), use Pulse (?model=pulse). The endpoint and request shape are identical; only the model query param changes.
Pulse Pro: basic transcription
The simplest end-to-end flow: download a sample, convert it to 16 kHz mono WAV, and transcribe it with word timestamps.
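As a rough illustration of the transcription step, the sketch below builds and sends one request to /waves/v1/stt/ with model=pulse-pro. Only the endpoint path, the model query parameter, and the use of requests come from this page. The host in BASE_URL, the Bearer authorization header, the raw-WAV request body, and the word_timestamps option name are assumptions made for the example; check them against the API reference before relying on them.

```python
from pathlib import Path

# Assumption: placeholder host. Only the /waves/v1/stt/ path comes from this page.
BASE_URL = "https://api.example.com"
STT_PATH = "/waves/v1/stt/"


def build_stt_request(api_key, model="pulse-pro", **options):
    """Return (url, headers, params) for one transcription call."""
    url = BASE_URL + STT_PATH
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "audio/wav",           # assumed: raw WAV bytes as the body
    }
    params = {"model": model}
    # Option names (e.g. word_timestamps) are hypothetical.
    # Booleans are serialized as lowercase strings for the query string.
    for name, value in options.items():
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return url, headers, params


def transcribe(wav_path, api_key, model="pulse-pro", **options):
    """POST a preprocessed WAV file and return the decoded JSON response."""
    # Third-party dependency, imported here so build_stt_request stays stdlib-only.
    import requests

    url, headers, params = build_stt_request(api_key, model, **options)
    response = requests.post(
        url,
        headers=headers,
        params=params,
        data=Path(wav_path).read_bytes(),
        timeout=120,
    )
    response.raise_for_status()
    return response.json()
```

A call would then look like transcribe("sample_16k_mono.wav", api_key, word_timestamps=True). Keeping request construction separate from the network call makes it easy to inspect the query parameters without sending any audio.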
Pulse: advanced features (gender, emotion, diarization, utterances)
Pulse supports gender detection, emotion detection, and per-utterance speaker labels. The example below enables all of them.
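A minimal sketch of how such a request might be assembled, switching the model to pulse and turning on every advanced feature by default. The flag names in ADVANCED_FLAGS, the host in STT_URL, the Bearer header, and the raw-WAV body are all hypothetical stand-ins; the page only confirms the /waves/v1/stt/ path and the model query parameter.

```python
from pathlib import Path

# Assumption: placeholder host. Only the /waves/v1/stt/ path comes from this page.
STT_URL = "https://api.example.com/waves/v1/stt/"

# Hypothetical query-parameter names for the features described above.
ADVANCED_FLAGS = (
    "gender_detection",
    "emotion_detection",
    "diarize",
    "utterances",
    "word_timestamps",
)


def pulse_params(**overrides):
    """Build query params for model=pulse with every advanced flag enabled.

    Pass e.g. emotion_detection=False to switch a single feature off.
    """
    params = {"model": "pulse"}
    params.update({flag: "true" for flag in ADVANCED_FLAGS})
    for name, enabled in overrides.items():
        if name not in ADVANCED_FLAGS:
            raise ValueError(f"unknown feature flag: {name}")
        params[name] = "true" if enabled else "false"
    return params


def transcribe_advanced(wav_path, api_key, **overrides):
    """POST a WAV file to the Pulse model and return the decoded JSON response."""
    import requests  # third-party; imported lazily so pulse_params is stdlib-only

    response = requests.post(
        STT_URL,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "audio/wav",           # assumed request body format
        },
        params=pulse_params(**overrides),
        data=Path(wav_path).read_bytes(),
        timeout=120,
    )
    response.raise_for_status()
    return response.json()
```

Rejecting unknown flag names in pulse_params is a design choice for the sketch: a misspelled feature fails loudly on the client instead of being silently ignored.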
Prerequisites
pydub requires ffmpeg on PATH for non-WAV input formats.
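The ffmpeg requirement can be checked up front before converting audio to the 16 kHz mono WAV used by the examples. The sketch below assumes pydub is installed; the helper names needs_ffmpeg and to_16k_mono_wav, and the default output filename, are made up for illustration.

```python
import shutil
from pathlib import Path


def needs_ffmpeg(path):
    """Return True for any input that is not a .wav file (by extension)."""
    return Path(path).suffix.lower() != ".wav"


def to_16k_mono_wav(src, dst="preprocessed.wav"):
    """Convert an audio file to 16 kHz mono WAV and return the output path."""
    if needs_ffmpeg(src) and shutil.which("ffmpeg") is None:
        raise RuntimeError(f"ffmpeg not found on PATH; it is needed to decode {src}")

    # Third-party dependency, imported here so needs_ffmpeg works without it.
    from pydub import AudioSegment

    audio = AudioSegment.from_file(src)
    audio = audio.set_frame_rate(16000).set_channels(1)  # resample, then downmix
    audio.export(dst, format="wav")
    return dst
```

Checking shutil.which("ffmpeg") first turns a missing decoder into a clear error message rather than a failure from inside pydub. The extension check is a simplification: a mislabeled file would still need ffmpeg.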
What each example demonstrates
Expected output
The Pulse advanced example prints:
- Full transcription text
- Detected gender (male/female)
- Emotion scores: anger, disgust, fear, sadness, happiness
- Sentence-level utterances with timestamps and speaker IDs
- A count of word-level timestamps
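A sketch of how the items above could be printed from a decoded JSON response. The response keys used here (transcription, gender, emotions, utterances with start/end/speaker/text, and words) are assumptions about the payload shape, not confirmed field names, so adjust them to match what the API returns.

```python
EMOTIONS = ("anger", "disgust", "fear", "sadness", "happiness")


def summarize(result):
    """Format a transcription response (a dict) as a list of printable lines.

    All key names are hypothetical; missing keys are skipped rather than failing.
    """
    lines = [f"Transcription: {result.get('transcription', '')}"]

    if "gender" in result:
        lines.append(f"Gender: {result['gender']}")

    emotions = result.get("emotions", {})
    if emotions:
        lines.append("Emotions:")
        for name in EMOTIONS:
            if name in emotions:
                lines.append(f"  {name}: {emotions[name]:.2f}")

    # Sentence-level utterances with timestamps and speaker IDs.
    for utt in result.get("utterances", []):
        lines.append(
            f"[{utt['start']:.2f}s - {utt['end']:.2f}s] "
            f"speaker {utt['speaker']}: {utt['text']}"
        )

    lines.append(f"Word timestamps: {len(result.get('words', []))}")
    return lines
```

Given a response dict named result, print("\n".join(summarize(result))) would emit the summary. Returning lines instead of printing directly keeps the formatter easy to test.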

