Speaker diarization
Pre-Recorded
Real-Time
Enabling speaker diarization
Pre-Recorded API
Pass diarize=true when calling the Pulse STT POST endpoint. The parameter can be combined with other enrichment options (timestamps, emotions, etc.) without changing your audio payload.
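As a minimal sketch of such a request, the Python snippet below builds (but does not send) a POST request with diarize=true. The endpoint URL is a placeholder, and sending the flag as a query parameter, the Bearer authorization header, and the raw-bytes body are all assumptions made for illustration; check the API reference for the actual values.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder: substitute the real Pulse STT POST endpoint from the API reference.
PULSE_STT_URL = "https://api.example.invalid/pulse/stt"


def build_diarized_request(audio: bytes, api_key: str, **extra_params: str) -> Request:
    """Build a POST request that enables diarization.

    Assumptions (not confirmed by this page): options travel as query
    parameters, auth uses a Bearer token, and the body is raw audio bytes.
    """
    # diarize=true sits alongside any other enrichment options.
    params = {"diarize": "true", **extra_params}
    url = f"{PULSE_STT_URL}?{urlencode(params)}"
    return Request(
        url,
        data=audio,  # the audio payload itself is unchanged
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",        # assumed auth scheme
            "Content-Type": "application/octet-stream",  # assumed body format
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. Other options can be supplied as keyword arguments; their exact parameter names should be taken from the API reference.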
Real-Time WebSocket API
Add diarize=true to your WebSocket connection query parameters when connecting to the Pulse STT WebSocket API.
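One way to add the parameter without disturbing query parameters you already pass is sketched below in Python. The helper name with_diarization and the example URL are hypothetical; only the diarize=true parameter comes from this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_diarization(ws_url: str) -> str:
    """Return ws_url with diarize=true set, keeping existing query parameters."""
    parts = urlsplit(ws_url)
    # Parse the existing query string into a dict so diarize is not duplicated.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["diarize"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; use the real Pulse STT WebSocket URL from the API reference.
url = with_diarization("wss://api.example.invalid/pulse/stt?language=en")
```

The resulting URL can then be handed to whichever WebSocket client you already use.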
Output format & fields of interest
When diarization is enabled, every entry in words carries speaker information, and its format depends on the API. In the real-time API, each entry includes a speaker field (integer ID: 0, 1, …) and a speaker_confidence field (0.0 to 1.0). In the pre-recorded API, speakers are identified by string labels (speaker_0, speaker_1, …). The utterances array also carries speaker labels, so you can reconstruct conversations, build turn-taking analytics, or display multi-speaker captions.
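To illustrate how per-word speaker labels can be turned into conversation turns, here is a short Python sketch that merges consecutive words from the same speaker. It assumes each entry in words stores its text under a "word" key and its label under "speaker" in both APIs; those key names and the sample data are illustrative assumptions, not a documented response.

```python
from typing import Any


def speaker_id(label: Any) -> int:
    """Normalize an integer ID (0) or a string label ('speaker_0') to an int."""
    if isinstance(label, int):
        return label
    return int(str(label).rsplit("_", 1)[-1])


def group_turns(words: list[dict[str, Any]]) -> list[tuple[int, str]]:
    """Merge consecutive words spoken by the same speaker into turns."""
    turns: list[tuple[int, list[str]]] = []
    for entry in words:
        sid = speaker_id(entry["speaker"])  # assumed key name
        if turns and turns[-1][0] == sid:
            turns[-1][1].append(entry["word"])  # assumed key name
        else:
            turns.append((sid, [entry["word"]]))
    return [(sid, " ".join(tokens)) for sid, tokens in turns]


# Made-up sample in the string-label style described for the pre-recorded API.
sample = [
    {"word": "Hello", "speaker": "speaker_0"},
    {"word": "there", "speaker": "speaker_0"},
    {"word": "Hi", "speaker": "speaker_1"},
]
turns = group_turns(sample)
```

Normalizing both label styles to integers lets the same grouping logic serve real-time and pre-recorded results. For real-time output, speaker_confidence could additionally be used to skip or flag low-confidence words before grouping.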

