
Features


The Real-Time Pulse STT WebSocket API supports the following features:

Available Features

Word Timestamps

Get precise timing information and a confidence score for each word in the transcription

Language Detection

Automatically detect the language of the audio

Sentence Timestamps (Utterances)

Get sentence-level transcription segments with timing information

PII & PCI Redaction

Automatically redact personally identifiable information and payment card information

Full Transcript

Get the cumulative transcript received so far, returned in responses where is_final is true
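
To illustrate what "cumulative" means here, the following is a minimal sketch that rebuilds a running transcript on the client from a sequence of already-parsed response messages. Only `is_final` comes from the description above; the `transcript` key and the helper name `cumulative_transcripts` are assumptions made for illustration, and the field the API actually uses for the full transcript may differ.

```python
from typing import Iterable


def cumulative_transcripts(messages: Iterable[dict]) -> list[str]:
    """Return the running transcript after each final message.

    Assumes each message is a dict with an `is_final` flag and a
    `transcript` string (hypothetical field name).
    """
    finals: list[str] = []
    snapshots: list[str] = []
    for msg in messages:
        # Skip interim results: they may still be revised by later messages.
        if not msg.get("is_final"):
            continue
        text = msg.get("transcript", "").strip()
        if text:
            finals.append(text)
        # Record the transcript accumulated up to this final message.
        snapshots.append(" ".join(finals))
    return snapshots


if __name__ == "__main__":
    stream = [
        {"transcript": "hello", "is_final": False},
        {"transcript": "hello world", "is_final": True},
        {"transcript": "how are you", "is_final": True},
    ]
    for snapshot in cumulative_transcripts(stream):
        print(snapshot)
```

With the Full Transcript feature enabled, the service supplies this accumulated text itself, so client-side bookkeeping like this should only be needed when the feature is off.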

Numeric Formatting

Control how numbers are formatted in transcriptions (digits, words, or auto-detect)

Word Boosting

Improve recognition accuracy for important words

Speaker Diarization

Identify and label different speakers in the audio, with speaker confidence scores
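
As a rough sketch of how diarized output might be consumed, the snippet below groups consecutive words that share a speaker label into turns. The per-word shape (`word` and `speaker` keys) and the helper name `speaker_turns` are assumptions for illustration only; check the API reference for the actual response fields.

```python
from itertools import groupby


def speaker_turns(words: list[dict]) -> list[tuple[int, str]]:
    """Collapse a word list into (speaker, text) turns.

    Assumes each word is a dict with `word` and `speaker` keys
    (hypothetical field names).
    """
    turns: list[tuple[int, str]] = []
    # groupby merges only adjacent items, so a speaker who talks again
    # later starts a new turn instead of being merged with an earlier one.
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        turns.append((speaker, " ".join(w["word"] for w in group)))
    return turns


if __name__ == "__main__":
    sample = [
        {"word": "hi", "speaker": 0},
        {"word": "there", "speaker": 0},
        {"word": "hello", "speaker": 1},
    ]
    for speaker, text in speaker_turns(sample):
        print(f"Speaker {speaker}: {text}")
```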
