
Features


The Real-Time Pulse STT WebSocket API supports the following features:

Available Features

Word Timestamps

Get precise timing information for each word in the transcription, along with confidence scores

Language Detection

Automatically detect the language of the audio

Sentence Timestamps (Utterances)

Get sentence-level transcription segments with timing information

PII & PCI Redaction

Automatically redact personally identifiable information and payment card information

Full Transcript

Get the cumulative transcript received so far, returned in responses where is_final is true
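
As a rough illustration of what a cumulative transcript means from the client's point of view, the sketch below stitches together the text of finalized responses. Only is_final comes from the description above; the `transcript` field name, the message shape, and the helper `accumulate_final_transcript` are hypothetical, so consult the Response Format page for the actual schema.

```python
import json
from typing import Iterable


def accumulate_final_transcript(raw_messages: Iterable[str]) -> str:
    """Join the text of all finalized responses into one running transcript.

    Assumption: each message is a JSON object with a boolean `is_final`
    and a text field named `transcript` (hypothetical field name).
    """
    parts = []
    for raw in raw_messages:
        message = json.loads(raw)
        # Interim (non-final) results may still change, so skip them.
        if not message.get("is_final", False):
            continue
        text = message.get("transcript", "").strip()
        if text:
            parts.append(text)
    return " ".join(parts)


# Example with made-up messages (not real Pulse output).
sample = [
    '{"is_final": false, "transcript": "hello wor"}',
    '{"is_final": true, "transcript": "Hello world."}',
    '{"is_final": true, "transcript": "How are you?"}',
]
print(accumulate_final_transcript(sample))
```

With the Full Transcript feature enabled, the service is described as returning the cumulative text itself, so stitching like this would mainly serve as a fallback or a sanity check.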

Numeric Formatting

Control how numbers are formatted in transcriptions (digits, words, or auto-detect)

Speaker Diarization

Identify and label different speakers in the audio, along with speaker confidence scores

Keyword Boosting

Boost recognition accuracy for specific words, brand names, and domain terms

Punctuation Formatting

Control punctuation and capitalization formatting in transcripts

End-of-Utterance Timeout

Control how long Pulse waits after speech ends before finalizing the transcript
