Performance
Latency Metrics
Time-to-First-Transcript (TTFT)
Our Pulse STT model provides State of the art TTFT latency of ~64ms, which is one of the least in the world.
TTFT Comparison Analysis
TTFT (Time to First Transcript) measures the latency between when a user stops speaking and when the model returns the complete transcript. Lower TTFT means faster response times and better user experience in real-time applications.
Accuracy Metrics
Word Error Rate (WER)
All models were evaluated on the FLEURS dataset, a standardised multilingual speech benchmark ensuring fair cross-model comparison.
Throughput
Requests Per Second
Throughput varies based on audio length, format, and server load
Performance by Audio Format
Linear16 (PCM)
- Latency: Lowest (~64ms)
- Accuracy: Highest
- Bandwidth: Highest
- Best for: High-quality applications
Opus
- Latency: Low (~70-80ms)
- Accuracy: High
- Bandwidth: Low
- Best for: Browser/mobile applications
FLAC
- Latency: Medium (~80-90ms)
- Accuracy: Highest
- Bandwidth: Medium
- Best for: Archival/quality-critical use cases
μ-law
- Latency: Low (~65-75ms)
- Accuracy: Good
- Bandwidth: Lowest
- Best for: Telephony applications
Performance by Language
High-Performance Languages
- Italian: 4.2% WER, ~64ms latency
- English: 5.1% WER, ~64ms latency
- Spanish: 5.4% WER, ~64ms latency
- Portuguese: 7.1% WER, ~64ms latency
- German: 8.5% WER, ~64ms latency
- French: 9.2% WER, ~64ms latency
Regional Variations
- Indian Languages: 10-15% WER, ~90-100ms latency
- Eastern European: 9-12% WER, ~85-95ms latency
Feature Impact on Performance
Diarization
- Latency Impact: +10-20ms
- Accuracy Impact: Minimal
- Use When: Multiple speakers present
Word Timestamps
- Latency Impact: +5-10ms
- Accuracy Impact: None
- Use When: Timing information needed
Emotion Detection
- Latency Impact: +15-25ms
- Accuracy Impact: None
- Use When: Emotion analysis required
Optimization Tips
- Use 16kHz sample rate for optimal balance
- Choose linear16 format for lowest latency
- Enable only needed features to reduce latency
- Batch process when latency isn’t critical

