Unified Speech-to-Text endpoint, Pulse Pro model
The Speech-to-Text API now lives at the unified path /waves/v1/stt/, mirroring the unified TTS shape. The model is selected via the ?model= query parameter. Two models are live today:
- ?model=pulse: multilingual (17 streaming + 26 pre-recorded languages), HTTP + WebSocket streaming.
- ?model=pulse-pro: leaderboard-ranked English STT (5.42% ESB avg WER, tied for #2 on the public Open ASR Leaderboard). HTTP only.
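As a rough illustration of model selection through the query string, the sketch below builds the unified URL for either model. The host api.example.com is a placeholder (this entry gives only the path, not the domain), and the helper name stt_url is hypothetical.

```python
from urllib.parse import urlencode, urlunsplit

# Placeholder host: the changelog states only the path, not the domain.
API_HOST = "api.example.com"
STT_PATH = "/waves/v1/stt/"

# The two models this entry lists as live.
KNOWN_MODELS = {"pulse", "pulse-pro"}


def stt_url(model: str, host: str = API_HOST) -> str:
    """Return the unified STT URL with the model set via ?model=."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {sorted(KNOWN_MODELS)}")
    query = urlencode({"model": model})
    return urlunsplit(("https", host, STT_PATH, query, ""))


if __name__ == "__main__":
    for name in sorted(KNOWN_MODELS):
        print(stt_url(name))
```

Validating the model name locally is a convenience only; the server remains the authority on which models exist.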
Pulse Pro is not yet available for streaming. A request to the streaming endpoint (WS /waves/v1/stt/live?model=pulse-pro) is rejected with HTTP 400 before the WebSocket upgrade, because the Pulse Pro streaming worker is not yet deployed. Use the HTTP endpoint instead, and pass webhook_url for long files.
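A client can avoid the 400 by checking the model before it attempts a WebSocket connection. The following is a minimal sketch of such a guard; the function name pick_endpoint is hypothetical, and it assumes Pulse streams on the same /waves/v1/stt/live path that the entry shows for Pulse Pro.

```python
# Models that, per this entry, currently accept WebSocket streaming.
STREAMING_MODELS = {"pulse"}
ALL_MODELS = {"pulse", "pulse-pro"}


def pick_endpoint(model: str, streaming: bool) -> tuple[str, str]:
    """Return (transport, path) for a request, failing fast on unsupported combos."""
    if model not in ALL_MODELS:
        raise ValueError(f"unknown model {model!r}")
    if streaming:
        if model not in STREAMING_MODELS:
            # The server would answer 400 before the upgrade; fail locally instead.
            raise ValueError(f"{model} is HTTP only; use /waves/v1/stt/ with webhook_url for long files")
        # Assumption: Pulse streams on the same /live path shown for Pulse Pro.
        return ("ws", f"/waves/v1/stt/live?model={model}")
    return ("http", f"/waves/v1/stt/?model={model}")


if __name__ == "__main__":
    print(pick_endpoint("pulse", streaming=True))
    print(pick_endpoint("pulse-pro", streaming=False))
```

Keeping the streaming-capable models in one set means the guard needs a one-line change once the Pulse Pro streaming worker ships.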
Customer pricing (Standard plan):
- Pulse, streaming (WebSocket): $0.006 / minute
- Pulse, non-streaming (HTTP): $0.0035 / minute
- Pulse Pro, non-streaming (HTTP): $0.004 / minute
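To make the per-minute rates concrete, here is a small cost estimator. It assumes usage is pro-rated by the second, which this entry does not state; actual billing granularity may differ. The names RATES_PER_MINUTE and estimate_cost are illustrative.

```python
from decimal import Decimal

# USD per minute on the Standard plan, copied from the list above.
RATES_PER_MINUTE = {
    ("pulse", "streaming"): Decimal("0.006"),
    ("pulse", "http"): Decimal("0.0035"),
    ("pulse-pro", "http"): Decimal("0.004"),
}


def estimate_cost(model: str, mode: str, audio_seconds: int) -> Decimal:
    """Estimate the USD cost of transcribing audio_seconds of audio.

    Assumption: billing is pro-rated per second (not confirmed by the changelog).
    """
    try:
        rate = RATES_PER_MINUTE[(model, mode)]
    except KeyError:
        raise ValueError(f"no published rate for {model!r} over {mode!r}") from None
    return rate * Decimal(audio_seconds) / Decimal(60)


if __name__ == "__main__":
    one_hour = 3600
    for (model, mode) in RATES_PER_MINUTE:
        print(model, mode, estimate_cost(model, mode, one_hour))
```

Decimal is used instead of float so that small per-minute rates do not pick up binary rounding error when summed over many requests.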
On the Standard plan, rate limits default to 25 requests per minute (RPM) per model and 100 concurrent WebSocket sessions. Enterprise is unlimited, with limits configurable per customer.
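A client that wants to stay under the 25 RPM default can throttle itself before sending. The sketch below is a client-side sliding-window limiter; it assumes nothing about how the server counts requests (fixed or sliding window), so treat it as a conservative approximation. The class name SlidingWindowLimiter is hypothetical, and one instance per model would be needed because the limit is per model.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_s-second span."""

    def __init__(
        self,
        max_requests: int = 25,  # Standard plan default RPM per model
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window_s = window_s
        self._clock = clock
        self._stamps: deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a request and return True, or return False if the window is full."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window_s:
            self._stamps.popleft()
        if len(self._stamps) < self._max:
            self._stamps.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()
    allowed = sum(limiter.try_acquire() for _ in range(30))
    print(f"{allowed} of 30 immediate requests allowed")
```

The injectable clock exists so the limiter can be tested without sleeping; production code can leave it at the default.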
The existing endpoints (POST /waves/v1/pulse/get_text and WS /waves/v1/pulse/get_text) continue to work alongside the new unified path. New integrations are encouraged to use /waves/v1/stt/ since it carries both models behind one path.
- Pulse Pro model card
- Speech-to-Text quickstart covers both models.

