STT Performance page rebuilt against the FLEURS, ESB, and WildASR sources; speaker-cap language scrubbed from the Pulse model card
The Pulse STT Performance page (/waves/documentation/speech-to-text-pulse/benchmarks/performance) now reflects the canonical benchmark comparison across five dataset sections:
- FLEURS — pre-recorded (29 languages)
- FLEURS — streaming (8 languages)
- ESB — streaming (9 English domains)
- WildASR — streaming (8 robustness conditions)
- Internal English perturbation suite (12 categories)
Each dataset section includes the source description and the full Smallest Pulse vs Deepgram Nova 2 vs Nova 3 comparison. Several previously published sections that were not backed by any source benchmark — “Performance by Language”, “High-Performance Languages”, “Regional Variations”, “Performance by Audio Format”, and “Feature Impact on Performance” — have been removed. The per-language latency pairings (e.g. “Italian: 4.2% WER, ~64ms latency”) were incorrect; latency is not language-specific and is now reported once at the top of the page.
The Pulse model card (/waves/model-cards/speech-to-text/pulse) is also updated:
- Speaker-diarization phrasing such as “capped at 4 speakers; known issues” has been removed from the Key Capabilities card, the Features — Non-streaming table, and the Limitations & Safety bullets.
- The ESB, WildASR, and Internal English perturbation tables are now mirrored into the model card so the per-feature page and the model card stay in sync.
- The single hyperlink on Keyword boosting in the Features — Streaming table has been removed for consistency: no other feature in that table links out from its cell.

