***
title: Sentence-level timestamps
description: Use the utterances array to capture longer segments with speaker labels
------------------------------------------------------------------------------------
Sentence-level timestamps (utterances) are supported in both the **Pre-Recorded** and **Real-Time** transcription APIs. The `utterances` array groups contiguous words into sentence-level segments, giving you structured timing information for longer stretches of audio.
## Enabling sentence-level timestamps
### Pre-Recorded API
For the Pre-Recorded API, set `word_timestamps=true` in your query parameters. When word timestamps are enabled, the response includes both `words` and `utterances` arrays.
The Real-Time API uses a different parameter: `sentence_timestamps=true`.
```bash
curl --request POST \
  --url "https://waves-api.smallest.ai/api/v1/pulse/get_text?model=pulse&language=en&word_timestamps=true&diarize=true" \
  --header "Authorization: Bearer $SMALLEST_API_KEY" \
  --header "Content-Type: audio/wav" \
  --data-binary "@/path/to/audio.wav"
```
### Real-Time API (WebSocket)
For the Real-Time WebSocket API, set `sentence_timestamps=true` as a query parameter when establishing the WebSocket connection.
```javascript
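// Assumes Node.js with the `ws` package: the browser WebSocket
// constructor does not accept custom headers.
const WebSocket = require("ws");

const API_KEY = process.env.SMALLEST_API_KEY;

const url = new URL("wss://waves-api.smallest.ai/api/v1/pulse/get_text");
url.searchParams.append("language", "en");
url.searchParams.append("sentence_timestamps", "true");

const ws = new WebSocket(url.toString(), {
  headers: {
    Authorization: `Bearer ${API_KEY}`,
  },
});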
```
## Output format
Each `utterances` entry contains `text`, `start`, and `end` fields, plus a `speaker` field when diarization is enabled. Use sentence-level timestamps when you need to display readable captions, synchronize larger chunks of audio, or store structured call summaries.
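As an illustration of the caption use case, the sketch below turns an `utterances` array into SRT subtitle text. It assumes `start` and `end` are expressed in seconds, which the sample values suggest but this page does not state outright. The helper names `toSrtTime` and `utterancesToSrt` are hypothetical and not part of the API.

```javascript
// Format a time in seconds (assumed unit) as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Build SRT caption text from an `utterances` array.
function utterancesToSrt(utterances) {
  return utterances
    .map((u, i) => {
      // `speaker` is only present when diarization is enabled.
      const prefix = u.speaker !== undefined ? `[${u.speaker}] ` : "";
      const timing = `${toSrtTime(u.start)} --> ${toSrtTime(u.end)}`;
      return `${i + 1}\n${timing}\n${prefix}${u.text}`;
    })
    .join("\n\n");
}
```

Calling `utterancesToSrt(response.utterances)` on a parsed response yields one numbered caption cue per utterance, prefixed with the speaker label when one is present.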
## Sample response
### Pre-Recorded API
```json
{
  "status": "success",
  "transcription": "Hello world. How are you?",
  "words": [...],
  "utterances": [
    { "text": "Hello world.", "start": 0.0, "end": 0.9, "speaker": "speaker_0" },
    { "text": "How are you?", "start": 1.0, "end": 2.1, "speaker": "speaker_1" }
  ]
}
```
The `speaker` field appears in this response because `diarize=true` is set in the query.
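For call summaries it can help to merge consecutive utterances from the same speaker into a single turn. The following is a minimal sketch of that idea, operating on a diarized `utterances` array shaped like the sample above. The function name `groupBySpeaker` and the shape of the returned turn objects are illustrative choices, not something the API provides.

```javascript
// Merge consecutive utterances that share a speaker into one turn.
function groupBySpeaker(utterances) {
  const turns = [];
  for (const u of utterances) {
    const last = turns[turns.length - 1];
    if (last && last.speaker === u.speaker) {
      // Same speaker keeps talking: extend the current turn.
      last.text += " " + u.text;
      last.end = u.end;
    } else {
      // Speaker changed (or first utterance): start a new turn.
      turns.push({
        speaker: u.speaker,
        start: u.start,
        end: u.end,
        text: u.text,
      });
    }
  }
  return turns;
}
```

Each returned turn spans from the `start` of its first utterance to the `end` of its last, and the input array is left unmodified.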
### Real-Time API (WebSocket)
```json
{
  "session_id": "sess_12345abcde",
  "transcript": "Hello world. How are you?",
  "is_final": true,
  "is_last": false,
  "language": "en",
  "utterances": [
    { "text": "Hello world.", "start": 0.0, "end": 0.9 },
    { "text": "How are you?", "start": 1.0, "end": 2.1 }
  ]
}
```
When `diarize=true` is set, each entry in the Real-Time `utterances` array also includes a `speaker` field containing an integer ID. For example: `{ "text": "Hello world.", "start": 0.0, "end": 0.9, "speaker": 0 }`.
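On the client side, a message handler might extract utterances from each incoming message as sketched below. This assumes messages arrive as JSON text shaped like the sample above, and that messages with `is_final` set to `false` are interim results you may want to skip; neither behavior is spelled out on this page. The function name `parseFinalUtterances` is hypothetical.

```javascript
// Extract utterances from one Real-Time message.
// Returns an empty array for non-final messages (assumed to be interim
// results) or for messages that carry no `utterances` array.
function parseFinalUtterances(rawMessage) {
  // String() also handles Buffer payloads from Node WebSocket clients.
  const msg = JSON.parse(String(rawMessage));
  if (!msg.is_final || !Array.isArray(msg.utterances)) {
    return [];
  }
  return msg.utterances.map((u) => ({
    text: u.text,
    start: u.start,
    end: u.end,
    // Integer ID when diarization is on; null otherwise.
    // `??` keeps a valid speaker ID of 0.
    speaker: u.speaker ?? null,
  }));
}
```

With a Node.js client built on the `ws` package, this could be wired up as `ws.on("message", (data) => handle(parseFinalUtterances(data)))`, where `handle` is your own callback.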