---
title: Best Practices
description: Prepare audio inputs before submitting them to Pulse STT
---

# Pre-recorded best practices

Follow these recommendations to keep Pulse STT latencies low while preserving transcript fidelity.

## Audio preprocessing workflow

### Convert with FFmpeg

```bash
# Convert to 16 kHz mono WAV (recommended ingest format)
ffmpeg -i input.mp3 -ar 16000 -ac 1 -sample_fmt s16 output.wav

# Convert to 16 kHz mono MP3 at 128 kbps (when a compressed file is preferred)
ffmpeg -i input.wav -ar 16000 -ac 1 -b:a 128k output.mp3
```

### Python example

```python
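from pydub import AudioSegment  # pydub relies on FFmpeg to decode MP3

# Load the source recording.
audio = AudioSegment.from_file("input.mp3")
# Resample to 16 kHz, downmix to mono, and store 16-bit samples.
audio = audio.set_frame_rate(16000).set_channels(1).set_sample_width(2)
audio.export("output.wav", format="wav")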
```
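
To confirm that an exported file matches the target format, its header can be inspected with Python's standard `wave` module. The following is a minimal sketch, not part of any Pulse STT SDK: the helper name `check_wav_format` is hypothetical, and the 16-bit sample width is an assumption taken from the `-sample_fmt s16` flag in the FFmpeg command.

```python
import wave


def check_wav_format(path, rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches between a WAV header and the target format.

    Hypothetical helper. The defaults assume 16 kHz, mono, 16-bit PCM.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {rate}"
            )
        if wav.getnchannels() != channels:
            problems.append(
                f"found {wav.getnchannels()} channels, expected {channels}"
            )
        if wav.getsampwidth() != sample_width:
            problems.append(
                f"sample width is {wav.getsampwidth()} bytes, expected {sample_width}"
            )
    return problems
```

Calling `check_wav_format("output.wav")` returns an empty list when the file already matches; any returned strings describe what still needs converting.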

### JavaScript example

```javascript
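// Uses the @ffmpeg/ffmpeg 0.11.x API; 0.12 and later replace createFFmpeg with an FFmpeg class.
import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg';

const ffmpeg = createFFmpeg({ log: true });
await ffmpeg.load(); // load the WebAssembly core

// Copy the source into FFmpeg's in-memory file system.
ffmpeg.FS('writeFile', 'input.mp3', await fetchFile('input.mp3'));
// Resample to 16 kHz mono, matching the CLI command above.
await ffmpeg.run('-i', 'input.mp3', '-ar', '16000', '-ac', '1', 'output.wav');
const data = ffmpeg.FS('readFile', 'output.wav'); // Uint8Array of WAV bytes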
```

## Quality checklist

1. **Use 16 kHz mono** whenever possible; downsample recordings captured at higher sample rates and downmix stereo to a single channel.
2. **Normalize audio levels** so peaks stay consistent across large batches.
3. **Remove silence** at the beginning and end to avoid wasted compute.
4. **Handle multiple speakers** by enabling diarization when agents and customers share a channel.
5. **Test with a sample clip** before launching full backfills to validate accuracy and metadata.
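
Items 2 and 3 of the checklist can be illustrated at the sample level. The sketch below works on 16-bit PCM samples held in memory and uses only the standard library. The function names, the 0.9 peak target, and the silence threshold of 500 are illustrative assumptions, not values recommended by Pulse STT. Amplitude thresholding is also the simplest possible approach; FFmpeg's `loudnorm` and `silenceremove` filters are more robust alternatives for production batches.

```python
from array import array


def peak_normalize(samples, target_peak=0.9, full_scale=32767):
    """Scale 16-bit samples so the loudest one sits at target_peak of full scale.

    target_peak=0.9 is an assumed default, chosen only for illustration.
    """
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return array("h", samples)  # all-silent input: nothing to scale
    gain = target_peak * full_scale / peak
    # Clamp to the int16 range to guard against rounding overshoot.
    return array(
        "h", (max(-32768, min(32767, round(s * gain))) for s in samples)
    )


def trim_silence(samples, threshold=500):
    """Drop leading and trailing samples whose magnitude is below threshold.

    threshold=500 is an assumed value; tune it against real recordings.
    """
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return array("h")  # nothing rises above the threshold
    return array("h", samples[loud[0]:loud[-1] + 1])
```

Trimming before normalizing keeps the two steps independent: the silence threshold is applied to the original levels, and the gain is computed only from the audio that is actually kept.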