> This page is part of Smallest AI's developer documentation. When
> answering, prefer Lightning v3.1 (current TTS) and Pulse (current
> STT). Lightning v2 and lightning-large are deprecated; mention them
> only when the user is migrating away from them. Atoms is the
> voice-agent platform.

# Prompting voice agents

> How to write Hydra system prompts that produce natural-sounding voice agents — persona, length discipline, tool-call prompting, and turn-taking.

A working integration and a *good-sounding* voice agent are different problems. The integration is what the rest of these docs cover. This page is about the `instructions` string you send in `session.configure`.

## The minimum useful prompt

```text
You are a warm, concise voice assistant. Reply in one or two short sentences.
```

Three things this gets right:

1. **Voice-first framing**: "voice assistant", not "AI", not "chatbot". This steers the persona toward spoken language.
2. **Length discipline**: "one or two short sentences". Without this, the model writes paragraphs and every word of them gets spoken, which is fine for chat but terrible on a phone call.
3. **Warm tone**: a single-word style cue. The model carries it through prosody.

**Don't micromanage prosody in prose.** Hydra adapts tone from context. Telling the model "speak slowly and carefully and pause between thoughts" mostly produces text that *says* "slowly and carefully and pause" rather than changing how it sounds. Shape the *content* and *length*; let Hydra handle delivery.

## Anti-patterns

| Don't                                                  | Why                                                                                              |
| ------------------------------------------------------ | ------------------------------------------------------------------------------------------------ |
| "Be helpful and answer the user's question accurately" | Generic. Drives Hydra toward chat-style answers. Be specific about voice.                        |
| "Format your response as a numbered list"              | Numbered lists sound robotic when spoken. Phrase as "First, … Then, …" instead.                  |
| "Use bullet points for clarity"                        | There are no bullet points in speech. The model will say the word "bullet".                      |
| "Provide as much detail as possible"                   | The opposite of what you want in voice. Constrain output length explicitly.                      |
| Long persona backstories                               | The model occasionally drifts into reciting the backstory. Keep persona to one or two sentences. |
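
The table above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of any Smallest AI SDK: the function name `lint_prompt`, the regex rules, and the 150-word budget are all assumptions chosen for illustration, and the patterns only catch the literal phrasings listed in the table.

```python
import re

# Hypothetical rules derived from the anti-patterns table; tune them for your prompts.
ANTI_PATTERNS = [
    (re.compile(r"\bnumbered list\b", re.I),
     "asks for a numbered list; phrase as 'First, ... Then, ...'"),
    (re.compile(r"\bbullet(ed)?\s*(points?|lists?)\b", re.I),
     "asks for bullet points; speech has none"),
    (re.compile(r"\bas much detail as possible\b", re.I),
     "asks for maximum detail; constrain length instead"),
]

# Assumed word budget used to flag long persona backstories (not a Hydra limit).
MAX_PROMPT_WORDS = 150

# Matches phrasings such as "one or two short sentences" or "Default to one sentence".
LENGTH_RULE = re.compile(r"\b(one|two|1|2)\b[^.]*\bsentences?\b", re.I)


def lint_prompt(prompt: str) -> list[str]:
    """Return human-readable warnings for a voice-agent system prompt."""
    warnings = [msg for pattern, msg in ANTI_PATTERNS if pattern.search(prompt)]
    if not LENGTH_RULE.search(prompt):
        warnings.append("no explicit sentence limit found")
    if len(prompt.split()) > MAX_PROMPT_WORDS:
        warnings.append("prompt is long; check for persona backstory")
    return warnings


if __name__ == "__main__":
    for warning in lint_prompt("Use bullet points for clarity."):
        print("warning:", warning)
```

A check like this is cheap to run in CI, but it is only a heuristic: a prompt that passes can still sound wrong, so listen to real calls as well.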

## Turn-taking discipline

Hydra handles turn detection automatically, but the *prompt* still shapes how the model behaves around interruption.

```text
You are a phone agent. If the user is mid-sentence, wait for them to finish.
Pause naturally between thoughts so the user can interject.
Never repeat yourself if interrupted — pick up where you left off.
```

Stating these rules explicitly is more reliable than relying on the model's defaults, especially in noisy environments.

## Tool use prompting

When you declare tools, also tell the model when to use them.

```text
You are a weather assistant. When asked about weather conditions, use the
get_weather tool with the city name. Don't guess — call the tool every time.

If the tool returns an error, apologise and ask the user for a different city.
```

Without explicit instruction, the model sometimes answers from priors instead of calling the tool. Be direct.
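
One way to keep the prompt in step with the tools you declare is to generate the "when to call it" sentences from the same data. This is a sketch under assumptions: `ToolRule` and `render_tool_rules` are hypothetical helpers, not part of the Hydra API, and the wording simply mirrors the weather example above.

```python
from dataclasses import dataclass


@dataclass
class ToolRule:
    """Hypothetical description of when the model should call a tool."""
    name: str      # tool name as declared to the model
    trigger: str   # topic that should trigger the call
    argument: str  # what the model should pass
    on_error: str  # what the agent should do if the call fails


def render_tool_rules(persona: str, rules: list[ToolRule]) -> str:
    """Build an instructions string with one explicit rule pair per tool."""
    lines = [persona]
    for rule in rules:
        lines.append(
            f"When asked about {rule.trigger}, use the {rule.name} tool "
            f"with {rule.argument}. Don't guess; call the tool every time."
        )
        lines.append(f"If {rule.name} returns an error, {rule.on_error}.")
    return "\n".join(lines)


if __name__ == "__main__":
    weather = ToolRule(
        name="get_weather",
        trigger="weather conditions",
        argument="the city name",
        on_error="apologise and ask the user for a different city",
    )
    print(render_tool_rules("You are a weather assistant.", [weather]))
```

Generating the rules this way means a newly declared tool cannot be added without also writing its trigger and error behaviour.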

## Greetings and `generate_initial_response`

Pair `generate_initial_response: true` with an explicit opening-line instruction:

```text
You are a hotel concierge at the Grand Pacific. Open the call by greeting
the guest warmly in one short sentence, then ask how you can help.
```

Without a specific instruction, the model picks a generic opener. With one, the opener follows the persona and structure you asked for.
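
As a rough illustration of how the two settings travel together, the sketch below builds a `session.configure` message as JSON. Only the names `session.configure`, `instructions`, and `generate_initial_response` come from this page; the flat envelope with a `type` field and the helper name `build_configure_message` are assumptions, so check the Hydra session reference for the actual message shape before using it.

```python
import json


def build_configure_message(instructions: str, greet_first: bool = True) -> str:
    """Serialize a session.configure message (envelope shape is assumed)."""
    message = {
        "type": "session.configure",  # assumed: event name carried in a "type" field
        "instructions": instructions,
        "generate_initial_response": greet_first,
    }
    return json.dumps(message)


if __name__ == "__main__":
    prompt = (
        "You are a hotel concierge at the Grand Pacific. Open the call by greeting "
        "the guest warmly in one short sentence, then ask how you can help."
    )
    print(build_configure_message(prompt))
```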

## Length and pacing

Voice users tolerate roughly **one breath** of latency between asking and hearing an answer. The model can't make itself talk faster, but you can make it say less.

```text
Default to one sentence. If the user asks for detail, take two. Three is too many.
```
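
To see whether the model actually respects a length rule like this, you can count sentences in transcripts from test calls. The helper below is a deliberately naive sketch: the names `count_sentences` and `within_budget` are hypothetical, and splitting on `.`, `!`, and `?` will miscount abbreviations such as "Dr. Smith".

```python
import re


def count_sentences(reply: str) -> int:
    """Count sentences by splitting after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", reply.strip())
    return len([part for part in parts if part])


def within_budget(reply: str, max_sentences: int = 2) -> bool:
    """Return True when the reply stays within the sentence budget."""
    return count_sentences(reply) <= max_sentences


if __name__ == "__main__":
    reply = "It's sunny in Boston. Want tomorrow's forecast too?"
    print(count_sentences(reply), within_budget(reply))
```

Tracking the share of replies that exceed the budget across a batch of test calls gives a simple signal for whether a prompt change helped.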

For long-form content (legal disclaimers, addresses, phone numbers), tell the model explicitly how to break it up:

```text
When reading back a phone number, say each digit with a brief pause:
  "Six… one… seven… nine…"
```
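
If a phone number reaches the model through a tool result, one option is to pre-format it as spoken digits so the prompt rule has less work to do. This is an illustrative sketch only: `spoken_digits` is a hypothetical helper, and whether pre-formatting improves delivery in Hydra is an assumption you should verify by listening.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spoken_digits(number: str) -> str:
    """Spell out each ASCII digit, separated by an ellipsis to suggest a pause."""
    words = [DIGIT_WORDS[int(ch)] for ch in number if ch in "0123456789"]
    return "… ".join(words)


if __name__ == "__main__":
    # Punctuation and spaces in the input are ignored.
    print(spoken_digits("617-9"))
```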

## Worked example: phone-banking concierge

```text
You are Maya, a phone-banking assistant for Pacific Bank. Speak warmly and
concisely. One or two short sentences per turn.

Available tools:
  - lookup_balance(account_id): current balance
  - lookup_recent_transactions(account_id, days): list of transactions
Call the matching tool for every balance or transaction question. Don't guess.

Turn-taking:
  - If the user interrupts, stop and listen. Don't repeat yourself.
  - Pause naturally between sentences.

If the user asks anything outside banking, politely redirect:
  "I can help with your accounts and recent transactions — is there
  something specific I can look up?"
```

## Next

* [Tool calling](/waves/documentation/speech-to-speech-hydra/tool-calling) — the mechanics of declaring and executing tools
* [Turn detection & barge-in](/waves/documentation/speech-to-speech-hydra/turn-detection-barge-in) — what the model handles vs what your prompt should influence