LLM Settings
The OpenAIClient is the unified interface for calling LLMs. It works with OpenAI by default, but supports any OpenAI-compatible endpoint by changing the base_url.
Basic Configuration
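The exact constructor is SDK-specific, but a minimal configuration sketch looks like this. The parameter names (model, api_key, temperature, max_tokens) are assumptions modeled on common OpenAI-style clients; check your SDK version for the real signature.

```python
import os

# Hypothetical configuration sketch -- field names are assumptions
# based on typical OpenAI-compatible client options.
llm_config = {
    "model": "gpt-4o-mini",                        # any chat model works
    "api_key": os.environ.get("OPENAI_API_KEY", ""),  # never hardcode keys
    "temperature": 0.4,                            # balanced (see below)
    "max_tokens": 150,                             # keep replies short for voice
}
```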
Parameters
Temperature
Controls how “creative” vs “predictable” the model behaves:
- 0.0–0.3: Consistent, factual. Best for support, FAQ bots.
- 0.4–0.6: Balanced. Good for general conversation.
- 0.7–1.0: Creative, varied. Better for sales, engagement.
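The bands above can be encoded as a small helper; the specific values chosen within each band are illustrative defaults, not SDK requirements:

```python
def pick_temperature(use_case: str) -> float:
    """Map a use case to a temperature band from the guidelines above."""
    bands = {
        "support": 0.2,   # 0.0-0.3: consistent, factual
        "faq": 0.2,
        "general": 0.5,   # 0.4-0.6: balanced conversation
        "sales": 0.8,     # 0.7-1.0: creative, varied
    }
    return bands.get(use_case, 0.5)  # default to balanced
```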
Using Other Providers
Any provider with an OpenAI-compatible API works by setting base_url. This includes Groq, Together.ai, Fireworks, Anyscale, OpenRouter, and Azure OpenAI.
Just swap the base_url and api_key—your agent code stays the same.
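A sketch of the swap: only the connection details change per provider. The endpoint URLs below reflect each provider's documented OpenAI-compatible base URL at the time of writing; verify them against the provider's own docs before use.

```python
import os

# Provider is selected purely by base_url + api_key; call sites are unchanged.
providers = {
    "openai":   {"base_url": "https://api.openai.com/v1",      "key_env": "OPENAI_API_KEY"},
    "groq":     {"base_url": "https://api.groq.com/openai/v1", "key_env": "GROQ_API_KEY"},
    "together": {"base_url": "https://api.together.xyz/v1",    "key_env": "TOGETHER_API_KEY"},
}

def client_kwargs(name: str) -> dict:
    """Build the base_url/api_key pair for the chosen provider."""
    p = providers[name]
    return {"base_url": p["base_url"], "api_key": os.environ.get(p["key_env"], "")}
```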
Streaming
Voice agents must use streaming. Without it, users wait for the entire response before hearing anything.
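The consumption pattern looks like this; fake_llm_stream is a stand-in for the client's real streaming response, and the tts.speak call is hypothetical:

```python
from typing import Iterator

def fake_llm_stream() -> Iterator[str]:
    # Stand-in for a streaming response: real clients yield incremental
    # text deltas as they arrive over the wire.
    yield from ["Hello", ", how", " can I", " help?"]

def speak_streaming(chunks: Iterator[str]) -> list:
    """Forward each delta to TTS as soon as it arrives instead of
    buffering the whole reply -- this is what keeps latency low."""
    spoken = []
    for delta in chunks:
        spoken.append(delta)  # in a real agent: tts.speak(delta)
    return spoken             # audio started after the FIRST delta
```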
Tool Calling
To enable function calling, pass tool schemas:
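A tool schema follows the OpenAI function-calling format (the field names below are from the public Chat Completions API); get_weather is an illustrative example:

```python
# One tool schema in the OpenAI function-calling format.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# Pass a list of such schemas to the client, e.g. tools=[get_weather_tool].
```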
Error Handling
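For transient failures (rate limits, timeouts), retry with exponential backoff. This is a generic sketch, not the SDK's built-in mechanism; adapt the caught exception types to the errors your client actually raises:

```python
import time

def call_with_retries(call, attempts: int = 3, base_delay: float = 0.5):
    """Retry a callable with exponential backoff on transient errors."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise                      # out of retries: surface the error
            time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```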
Voice Configuration
Agents also need a voice for text-to-speech. Waves is our recommended TTS engine: it offers ultra-low latency and is optimized for real-time telephony.
Basic Voice Setup
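A configuration sketch, with assumed field names (the provider field matches the Third-Party Providers section below; the voice ID is a placeholder, so pick a real one from the Waves voice library):

```python
# Hypothetical voice configuration -- field names are assumptions;
# replace the placeholder with an ID from the Waves voice library.
voice_config = {
    "provider": "waves",                  # or "openai" / "elevenlabs"
    "voice_id": "<your-waves-voice-id>",  # placeholder, not a real ID
}
```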
Waves Voice IDs
For the complete Waves voice library with audio samples: → Waves Voice Models
Voice Cloning
Create custom voices from audio samples: → Waves Voice Cloning Guide
Third-Party Providers
OpenAI and ElevenLabs voices are also supported. Set provider to openai or elevenlabs and use their respective voice IDs.
Tips
Keep max_tokens low for voice
Set max_tokens to 100–200. Long replies make the agent ramble and keep the user listening; short replies keep turns snappy. Reinforce conciseness in your prompt as well, since max_tokens truncates rather than summarizes.
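To see why the cap matters, a rough back-of-the-envelope estimate (assuming ~0.75 English words per token, a common heuristic, and a typical speaking rate):

```python
def estimated_speech_seconds(max_tokens: int, wpm: int = 150) -> float:
    """Rough audio duration for a maximum-length reply.

    Assumes ~0.75 words per token and a 150 words-per-minute TTS rate;
    both are heuristics, not exact figures.
    """
    words = max_tokens * 0.75
    return words / wpm * 60

# A 200-token cap already allows roughly a minute of uninterrupted speech.
```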
Warm up connections on start
The first LLM call has connection overhead. Send a tiny request in start() to warm up before the user speaks.
Use fallbacks for reliability
Configure a backup provider (e.g., Groq primary, OpenAI fallback) to handle rate limits or outages gracefully.
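A minimal failover sketch, where primary and fallback are callables wrapping two separately configured clients (e.g. Groq and OpenAI):

```python
def complete_with_fallback(primary, fallback, prompt: str) -> str:
    """Try the primary provider; on any error, fail over to the backup.

    In practice you would narrow the except clause to rate-limit and
    connection errors rather than catching everything.
    """
    try:
        return primary(prompt)
    except Exception:
        return fallback(prompt)  # e.g. primary rate-limited or down
```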

