The OpenAIClient is the unified interface for calling LLMs. It works with OpenAI by default, but supports any OpenAI-compatible endpoint by changing the base_url.
Controls how “creative” vs “predictable” the model behaves:
Any provider with an OpenAI-compatible API works by setting base_url. This includes Groq, Together.ai, Fireworks, Anyscale, OpenRouter, and Azure OpenAI.
Just swap the base_url and api_key—your agent code stays the same.
Voice agents must use streaming. Without it, users wait for the entire response before hearing anything.
To enable function calling, pass tool schemas:
Agents also need a voice for text-to-speech. Waves is our recommended TTS engine—ultra-low latency, optimized for real-time telephony.
For the complete Waves voice library with audio samples: → Waves Voice Models
Create custom voices from audio samples: → Waves Voice Cloning Guide
OpenAI and ElevenLabs voices are also supported. Set provider to openai or elevenlabs and use their respective voice IDs.
Set max_tokens to 100-200. Shorter responses mean faster audio playback. Guide conciseness in your prompt too.
The first LLM call has connection overhead. Send a tiny request in start() to warm up before the user speaks.
Configure a backup provider (e.g., Groq primary, OpenAI fallback) to handle rate limits or outages gracefully.