Create a new agent
Create a new agent by passing the agent's name in the request body. To assign a custom workflow to the agent afterwards, use the update-workflow endpoint.
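As a minimal sketch of the request body described above, the helper below serializes only the agent name. The helper name build_create_agent_body is hypothetical, and the endpoint path and HTTP client are deliberately left out because this page does not state them.

```python
import json


def build_create_agent_body(name: str) -> str:
    """Serialize the smallest create-agent body: just the agent name.

    The "name" key follows the description above; every other field is
    left out so the server applies its own defaults.
    """
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("agent name must not be empty")
    return json.dumps({"name": cleaned})
```

The resulting string can be sent as the body of the create request with whichever HTTP client you already use.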
Authentication
API key from the console ApiKey collection, sent as a Bearer token. Session cookies are also accepted for browser-based authentication.
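The Bearer scheme places the key in the Authorization header. The sketch below builds such headers; the function name bearer_headers is hypothetical, and the JSON Content-Type is an assumption about the request format rather than something this page states.

```python
def bearer_headers(api_key: str) -> dict[str, str]:
    """Build request headers that carry the API key as a Bearer token."""
    key = api_key.strip()
    if not key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {key}",
        # Assumption: the request body is JSON.
        "Content-Type": "application/json",
    }
```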
Request
Ambient background sound during calls. Options: '' (empty string, none), 'office', 'cafe', 'call_center', 'static'. Note: this value is currently overridden by the server default on creation; update it via PATCH after creation.
Language configuration for the agent.
Cross-field rule: default must be one of the values in supported.
Tamil (ta) cannot be combined with other languages in supported.
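The two language rules above can be checked on the client before sending the request. This is a sketch under the assumption that language codes are plain strings; validate_language is a hypothetical helper, not part of the API.

```python
def validate_language(default: str, supported: list[str]) -> list[str]:
    """Return a list of rule violations; an empty list means the config passes."""
    errors: list[str] = []
    # Rule 1: default must be one of the values in supported.
    if default not in supported:
        errors.append(f"default '{default}' is not in supported {supported}")
    # Rule 2: Tamil (ta) cannot be combined with other languages.
    if "ta" in supported and len(set(supported)) > 1:
        errors.append("'ta' cannot be combined with other languages in supported")
    return errors
```

For example, validate_language("ta", ["ta"]) returns an empty list, while validate_language("ta", ["ta", "en"]) reports the Tamil restriction.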
Synthesizer (TTS) configuration for the agent.
The models waves, waves_lightning_large, waves_lightning_v2, and waves_lightning_v3_1 validate voiceId against the Waves API. All other models accept any voiceId.
Cloned voices are regular voiceIds and can be used with any compatible Waves model.
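To anticipate which synthesizer models will have their voiceId checked, a client can keep the four model names listed above in a set. This is a sketch; the names VOICE_VALIDATED_MODELS and voice_id_is_validated are hypothetical.

```python
# Models whose voiceId is validated against the Waves API, per the text above.
VOICE_VALIDATED_MODELS = frozenset({
    "waves",
    "waves_lightning_large",
    "waves_lightning_v2",
    "waves_lightning_v3_1",
})


def voice_id_is_validated(model: str) -> bool:
    """True if the server checks voiceId for this synthesizer model."""
    return model in VOICE_VALIDATED_MODELS
```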
The global knowledge base ID of the agent. You can create a global knowledge base by using the /knowledgebase endpoint and assign it to the agent. The agent will use this knowledge base for its responses.
The LLM model to use for the agent.
Note: gpt-5.2, electron-kogta, and electron-kogta-v2 require org-level access and return 403 if not enabled.
workflowType must be single_prompt to use gpt-realtime or gpt-realtime-mini.
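The model constraints above can be expressed as a small pre-flight check. The sketch below assumes the model and workflow type are passed as plain strings; check_llm_model is a hypothetical helper, and only the model names given above are used.

```python
REALTIME_MODELS = frozenset({"gpt-realtime", "gpt-realtime-mini"})
ORG_GATED_MODELS = frozenset({"gpt-5.2", "electron-kogta", "electron-kogta-v2"})


def check_llm_model(model: str, workflow_type: str = "single_prompt") -> list[str]:
    """Return human-readable notes about constraints that apply to this model."""
    notes: list[str] = []
    # Realtime models are only usable with single_prompt workflows.
    if model in REALTIME_MODELS and workflow_type != "single_prompt":
        notes.append(f"{model} requires workflowType single_prompt")
    # These models return 403 unless the organization has access enabled.
    if model in ORG_GATED_MODELS:
        notes.append(f"{model} requires org-level access (403 if not enabled)")
    return notes
```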
Set global instructions for your agent’s personality, role, and behavior throughout conversations. Note: Only used for workflow_graph agents. Maximum 4000 characters.
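A length check before submission avoids a rejected request. This sketch assumes the 4000-character limit counts Unicode code points, as Python's len() does; the server may count differently. check_global_instructions is a hypothetical helper.

```python
MAX_GLOBAL_INSTRUCTION_CHARS = 4000


def check_global_instructions(text: str, workflow_type: str) -> list[str]:
    """Return a list of problems with the global instructions, if any."""
    problems: list[str] = []
    # Assumption: the limit is measured in code points.
    if len(text) > MAX_GLOBAL_INSTRUCTION_CHARS:
        problems.append(
            f"instructions are {len(text)} characters; "
            f"maximum is {MAX_GLOBAL_INSTRUCTION_CHARS}"
        )
    # The instructions are only used for workflow_graph agents.
    if workflow_type != "workflow_graph":
        problems.append("instructions are only used for workflow_graph agents")
    return problems
```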
IDs of telephony products (phone numbers) to associate with the agent for inbound/outbound calls.
The type of workflow to create for the agent. Defaults to single_prompt if not specified. Using workflow_graph requires conversational agent access (403 if not enabled).
Smart turn-detection configuration. When enabled, the agent uses an additional model to decide whether the user has finished a turn.
Voice activity detection (VAD) configuration. Controls how the agent decides when speech is present.
Voicemail-detection configuration. When the call hits a voicemail tone, the agent plays endText and ends the call.
Background-noise denoising configuration for the agent’s input audio.
Pronunciation overrides — words the TTS engine should pronounce differently from its default.

