This section is about how your agent speaks and listens – the TTS voice it uses, how it handles audio, what it does mid-conversation, and how it transcribes what callers say. It is not about how customers reach your agent. For that, see Phone (Numbers and Web Calling) and Chat.
What lives here
Agent Voice
Pick the TTS voice and tune stability, clarity, and speed.
Voice library
Browse and compare voices across providers (ElevenLabs, Cartesia, Hume, and more).
Choosing a good voice
Match voice to brand, audience, and industry.
Multi-voice
Use multiple voices in a single project.
Voice configuration
Greeting audio, disclaimer playback, model selection, safety filters, and call handling.
Response control
Translations, pronunciations, and stop keywords.
Audio management
Audio playback and recording settings.
Speech recognition
ASR settings – how the agent transcribes what callers say.
How to think about it
There are four concerns, and they’re configured separately:
- How it sounds – Agent voice, voice library, multi-voice, custom voice. Picking a TTS voice and tuning it.
- How it behaves in a call – Voice configuration: greeting, disclaimer, model selection, safety filters, call handling.
- How it listens – Speech recognition (ASR) settings.
- How it phrases things – Response control: translations, pronunciations, stop keywords.
Programmatic voice configuration
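As a rough illustration of runtime voice selection, the sketch below picks a voice from the caller's language and a speech-pace preference. It does not use the platform's voice class, whose interface is not described on this page: `VoiceChoice`, `VOICES_BY_LANGUAGE`, `pick_voice`, and every provider and voice ID here are hypothetical stand-ins, so treat it as a sketch of the selection logic only.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VoiceChoice:
    """Hypothetical stand-in for a voice object; NOT the platform's voice class."""
    provider: str
    voice_id: str
    speed: float = 1.0


# Hypothetical catalogue keyed by language code; the IDs are placeholders.
VOICES_BY_LANGUAGE = {
    "en": VoiceChoice(provider="example-provider", voice_id="en-default"),
    "es": VoiceChoice(provider="example-provider", voice_id="es-default"),
}
DEFAULT_VOICE = VOICES_BY_LANGUAGE["en"]


def pick_voice(language: Optional[str], prefers_slower_speech: bool = False) -> VoiceChoice:
    """Choose a voice from runtime context, falling back to the default voice."""
    base = VOICES_BY_LANGUAGE.get((language or "").lower(), DEFAULT_VOICE)
    if prefers_slower_speech:
        # Return a modified copy so the shared catalogue entry stays unchanged.
        return replace(base, speed=0.9)
    return base
```

Keeping the decision in a small pure function like this makes it easy to test separately from whatever call the platform needs to apply the chosen voice.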
You can also configure voices programmatically using the voice class inside functions – for example, selecting a voice based on conversation context, caller preferences, or other runtime variables.
Voice conversation style guide
These guidelines help your voice agent sound natural rather than robotic. They focus on the linguistic patterns that make spoken conversations feel human.
Social presence markers
Natural conversation includes patterns that acknowledge conversational history and participants. These contribute to a sense of collaboration rather than rote routine-following.
Use progressive tense for active collaboration:
- “I’m not seeing any accounts under that phone number…” conveys active collaboration
- “I don’t see any accounts” sounds too definitive
Rely on shared context rather than restating what was already said:
- “How about Wednesday instead?” (not “How about Wednesday instead of Tuesday?”)
- “In that case, how does Saturday at 2:30 sound?” (not “Since you said you prefer weekends…”)
Phrase requests as something the caller does with or for the agent:
- “Could you read me your account number?” rather than “Could you read your account number aloud?”
- “Can you log into your account for me?” rather than “Can you log into your account?”
Use the past tense to soften questions:
- “When were you trying to come in?” rather than “When are you trying to come in?”
Avoid over-explaining
LLMs tend to justify every action in a way humans don’t. Most of the time, the important information and the request can be formed into a single sentence:
- “No problem, what’s your account number?” rather than “To check for outages, I’ll need to look up your account. Could you tell me your account number?”
Walkthrough conversations
When giving multi-turn walkthroughs, don’t end every step with “let me know when you’ve done that.” Provide the instruction and wait – the user will confirm on their own.
Related
Phone
Voice transports – let customers reach your agent over the phone (Numbers) or from your website (Web Calling).
Chat
Add a text-based chat widget to your website.

