Voice Providers
Retone supports two premium TTS providers:
- Cartesia (Recommended)
- ElevenLabs
Cartesia Sonic 3 offers:
- Ultra-low latency (< 100ms)
- Emotional expressiveness
- Natural prosody
- Speed and stability controls
Basic Settings
Voice Selection
Choose from available voices:
- Click the Voice dropdown
- Browse available voices by provider
- Preview voices before selecting
- Click to apply
Speed
Control speaking rate:
| Value | Effect |
|---|---|
| 0.5 | Half speed (very slow) |
| 0.8 | Slightly slower |
| 1.0 | Normal speed |
| 1.2 | Slightly faster |
| 2.0 | Double speed (very fast) |
Volume
Output volume level (0-100).
Responsiveness
How quickly the agent starts speaking (1-10):
| Value | Effect |
|---|---|
| 1-3 | Slower, more deliberate responses |
| 4-6 | Balanced (recommended) |
| 7-10 | Faster, may interrupt caller |
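The ranges given for speed, volume, and responsiveness can be checked before a configuration is saved. The sketch below is a minimal, hypothetical validator: the `BasicVoiceSettings` class, its field names, and its defaults are assumptions for illustration, not Retone's API. The speed bounds of 0.5 and 2.0 are taken from the extremes of the speed table rather than from a documented limit.

```python
from dataclasses import dataclass


@dataclass
class BasicVoiceSettings:
    """Hypothetical container for the basic settings; not Retone's schema."""
    speed: float = 1.0         # 1.0 is normal speed
    volume: int = 80           # 0-100; the default here is an assumption
    responsiveness: int = 5    # 1-10; 4-6 is the balanced band

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means all values are in range."""
        problems = []
        if not 0.5 <= self.speed <= 2.0:
            problems.append(f"speed {self.speed} is outside 0.5-2.0")
        if not 0 <= self.volume <= 100:
            problems.append(f"volume {self.volume} is outside 0-100")
        if not 1 <= self.responsiveness <= 10:
            problems.append(f"responsiveness {self.responsiveness} is outside 1-10")
        return problems


def responsiveness_band(value: int) -> str:
    """Map a responsiveness value to the effect listed in the table."""
    if 1 <= value <= 3:
        return "slower, more deliberate"
    if 4 <= value <= 6:
        return "balanced (recommended)"
    if 7 <= value <= 10:
        return "faster, may interrupt caller"
    raise ValueError(f"responsiveness {value} is outside 1-10")


print(BasicVoiceSettings(speed=1.2, responsiveness=8).validate())
print(responsiveness_band(8))
```

Returning a list of problems instead of raising on the first one lets a settings form report every out-of-range field at once.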
Emotional Expression (Cartesia)
Cartesia Sonic 3 supports emotional speech.
Emotion Types
| Emotion | When to Use |
|---|---|
| neutral | Professional, business contexts |
| friendly | Customer service, general conversations |
| excited | Sales, promotions, positive news |
| sympathetic | Handling complaints, showing empathy |
| curious | Asking questions, gathering information |
| confident | Assertions, recommendations |
Intensity
| Level | Effect |
|---|---|
| low | Subtle emotional undertone |
| medium | Noticeable but not exaggerated |
| high | Strong emotional expression |
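The emotion types and intensity levels listed above form a small closed set, so a configuration can be rejected early if it names a value outside that set. The following is a minimal sketch; the `EmotionSetting` class, its field names, and its defaults are hypothetical and are not part of Retone or Cartesia.

```python
from dataclasses import dataclass

# Values copied from the emotion and intensity tables.
EMOTIONS = ("neutral", "friendly", "excited", "sympathetic", "curious", "confident")
INTENSITIES = ("low", "medium", "high")


@dataclass(frozen=True)
class EmotionSetting:
    """Hypothetical emotion setting; names and defaults are illustrative only."""
    emotion: str = "neutral"
    intensity: str = "medium"

    def __post_init__(self) -> None:
        # Reject anything that is not listed in the tables.
        if self.emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {self.emotion!r}")
        if self.intensity not in INTENSITIES:
            raise ValueError(f"unknown intensity: {self.intensity!r}")


# Example: a sympathetic tone at low intensity for handling a complaint.
complaint_tone = EmotionSetting(emotion="sympathetic", intensity="low")
print(complaint_tone)
```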
Back-channeling
Back-channeling adds natural acknowledgment sounds during conversation.
When Back-channeling Triggers
- During caller’s longer statements
- At natural pauses
- When caller shares information
- To indicate active listening
A higher frequency (7-10) sounds very engaged but may feel excessive. A lower frequency (1-3) is more subtle.
Pre-response Phrases
Filler phrases spoken before the AI generates a response.
Benefits
- Reduces perceived latency
- Sounds more natural
- Gives AI processing time
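To illustrate why a filler phrase reduces perceived latency, here is a minimal sketch. It starts generating the reply, speaks the filler while generation is in progress, and only then waits for the reply. `generate_reply`, `respond`, and the `speak` callback are hypothetical stand-ins, not Retone functions, and the 50 ms sleep merely simulates model processing time.

```python
import asyncio
from typing import Awaitable, Callable

Speak = Callable[[str], Awaitable[None]]


async def generate_reply(prompt: str) -> str:
    """Hypothetical stand-in for the AI model; the sleep simulates processing time."""
    await asyncio.sleep(0.05)
    return f"Here is what I found about {prompt}."


async def respond(prompt: str, filler: str, speak: Speak) -> str:
    # Schedule reply generation so it can run while the filler is being spoken.
    reply_task = asyncio.create_task(generate_reply(prompt))
    # The caller hears the filler without waiting for the model.
    await speak(filler)
    # Wait for the real reply, then speak it.
    reply = await reply_task
    await speak(reply)
    return reply


async def print_speak(text: str) -> None:
    """Hypothetical TTS stand-in: prints the text and simulates short playback."""
    print(text)
    await asyncio.sleep(0.01)


if __name__ == "__main__":
    asyncio.run(respond("your order", "Let me check that for you.", print_speak))
```

The filler adds no delay of its own as long as it finishes before the reply is ready; a filler longer than the generation time would push the reply back.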
Conditional Pre-responses
Trigger different phrases based on context.
Laughter Injection
Add natural laughter at appropriate moments.
Pronunciation Overrides
Custom pronunciations for specific words.
Adding Pronunciations
- Go to Voice Settings
- Click Pronunciation Overrides
- Add word and phonetic spelling
- Test with voice preview
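Conceptually, a pronunciation override maps a word to a phonetic spelling that is read in its place. The sketch below shows that idea as a plain text substitution; it is an assumption about how such a mapping could work, not a description of how Retone applies overrides internally. The function name and the sample entries are made up for illustration.

```python
import re


def apply_pronunciations(text: str, overrides: dict[str, str]) -> str:
    """Replace whole words with their phonetic spelling, ignoring case."""
    if not overrides:
        return text
    # Try longer entries first so a longer word wins over a shorter one it starts with.
    words = sorted(overrides, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    lookup = {word.lower(): phonetic for word, phonetic in overrides.items()}
    return pattern.sub(lambda match: lookup[match.group(0).lower()], text)


# Sample entries are illustrative only.
overrides = {"SQL": "sequel", "Nguyen": "win"}
print(apply_pronunciations("Ask Nguyen about SQL.", overrides))
```

The `\b` word boundaries keep `SQL` from matching inside `SQLite`. Entries that begin or end with punctuation (for example `C++`) would need a different boundary rule.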
Auto-enabled Features
These settings are enabled by default:
| Feature | Description |
|---|---|
| Allow interruptions | Caller can interrupt agent |
| Speech normalization | Numbers read naturally |
| Noise filtering | Background noise reduced |
| Background speech detection | Handles side conversations |
Configuration Example
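As an illustration only, the settings covered on this page might be gathered into a single structure like the one below. Every key name, the nesting, and the sample values are assumptions; Retone's actual configuration schema may differ, and `<voice-id>` is a placeholder rather than a real voice identifier.

```python
import json

# Hypothetical voice configuration; key names are not taken from Retone's schema.
voice_config = {
    "provider": "cartesia",
    "voice": "<voice-id>",  # placeholder, choose a voice in the Voice dropdown
    "speed": 1.0,           # normal speed
    "volume": 80,           # 0-100
    "responsiveness": 5,    # 4-6 is the balanced band
    "emotion": {"type": "friendly", "intensity": "medium"},
    "backchanneling": {"enabled": True, "frequency": 5},
    "pre_response": {"enabled": True, "phrases": ["Let me check that for you."]},
    "laughter": {"enabled": False},
    "pronunciations": {"SQL": "sequel"},
}

print(json.dumps(voice_config, indent=2))
```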
A complete voice configuration sets all of the options above in one place.
Testing Voice Settings
- Open the Test Call panel
- Have a conversation with your agent
- Listen for:
  - Natural pacing
  - Clear pronunciation
  - Appropriate emotion
  - Smooth back-channeling
- Adjust settings and retest
Best Practices
Match Brand Voice
Choose a voice that fits your company’s personality.
Test on Phone
Audio sounds different on a phone than on a computer. Test on both.
Moderate Back-channeling
Too much feels robotic; too little feels distant.
Add Key Pronunciations
Ensure company names, products, and jargon sound correct.