Multilingual Agents
Building voice agents that support multiple languages
Agent Studio supports building voice agents that can converse in multiple languages. This guide covers how to configure agents for multilingual support.
Overview
Multilingual support involves three components:
- Language Detection - Determining the user's preferred language
- Prompts - Instructions written to respond in the correct language
- Voice Selection - Choosing the right TTS voice for the language
┌─────────────────────────────────────────────────────────────────┐
│ Multilingual Voice Call │
├─────────────────────────────────────────────────────────────────┤
│ │
│ user_context.language = "ta" (Tamil) │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Prompt: "Speak ONLY in Tamil..." │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ TTS: voices.ta = "tamil-female-voice-id" │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ↓ │
│ Agent speaks in Tamil with Tamil voice │
│ │
└─────────────────────────────────────────────────────────────────┘
Language Configuration
Supported Languages
Configure supported languages in the agent config:
{
  "config": {
    "languages": ["hi", "en", "ta", "te", "bn", "mr", "gu", "kn", "ml", "ur"],
    "default_language": "hi"
  }
}
Language Code Reference
| Code | Language |
|---|---|
| hi | Hindi |
| en | English |
| ta | Tamil |
| te | Telugu |
| bn | Bengali |
| mr | Marathi |
| gu | Gujarati |
| kn | Kannada |
| ml | Malayalam |
| ur | Urdu |
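
A backend usually needs to normalise whatever language value it has stored before dispatching a call. The following is a minimal sketch, assuming the backend keeps its own copy of the supported codes and default from the agent config; `resolve_language` is a hypothetical helper, not part of Agent Studio.

```python
# Assumed to mirror the "languages" and "default_language" values in the agent config.
SUPPORTED_LANGUAGES = ["hi", "en", "ta", "te", "bn", "mr", "gu", "kn", "ml", "ur"]
DEFAULT_LANGUAGE = "hi"


def resolve_language(requested, supported=SUPPORTED_LANGUAGES, default=DEFAULT_LANGUAGE):
    """Return a supported language code, falling back to the default."""
    if not requested:
        return default
    # Normalise case and drop any region suffix, e.g. "ta-IN" or "ta_IN" -> "ta".
    code = requested.strip().lower().replace("_", "-").split("-")[0]
    return code if code in supported else default


print(resolve_language("ta-IN"))  # ta
print(resolve_language("fr"))     # hi (unsupported, so the default is used)
```

Falling back to the default rather than raising keeps a call from failing on an unexpected code, at the cost of possibly addressing the user in the wrong language, so it may be worth logging each fallback.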
Backend Integration
The backend determines the user's language and provides it when dispatching calls.
Dispatch with Language
# Backend dispatches call with user's language
response = await client.post("/api/v1/calls", json={
    "workflow_slug": "meal-logging",
    "user_context": {
        "name": "Priya",
        "language": "ta",  # Tamil
        "language_name": "Tamil",  # Display name for prompts
        "voice_id": "tamil-female-voice-id"
    }
})
Dynamic Language Context
Provide all language-related context from the backend:
# In your backend application
def get_user_call_context(user_id: str) -> dict:
    user = get_user(user_id)
    language = user.preferred_language or "hi"
    # generate_greeting() takes the first pending meal as its third argument
    first_meal = get_pending_meals(user_id)[0]
    return {
        "name": user.name,
        "language": language,
        "language_name": LANGUAGE_NAMES[language],
        "greeting": generate_greeting(language, user.name, first_meal),
        "closing_message": get_closing_message(language)
    }
# Language name mapping
LANGUAGE_NAMES = {
    "hi": "Hindi",
    "en": "English",
    "ta": "Tamil",
    "te": "Telugu",
    "bn": "Bengali",
    "mr": "Marathi",
    "gu": "Gujarati",
    "kn": "Kannada",
    "ml": "Malayalam",
    "ur": "Urdu",
}
Prompt Design
Language Instructions
Include explicit language instructions in your system prompt:
{
  "prompt": {
    "system": "You are Tap Health Coach, a friendly health coach.\n\nLANGUAGE: Speak ONLY in {{user.language_name}}. All responses must be in {{user.language_name}}.\n\n- If Hindi: Use natural spoken Hindi (Hinglish mix is fine)\n- If English: Use simple, warm English\n- For regional languages: Use natural spoken form\n\n..."
  }
}
Language-Specific Style Guide
LANGUAGE & VOICE STYLE:
If {{user.language_name}} is Hindi:
- Use natural spoken Hindi (Hinglish mix acceptable)
- Use familiar terms: roti, dal, sabzi, chai, paratha
- "Namaste!" / "Aap kaise ho?"
If {{user.language_name}} is Tamil:
- Use natural spoken Tamil
- Use familiar terms: idli, dosa, sambar, rasam
- "Vanakkam!" / "Eppadi irukkinga?"
If {{user.language_name}} is Telugu:
- Use natural spoken Telugu
- Use familiar terms: pesarattu, upma, pappu
- "Namaskaram!" / "Ela unnaru?"
General:
- Keep responses SHORT (8-12 words max)
- Warm, supportive tone
- Simple everyday language
Dynamic Greetings Based on User Type
The greeting content and interruptibility depend on whether the user is new or returning:
New Users - Full Onboarding
New users receive a comprehensive welcome that explains the service. This greeting is non-interruptible.
# Full welcome script for new users
def get_welcome_script(name: str, language: str) -> str:
    lang_name = LANGUAGE_NAMES.get(language, "Hindi")
    return f"""IMPORTANT: Speak ONLY in {lang_name}.
Namaste {name}! Welcome to Tap Health.
I'm your nutrition coach, and I'll be calling you every day to help you log your meals.
Why is this important? What you eat directly affects your blood sugar. By logging your meals together, we can spot patterns and help you make better choices.
Here's how it works. Just tell me what you ate, how much, and around what time.
For example: 2 rotis with dal at 1 PM. Or 1 cup chai in the evening.
Now, let's log your meals for today."""
Returning Users - Quick Greeting
Returning users get a short greeting and can interrupt immediately:
# Short greetings for returning users
RETURNING_GREETINGS = {
"hi": "Hi {name}! Aaj {meal} mein kya khaya?",
"en": "Hi {name}! What did you have for {meal} today?",
"ta": "வணக்கம் {name}! இன்று {meal} என்ன சாப்பிட்டீர்கள்?",
"te": "హాయ్ {name}! ఈ రోజు {meal} ఏమి తిన్నారు?",
}
def generate_greeting(language: str, name: str, first_meal: str) -> str:
    template = RETURNING_GREETINGS.get(language, RETURNING_GREETINGS["hi"])
    return template.format(name=name, meal=first_meal)
Backend Determines Greeting Type
def get_user_call_context(user_id: str) -> dict:
    user = get_user(user_id)
    language = user.preferred_language or "hi"
    pending_meals = get_pending_meals(user_id)
    is_new_user = user.call_count == 0
    if is_new_user:
        greeting = get_welcome_script(user.name, language)
        greeting_interruptible = False  # Full onboarding
    else:
        greeting = generate_greeting(language, user.name, pending_meals[0])
        greeting_interruptible = True  # Can interrupt
    return {
        "name": user.name,
        "is_new_user": is_new_user,
        "greeting": greeting,
        "greeting_interruptible": greeting_interruptible,
        # ... other fields
    }
Agent Config
{
  "prompt": {
    "greeting": "{{user.greeting}}",
    "greeting_interruptible": "{{user.greeting_interruptible}}"
  }
}
Voice Configuration
Language-Specific Voices
Configure TTS with voice mappings per language:
{
  "tts": {
    "provider": "cartesia",
    "model": "sonic-3",
    "voice_id": "a167e0f3-df7e-4e3e-a5b4-3b3a6f3b8b3a",
    "voices": {
      "hi": "4cda7a46-e225-445d-8b18-0519d58ce5c0",
      "en": "b7d50908-b17c-442d-ad8d-810c63997ed9",
      "ta": "d4b3c8a2-f123-4567-8901-abcdef123456",
      "te": "e5c4d9b3-g234-5678-9012-bcdefg234567"
    }
  }
}
Dynamic Voice Selection
The voice_id can also be a template:
{
  "tts": {
    "voice_id": "{{user.voice_id}}"
  }
}
Backend provides the voice:
user_context = {
    "voice_id": get_voice_for_language(user.language)
}
STT Configuration
Multi-Language Detection
Use multilingual models for automatic language detection:
{
  "stt": {
    "provider": "deepgram",
    "model": "nova-3",
    "language": "multi"
  }
}
Fixed Language
For single-language agents:
{
  "stt": {
    "provider": "deepgram",
    "model": "nova-3",
    "language": "hi"
  }
}
Localized Messages
Closing Messages
Pre-compute localized system messages:
CLOSING_MESSAGES = {
"hi": "आज लॉग करने के लिए धन्यवाद! जैसे ही आपका भोजन फीडबैक तैयार होगा, आपको नोटिफिकेशन मिलेगी!",
"en": "Thanks for logging today! You'll receive a notification once your meal feedback is ready!",
"ta": "இன்று பதிவு செய்ததற்கு நன்றி! உங்கள் உணவு பின்னூட்டம் தயாரானதும் அறிவிப்பு வரும்!",
"te": "ఈ రోజు లాగ్ చేసినందుకు ధన్యవాదాలు! మీ భోజన ఫీడ్బ్యాక్ సిద్ధమైనప్పుడు మీకు నోటిఫికేషన్ వస్తుంది!",
"bn": "আজ লগ করার জন্য ধন্যবাদ! আপনার খাবারের ফিডব্যাক তৈরি হলে আপনি নোটিফিকেশন পাবেন!",
}
def get_closing_message(language: str) -> str:
    return CLOSING_MESSAGES.get(language, CLOSING_MESSAGES["hi"])
In Prompts
CALL END SCRIPT:
When all meals are logged, say exactly:
"{{user.closing_message}}"
Complete Example
Single Meal Coach Agent
A single meal-coach-agent handles both new and returning users. The backend determines the appropriate greeting and interruptibility.
Agent Configuration
{
  "name": "meal-coach-agent",
  "config": {
    "prompt": {
      "system": "You are Tap Health Coach.\n\nLANGUAGE: Speak ONLY in {{user.language_name}}. All responses in {{user.language_name}}.\n\nCONTEXT:\n- User: {{user.name}}\n- Pending meals: {{user.pending_meals_display}}\n- Logged meals: {{user.logged_meals_display}}\n\nVOICE STYLE:\n- Warm, friendly\n- Short sentences (8-12 words)\n- Use familiar food terms",
      "greeting": "{{user.greeting}}",
      "greeting_interruptible": "{{user.greeting_interruptible}}",
      "variables": [
        { "name": "user.name", "default": "there" },
        { "name": "user.language_name", "default": "Hindi" },
        { "name": "user.greeting", "default": "Namaste!" },
        { "name": "user.greeting_interruptible", "default": true }
      ]
    },
    "stt": {
      "provider": "deepgram",
      "model": "nova-3",
      "language": "multi"
    },
    "tts": {
      "provider": "cartesia",
      "model": "sonic-3",
      "voice_id": "default-voice",
      "voices": {
        "hi": "hindi-voice-id",
        "en": "english-voice-id",
        "ta": "tamil-voice-id"
      }
    },
    "llm": {
      "provider": "gemini",
      "model": "gemini-2.0-flash-exp"
    },
    "languages": ["hi", "en", "ta", "te"],
    "default_language": "hi"
  }
}
Backend Call Dispatch
async def dispatch_meal_logging_call(user_id: str):
    user = await get_user(user_id)
    language = user.preferred_language or "hi"
    is_new_user = user.call_count == 0
    # Get pending/logged meals
    pending_meals = await get_pending_meals(user_id)
    logged_meals = await get_logged_meals(user_id)
    # Determine greeting based on user type
    if is_new_user:
        greeting = get_welcome_script(user.name, language)
        greeting_interruptible = False  # Full onboarding, don't interrupt
    else:
        # Uses generate_greeting() as defined under "Returning Users - Quick Greeting"
        greeting = generate_greeting(language, user.name, pending_meals[0])
        greeting_interruptible = True  # Quick greeting, can interrupt
    # Build language-aware context
    user_context = {
        "name": user.name,
        "language": language,
        "language_name": LANGUAGE_NAMES[language],
        "is_new_user": is_new_user,
        # Dynamic greeting with interruptibility
        "greeting": greeting,
        "greeting_interruptible": greeting_interruptible,
        # Pre-computed display strings
        "closing_message": get_closing_message(language),
        # Meal data
        "pending_meals": pending_meals,
        "pending_meals_display": ", ".join(pending_meals) or "none",
        "logged_meals": logged_meals,
        "logged_meals_display": ", ".join(logged_meals) or "none",
        # Voice selection
        "voice_id": get_voice_for_language(language)
    }
    response = await client.post("/api/v1/calls", json={
        "workflow_slug": "meal-logging",
        "user_id": user_id,
        "user_context": user_context
    })
    return response.json()
Best Practices
1. Pre-compute Critical Messages
Don't rely on the LLM to generate greetings and closings in the correct language. Pre-compute them:
user_context = {
    # first_meal: the first pending meal, as generate_greeting() expects
    "greeting": generate_greeting(language, name, first_meal),
    "closing_message": get_closing_message(language),
    "error_message": get_error_message(language)
}
2. Use Strong Language Instructions
Be explicit in prompts:
LANGUAGE: Speak ONLY in {{user.language_name}}.
This is MANDATORY. ALL responses must be in {{user.language_name}}.
3. Match Voice to Language
Always map voices to languages:
{
  "voices": {
    "hi": "hindi-voice-id",
    "ta": "tamil-voice-id"
  }
}
4. Use Multilingual STT
Let Deepgram auto-detect:
{
  "stt": {
    "language": "multi"
  }
}
5. Test Each Language
Test all supported languages:
- Greeting pronunciation
- Response language consistency
- Voice naturalness
- Cultural appropriateness
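
Before any listening tests, a quick automated check can catch languages that are declared as supported but have no voice, greeting, or closing message. The following is a rough sketch; `find_language_gaps` and the sample data are hypothetical, loosely modelled on the tables used earlier in this guide.

```python
def find_language_gaps(languages, resources):
    """Map each resource name to the supported languages it does not cover."""
    gaps = {}
    for resource_name, table in resources.items():
        missing = [code for code in languages if code not in table]
        if missing:
            gaps[resource_name] = missing
    return gaps


# Illustrative data only: four supported languages, three mapped voices.
languages = ["hi", "en", "ta", "te"]
resources = {
    "voices": {"hi": "hindi-voice-id", "en": "english-voice-id", "ta": "tamil-voice-id"},
    "greetings": {"hi": "g-hi", "en": "g-en", "ta": "g-ta", "te": "g-te"},
}

gaps = find_language_gaps(languages, resources)
print(gaps)  # {'voices': ['te']}
```

A gap in any table means that language would silently use a fallback (for example the Hindi template or the default voice), which is easy to miss in manual testing.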
Multi-Tenant Considerations
Each tenant can have different:
- Supported languages
- Voice mappings
- Greeting templates
- Closing messages
Store tenant-specific language configs:
# Tenant settings
tenant.settings = {
"supported_languages": ["hi", "en", "ta"],
"default_language": "hi",
"voice_mappings": {
"hi": "tenant-hindi-voice",
"en": "tenant-english-voice"
},
"greeting_templates": {
"hi": "नमस्ते {name}! {company_name} से बोल रहे हैं।",
"en": "Hi {name}! This is {company_name}."
}
}
Troubleshooting
Agent responds in wrong language
- Check that user.language_name is being passed
- Verify the prompt has strong language instructions
- Check variable defaults
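
To see why a missing variable matters, it can help to reproduce the substitution locally. The sketch below is a simplified stand-in, not Agent Studio's actual template engine: it assumes placeholders have the form `{{name}}`, that the context is a flat dict keyed by dotted names, and that declared defaults are used when a key is absent. The `render` helper is hypothetical.

```python
import re

# Matches placeholders such as {{user.language_name}}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template, context, defaults):
    """Replace {{name}} placeholders and report which ones used a default."""
    used_defaults = []

    def substitute(match):
        name = match.group(1)
        if name in context:
            return str(context[name])
        # The variable was not passed, so fall back to its declared default.
        used_defaults.append(name)
        return str(defaults.get(name, ""))

    return PLACEHOLDER.sub(substitute, template), used_defaults


prompt = "LANGUAGE: Speak ONLY in {{user.language_name}}."
text, fallbacks = render(prompt, {"user.name": "Priya"}, {"user.language_name": "Hindi"})
print(text)       # LANGUAGE: Speak ONLY in Hindi.
print(fallbacks)  # ['user.language_name']
```

In this example a Tamil-speaking user would be addressed in Hindi simply because `user.language_name` was never sent, which is the pattern the first and third checks above are meant to catch.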
Voice doesn't match language
- Check that the voices mapping has the language
- Verify voice_id for that language is correct
- Check that the TTS provider supports the language
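
This guide calls `get_voice_for_language` without showing it. One possible implementation is sketched below, assuming the backend holds its own copy of the voice mapping; the `VOICES` table, `DEFAULT_VOICE_ID`, and the warning behaviour are all illustrative assumptions.

```python
import logging

logger = logging.getLogger("voice_selection")

# Hypothetical mapping, mirroring the placeholder IDs used in this guide.
VOICES = {"hi": "hindi-voice-id", "en": "english-voice-id", "ta": "tamil-voice-id"}
DEFAULT_VOICE_ID = "default-voice"


def get_voice_for_language(language, voices=VOICES, default_voice_id=DEFAULT_VOICE_ID):
    """Return the voice ID for a language, warning when the default is used."""
    voice_id = voices.get(language)
    if voice_id is None:
        # A missing entry is the usual cause of a voice/language mismatch.
        logger.warning("No voice mapped for %r; using default voice", language)
        return default_voice_id
    return voice_id


print(get_voice_for_language("ta"))  # tamil-voice-id
print(get_voice_for_language("te"))  # default-voice (and a warning is logged)
```

Logging the fallback makes a missing mapping visible in backend logs instead of surfacing only as an odd-sounding call.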
Greeting not in user's language
Use pre-computed greeting instead of LLM:
{
  "greeting": "{{user.greeting}}"
}
With backend providing:
# first_meal: the first pending meal, as generate_greeting() expects
user_context["greeting"] = generate_greeting(language, name, first_meal)