Production Deployment
Production deployment checklist and best practices
This guide covers best practices and a checklist for deploying Agent Studio to production with self-hosted LiveKit and SIP telephony.
Pre-Deployment Checklist
Security
- Generate secure JWT_SECRET (min 32 characters)
- Generate secure ENCRYPTION_KEY (exactly 32 bytes)
- Generate secure LIVEKIT_API_KEY and LIVEKIT_API_SECRET
- Use HTTPS for all endpoints
- Configure CORS origins (don't use * in production)
- Set up rate limiting
- Enable audit logging
- Review API key permissions
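The key and CORS items in this checklist can be verified before startup. The following is a minimal Python sketch, not Agent Studio's own validation code: the function names are hypothetical, and it assumes ENCRYPTION_KEY may be supplied either as a raw 32-byte string or as base64 that decodes to 32 bytes, since the guide does not say which encoding the application expects.

```python
import base64
import secrets


def generate_secrets() -> dict:
    """Generate candidate values for the security-related variables."""
    return {
        # 48 random bytes encode to 64 base64 characters
        "JWT_SECRET": base64.b64encode(secrets.token_bytes(48)).decode(),
        # Assumption: base64 text that decodes to exactly 32 bytes is accepted
        "ENCRYPTION_KEY": base64.b64encode(secrets.token_bytes(32)).decode(),
        "LIVEKIT_API_KEY": "API" + secrets.token_hex(8),
        "LIVEKIT_API_SECRET": base64.b64encode(secrets.token_bytes(32)).decode(),
    }


def encryption_key_ok(value: str) -> bool:
    """Accept a raw 32-byte string or base64 that decodes to 32 bytes."""
    if len(value.encode()) == 32:
        return True
    try:
        return len(base64.b64decode(value, validate=True)) == 32
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False


def check_security_settings(env: dict) -> list:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if len(env.get("JWT_SECRET", "")) < 32:
        problems.append("JWT_SECRET must be at least 32 characters")
    if not encryption_key_ok(env.get("ENCRYPTION_KEY", "")):
        problems.append("ENCRYPTION_KEY must be exactly 32 bytes")
    for name in ("LIVEKIT_API_KEY", "LIVEKIT_API_SECRET"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    origins = [o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()]
    if not origins or "*" in origins:
        problems.append("CORS_ORIGINS must list explicit origins, not *")
    if any(not o.startswith("https://") for o in origins if o != "*"):
        problems.append("CORS_ORIGINS should use https:// origins")
    return problems
```

Calling `check_security_settings(dict(os.environ))` at startup and refusing to boot when the returned list is non-empty is one way to keep a misconfigured instance out of production.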
Infrastructure
- PostgreSQL 16+ with SSL enabled
- Redis 7+ with authentication
- Self-hosted LiveKit server
- LiveKit SIP server (for phone calls)
- Load balancer with health checks
- DNS and SSL certificates
- Firewall rules for SIP/RTP ports
Telephony (SIP)
- Twilio account with Elastic SIP Trunking
- Phone numbers purchased in Twilio
- Outbound SIP trunk configured
- Firewall ports open (5060, 10000-20000)
Monitoring
- Application metrics (Prometheus)
- Log aggregation
- Error tracking (Sentry)
- Uptime monitoring
- Alerting configured
Environment Configuration
Required Variables
# Database (use connection pooling in production)
DATABASE_URL=postgresql+asyncpg://user:password@host:5432/agent_studio?ssl=require
# Redis (with auth)
REDIS_URL=redis://:password@host:6379/0
# Security - GENERATE SECURE VALUES!
JWT_SECRET=generate-a-secure-random-string-at-least-32-chars
ENCRYPTION_KEY=exactly-32-bytes-for-aes256-key!
# LiveKit (self-hosted)
LIVEKIT_URL=ws://localhost:7880
LIVEKIT_API_KEY=your-generated-api-key
LIVEKIT_API_SECRET=your-generated-api-secret
# SIP Telephony
LIVEKIT_SIP_TRUNK_ID=your-outbound-trunk-id
# Application
LOG_LEVEL=INFO
CORS_ORIGINS=https://dashboard.yourdomain.com,https://app.yourdomain.com
Generating Secure Keys
# JWT Secret (64 characters)
openssl rand -base64 48
# Encryption Key (32 bytes, base64)
openssl rand -base64 32
# LiveKit API Key (recommended format: APIxxxxxxxx)
echo "API$(openssl rand -hex 8)"
# LiveKit API Secret
openssl rand -base64 32
Architecture
Self-Hosted LiveKit + SIP Setup
┌─────────────────────────────────────────────────────────────────┐
│                       Your Infrastructure                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌──────────────┐                                              │
│   │ Load Balancer│                                              │
│   │   (Caddy)    │                                              │
│   │   :80/:443   │                                              │
│   └──────┬───────┘                                              │
│          │                                                      │
│      ┌───▼───┐     ┌─────────┐     ┌─────────┐                  │
│      │  API  │     │ Worker  │     │ Worker  │                  │
│      │ :8000 │     │   #1    │     │   #2    │                  │
│      └───────┘     └────┬────┘     └────┬────┘                  │
│                         │               │                       │
│                         ├───────────────┘                       │
│            ┌────────────▼────────────┐                          │
│            │         LiveKit         │                          │
│            │ :7880 (WS) :7881 (TCP)  │                          │
│            │       :7882 (UDP)       │                          │
│            └────────────┬────────────┘                          │
│                         │                                       │
│            ┌────────────▼────────────┐                          │
│            │       SIP Server        │                          │
│            │  :5060 (SIP signaling)  │                          │
│            │   :10000-20000 (RTP)    │                          │
│            └────────────┬────────────┘                          │
│                         │                                       │
│         ┌───────────────┴───────────────┐                       │
│     ┌───▼────┐                     ┌────▼─────┐                 │
│     │ Redis  │                     │PostgreSQL│                 │
│     │ :6379  │                     │  :5432   │                 │
│     └────────┘                     └──────────┘                 │
│                                                                 │
└──────────────────────────┬──────────────────────────────────────┘
                           │
                           │ SIP/RTP (ports 5060, 10000-20000)
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                             TWILIO                              │
│                      Elastic SIP Trunking                       │
│                                                                 │
│     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     │
│     │  Outbound   │     │   Inbound   │     │    Phone    │     │
│     │    Trunk    │     │    Trunk    │     │   Numbers   │     │
│     └─────────────┘     └─────────────┘     └─────────────┘     │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
                            PSTN Network
Port Requirements
| Port | Protocol | Service | Firewall |
|---|---|---|---|
| 80/443 | TCP | HTTPS (Caddy) | Public |
| 7881 | TCP | LiveKit RTC over TCP | Public |
| 7882 | UDP | LiveKit RTC over UDP | Public |
| 5060 | UDP/TCP | SIP Signaling | Public |
| 10000-20000 | UDP | RTP Media | Public |
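The port table can be turned into firewall rules mechanically. The sketch below renders the ports and protocols listed above as `ufw allow` commands. ufw is an assumption here (the guide does not name a firewall tool), and `ufw_rules` is a hypothetical helper, so adapt the output format to whatever firewall you actually run.

```python
# (port spec, protocols) pairs copied from the port requirements table
PUBLIC_PORTS = [
    ("80/443", "TCP"),
    ("7881", "TCP"),
    ("7882", "UDP"),
    ("5060", "UDP/TCP"),
    ("10000-20000", "UDP"),
]


def ufw_rules(ports):
    """Expand each table row into one `ufw allow` command per port and protocol."""
    rules = []
    for port_spec, protocols in ports:
        for port in port_spec.split("/"):
            port = port.replace("-", ":")  # ufw writes ranges as start:end
            for proto in protocols.lower().split("/"):
                rules.append(f"ufw allow {port}/{proto}")
    return rules


if __name__ == "__main__":
    print("\n".join(ufw_rules(PUBLIC_PORTS)))
```

Review the generated commands before running them; PostgreSQL (5432) and Redis (6379) are deliberately absent because they should stay on the private network.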
Scaling Guidelines
| Component | Scaling Strategy |
|---|---|
| API | Horizontal (stateless) |
| Worker | Horizontal (1 worker per ~10 concurrent calls) |
| PostgreSQL | Vertical + read replicas |
| Redis | Cluster mode |
| LiveKit | Vertical (single instance for moderate load) |
| SIP | Vertical (single instance) |
LiveKit Configuration
livekit.yaml
port: 7880
rtc:
  port_range_start: 7882
  port_range_end: 7882
  tcp_port: 7881
  use_external_ip: true
redis:
  address: localhost:6379
keys:
  APIxxxxxxxx: your-api-secret-here
logging:
  level: info
  json: true
SIP Server Environment
SIP_API_KEY=${LIVEKIT_API_KEY}
SIP_API_SECRET=${LIVEKIT_API_SECRET}
SIP_WS_URL=ws://localhost:7880
SIP_PORT=5060
SIP_RTP_PORT=10000-20000
SIP_REDIS_HOST=localhost:6379
SIP_USE_EXTERNAL_IP=true
SIP_LOGGING_LEVEL=info
Twilio SIP Trunk Setup
1. Create SIP Trunk
twilio api trunking v1 trunks create \
  --friendly-name "Agent Studio Production" \
  --domain-name "agent-studio-prod.pstn.twilio.com"
2. Configure Termination (Outbound)
In Twilio Console:
- Go to Voice > Credential Lists
- Create new credential list with username/password
- Go to Elastic SIP Trunking > Your Trunk > Termination
- Add credential list to Authentication
3. Configure Origination (Inbound)
twilio api trunking v1 trunks origination-urls create \
  --trunk-sid <trunk_sid> \
  --friendly-name "Agent Studio SIP" \
  --sip-url "sip:<your-vm-ip>:5060" \
  --weight 1 --priority 1 --enabled
4. Create Outbound Trunk in LiveKit
Save the following as outbound-trunk.json:
{
  "trunk": {
    "name": "Twilio Production",
    "address": "agent-studio-prod.pstn.twilio.com",
    "numbers": ["+91XXXXXXXXXX"],
    "auth_username": "<twilio-credential-username>",
    "auth_password": "<twilio-credential-password>"
  }
}
Then create the trunk:
lk sip outbound create outbound-trunk.json
# Save the returned trunk ID as LIVEKIT_SIP_TRUNK_ID
Database
Connection Pooling
Use PgBouncer or the application's built-in connection pool:
# In settings
DATABASE_POOL_SIZE=20
DATABASE_POOL_OVERFLOW=10
Migrations
Run migrations before deploying new versions:
# Using Alembic
alembic upgrade head
Backups
- Enable automated daily backups
- Test restore procedures
- Keep 30 days of backups minimum
Monitoring
Prometheus Metrics
Agent Studio exposes metrics at /metrics:
# API metrics
http_requests_total{method="GET", endpoint="/api/v1/agents", status="200"}
http_request_duration_seconds{method="GET", endpoint="/api/v1/agents"}
# Voice metrics
voice_calls_total{workflow="daily-call", status="completed", call_type="sip"}
voice_call_duration_seconds{workflow="daily-call"}
# SIP metrics
sip_calls_total{direction="outbound", status="answered"}
sip_call_setup_seconds{direction="outbound"}
# Provider metrics
provider_latency_seconds{provider="deepgram", type="stt"}
Logging
Structured JSON logging is enabled by default:
{
  "timestamp": "2026-01-17T10:30:00Z",
  "level": "INFO",
  "message": "Call completed",
  "call_id": "uuid",
  "call_type": "sip",
  "phone_number": "+91XXXXXXXXXX",
  "duration": 180,
  "request_id": "req-uuid"
}
Health Checks
Configure your load balancer to use:
- Liveness: GET /health/live (container running)
- Readiness: GET /health/ready (ready for traffic)
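The difference between the two endpoints can be illustrated with a framework-independent sketch. This is not Agent Studio's handler: the response bodies, the 503 status for "not ready", and the `checks` mapping of dependency probes are assumptions made for illustration. Only the two paths and their meaning come from this guide.

```python
from typing import Callable, Dict, Tuple


def health_response(path: str, checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Return (HTTP status, JSON body) for a health-check path."""
    if path == "/health/live":
        # Liveness: the process is running and able to answer at all.
        return 200, {"status": "alive"}
    if path == "/health/ready":
        # Readiness: every dependency probe must pass before taking traffic.
        results = {}
        for name, check in checks.items():
            try:
                results[name] = bool(check())
            except Exception:
                # A probe that raises counts as a failed dependency.
                results[name] = False
        ready = all(results.values())
        body = {"status": "ready" if ready else "not_ready", "checks": results}
        return (200 if ready else 503), body
    return 404, {"status": "not_found"}
```

Keeping liveness free of dependency checks means a database outage takes an instance out of rotation (readiness fails) without causing the orchestrator to restart a healthy container.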
Security Hardening
API Security
- Rate Limiting: Configure per-key limits
- Request Validation: All inputs are validated
- SQL Injection: Prevented via SQLAlchemy ORM
- XSS: API-only, no HTML rendering
Network Security
- TLS 1.3: Use modern TLS
- Private Networks: Keep DB/Redis internal
- Firewall Rules: Restrict access by IP
- SIP Security: Use TLS for SIP signaling when possible
SIP Security
- Use strong credentials for SIP trunk authentication
- Limit outbound calling to verified numbers
- Monitor for unusual call patterns
- Set rate limits on SIP calls
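Two of these points, restricting outbound calls to verified numbers and rate limiting, can be enforced in application code before a call is placed. The following is a minimal in-process sketch with a sliding-window limiter. `OutboundCallGuard` and its parameters are hypothetical, and a multi-worker deployment would need shared state (for example in Redis) rather than a per-process deque.

```python
import time
from collections import deque


class OutboundCallGuard:
    """Allow calls only to verified numbers, capped at max_calls per window."""

    def __init__(self, verified_numbers, max_calls, per_seconds, clock=time.monotonic):
        self.verified = set(verified_numbers)
        self.max_calls = max_calls
        self.per_seconds = per_seconds
        self.clock = clock
        self._recent = deque()  # timestamps of recently allowed calls

    def allow(self, number: str) -> bool:
        if number not in self.verified:
            return False
        now = self.clock()
        # Drop timestamps that have left the sliding window.
        while self._recent and now - self._recent[0] >= self.per_seconds:
            self._recent.popleft()
        if len(self._recent) >= self.max_calls:
            return False
        self._recent.append(now)
        return True
```

Rejected attempts are also worth counting, since a spike in rejections is one of the "unusual call patterns" to alert on.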
Secret Management
Use a secrets manager:
- AWS Secrets Manager
- HashiCorp Vault
- Kubernetes Secrets
# Example: AWS Secrets Manager
DATABASE_URL=$(aws secretsmanager get-secret-value --secret-id prod/db-url --query SecretString --output text)
High Availability
API Layer
- Run 3+ instances behind load balancer
- Use sticky sessions for WebSocket (if needed)
- Health check interval: 10s
- Unhealthy threshold: 3
Worker Layer
- Run one worker per ~10 concurrent calls (max concurrent calls / 10, rounded up)
- Workers are stateless (use Redis for state)
- Auto-scaling based on queue depth
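The sizing rule above is simple arithmetic, shown here as a small sketch. The function name and the floor of one worker are assumptions; the ratio of roughly 10 calls per worker comes from this guide.

```python
import math


def workers_needed(max_concurrent_calls: int, calls_per_worker: int = 10, minimum: int = 1) -> int:
    """Round up so a partially filled worker still gets provisioned."""
    if max_concurrent_calls < 0 or calls_per_worker <= 0:
        raise ValueError("call counts must be non-negative and calls_per_worker positive")
    return max(minimum, math.ceil(max_concurrent_calls / calls_per_worker))
```

For example, a peak of 25 concurrent calls needs 3 workers, not 2.5, and an idle system still keeps one worker running.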
LiveKit Layer
- Single LiveKit instance handles moderate load
- For high load: Consider LiveKit Cloud or multiple regions
- Monitor media server CPU and memory
Database
- PostgreSQL with streaming replication
- Automatic failover (Patroni, RDS Multi-AZ)
- Connection pooling
Deployment Strategies
Blue-Green Deployment
- Deploy new version to "green" environment
- Run smoke tests
- Switch load balancer to green
- Keep blue as rollback
Rolling Deployment
- Update instances one at a time
- Wait for health checks to pass
- Continue to next instance
- Automatic rollback on failures
Canary Deployment
- Route 5% traffic to new version
- Monitor error rates and latency
- Gradually increase to 100%
- Rollback if issues detected
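The canary steps above amount to a small decision rule: hold or advance the traffic share while metrics stay healthy, and drop to zero otherwise. A sketch under stated assumptions follows. Only the 5% starting share comes from this guide; the intermediate steps, the 1% error-rate limit, and the 500 ms p95 latency limit are illustrative placeholders.

```python
def next_canary_weight(
    current_percent: int,
    error_rate: float,
    p95_latency_ms: float,
    max_error_rate: float = 0.01,        # assumed threshold
    max_p95_latency_ms: float = 500.0,   # assumed threshold
    steps=(5, 25, 50, 100),              # assumed ramp; 5% is the documented start
) -> int:
    """Return the next traffic percentage for the new version; 0 means roll back."""
    if error_rate > max_error_rate or p95_latency_ms > max_p95_latency_ms:
        return 0
    for step in steps:
        if step > current_percent:
            return step
    return 100  # already fully rolled out
```

Running this after each observation window, with metrics scoped to the canary instances only, keeps a bad release from ever seeing more than the current step's share of traffic.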
Troubleshooting
Common Issues
High Latency
- Check database query performance
- Review provider API latencies
- Check Redis connectivity
Connection Errors
- Verify connection pool settings
- Check network connectivity
- Review firewall rules
SIP Call Failures
- Check SIP server logs: docker logs sip
- Verify trunk credentials in Twilio
- Check firewall allows SIP/RTP ports
- Review Twilio call logs in console
Memory Issues
- Monitor container memory usage
- Check for memory leaks in workers
- Adjust resource limits
Debug Mode
Enable debug logging temporarily:
LOG_LEVEL=DEBUG
Never enable debug logging in production for extended periods: it impacts performance and may log sensitive data.
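One way to limit the sensitive-data risk while debug logging is on is to mask phone numbers before records are written. The sketch below uses Python's standard `logging.Filter`. `RedactPhoneNumbers` and `mask_phone` are hypothetical names, and the pattern assumes numbers are logged in a `+` prefixed digit format like the examples in this guide.

```python
import logging
import re

# Assumption: numbers appear as "+" followed by 6 to 15 digits
PHONE_RE = re.compile(r"\+\d{6,15}")


def mask_phone(text: str) -> str:
    """Keep the first three characters and last two digits, mask the rest."""
    def _mask(match):
        number = match.group(0)
        return number[:3] + "*" * (len(number) - 5) + number[-2:]
    return PHONE_RE.sub(_mask, text)


class RedactPhoneNumbers(logging.Filter):
    """Rewrite each record's message with phone numbers masked."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so numbers passed as args are masked too.
        record.msg = mask_phone(record.getMessage())
        record.args = ()
        return True  # never drop the record, only rewrite it
```

Attach the filter to handlers rather than loggers (`handler.addFilter(RedactPhoneNumbers())`), because logger-level filters are not applied to records propagated from child loggers.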
Maintenance
Regular Tasks
| Task | Frequency |
|---|---|
| Rotate API keys | Quarterly |
| Rotate SIP credentials | Quarterly |
| Update dependencies | Monthly |
| Review access logs | Weekly |
| Test disaster recovery | Quarterly |
| Security audit | Annually |
Updates
- Review changelog for breaking changes
- Test in staging environment
- Schedule maintenance window
- Deploy with rollback plan
- Monitor for issues