ADR-012: Self-Hosted LiveKit with SIP Telephony
Decision to self-host LiveKit server with SIP capability for PSTN calls
Status
Accepted
Context
Agent Studio needs to support outbound phone calls (PSTN) in addition to WebRTC-based in-app calls. This requires SIP (Session Initiation Protocol) integration to bridge voice AI agents with traditional phone networks.
Options considered:
- LiveKit Cloud - Managed LiveKit service with built-in SIP support
- Self-Hosted LiveKit + SIP - Deploy LiveKit server and SIP service on own infrastructure
- Hybrid - Use LiveKit Cloud for WebRTC, separate SIP gateway for telephony
Requirements:
- Outbound calls to phone numbers via PSTN
- Integration with Twilio for SIP trunking
- Low latency for real-time voice conversations
- Cost-effective at scale
- Full control over infrastructure
Decision
Self-host LiveKit server with the LiveKit SIP service for full telephony capability.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                   Self-Hosted Infrastructure                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐      ┌──────────────┐      ┌──────────────┐   │
│  │   LiveKit    │◄────►│     SIP      │◄────►│   Workers    │   │
│  │   Server     │      │    Server    │      │   (Agents)   │   │
│  │              │      │              │      │              │   │
│  │  WebRTC      │      │  SIP/RTP     │      │  LiveKit     │   │
│  │  Signaling   │      │  Gateway     │      │  Agent SDK   │   │
│  └──────────────┘      └──────────────┘      └──────────────┘   │
│         │                     │                                 │
│         └─────────────────────┤                                 │
│                               │                                 │
└───────────────────────────────┼─────────────────────────────────┘
                                │ SIP Trunk
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                             TWILIO                              │
│                      Elastic SIP Trunking                       │
│                                                                 │
│     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     │
│     │  Outbound   │     │   Inbound   │     │    Phone    │     │
│     │    Trunk    │     │    Trunk    │     │   Numbers   │     │
│     └─────────────┘     └─────────────┘     └─────────────┘     │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
                          PSTN Network

Call Types
| Type | Protocol | Description |
|---|---|---|
| VoIP | WebRTC | User connects via browser/app microphone |
| SIP | SIP/RTP | Outbound call to phone number via Twilio |
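The two call types differ mainly in whether a phone number is needed. A minimal Python sketch of request validation follows, assuming a hypothetical `CallRequest` shape: only `call_type: "sip"` and `phone_number` appear in this ADR, while the `"voip"` label and the helper names are illustrative.

```python
import re
from dataclasses import dataclass
from typing import Optional

# E.164: "+" followed by 2 to 15 digits, the first of which is non-zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


@dataclass(frozen=True)
class CallRequest:
    """Hypothetical request shape, not the actual Agent Studio API schema."""
    call_type: str                      # "voip" (assumed label) or "sip"
    phone_number: Optional[str] = None  # only meaningful for SIP calls


def validate_call_request(req: CallRequest) -> None:
    """Raise ValueError if the request does not fit its call type."""
    if req.call_type == "voip":
        return  # WebRTC calls need no phone number
    if req.call_type == "sip":
        if req.phone_number is None or not E164_PATTERN.fullmatch(req.phone_number):
            raise ValueError("SIP calls need an E.164 phone number, e.g. +14155550100")
        return
    raise ValueError(f"unknown call_type: {req.call_type!r}")
```

Rejecting malformed numbers before a room is created avoids dispatching a worker for a call that the SIP trunk would refuse anyway.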
Port Requirements
| Port | Protocol | Service | Firewall |
|---|---|---|---|
| 7880 | TCP | LiveKit WebSocket | Internal (via reverse proxy) |
| 7881 | TCP | LiveKit RTC over TCP | Open to internet |
| 7882 | UDP | LiveKit RTC over UDP (WebRTC media) | Open to internet |
| 5060 | UDP/TCP | SIP Signaling | Open to internet |
| 10000-20000 | UDP | RTP Media | Open to internet |
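One way to keep firewall rules honest is to encode the table as data and probe the public TCP ports from outside the VM. The sketch below is an assumption-laden illustration: `PortRule`, `PORT_RULES`, and both helpers are hypothetical names, and UDP ports (media and RTP) cannot be verified with a plain TCP connect, so they are skipped.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class PortRule:
    ports: str      # single port or "start-end" range, as in the table
    protocol: str   # "TCP", "UDP", or "UDP/TCP"
    public: bool    # True where the table says "Open to internet"


PORT_RULES = [
    PortRule("7880", "TCP", public=False),
    PortRule("7881", "TCP", public=True),
    PortRule("7882", "UDP", public=True),
    PortRule("5060", "UDP/TCP", public=True),
    PortRule("10000-20000", "UDP", public=True),
]


def public_tcp_ports(rules):
    """Return single public ports that accept TCP, in table order."""
    ports = []
    for rule in rules:
        if rule.public and "TCP" in rule.protocol.split("/") and "-" not in rule.ports:
            ports.append(int(rule.ports))
    return ports


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `tcp_port_open(vm_ip, port)` for each entry of `public_tcp_ports(PORT_RULES)` from a machine outside the network gives a quick check that the firewall matches the table.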
Implementation
Docker Compose Services
services:
  # LiveKit Server
  livekit:
    image: livekit/livekit-server:latest
    network_mode: host
    volumes:
      - ./livekit.yaml:/etc/livekit.yaml:ro
    command: --config /etc/livekit.yaml

  # SIP Server
  sip:
    image: livekit/sip:latest
    network_mode: host
    environment:
      - SIP_API_KEY=${LIVEKIT_API_KEY}
      - SIP_API_SECRET=${LIVEKIT_API_SECRET}
      - SIP_WS_URL=ws://localhost:7880
      - SIP_PORT=5060
      - SIP_RTP_PORT=10000-20000
      - SIP_USE_EXTERNAL_IP=true

LiveKit Configuration
# livekit.yaml
port: 7880
rtc:
  port_range_start: 7882
  port_range_end: 7882
  tcp_port: 7881
  use_external_ip: true
redis:
  address: localhost:6379
logging:
  level: info

Twilio SIP Trunk Setup
- Create SIP Trunk in Twilio Console
  - Domain: agent-studio.pstn.twilio.com
- Configure Termination (Outbound)
  - Create credential list for authentication
  - Associate with trunk
- Configure Origination (Inbound)
  - Point to VM's public IP: sip:34.100.161.167:5060
- Create Outbound Trunk in LiveKit
{
  "trunk": {
    "name": "Twilio Outbound",
    "address": "agent-studio.pstn.twilio.com",
    "numbers": ["+91XXXXXXXXXX"],
    "auth_username": "<credential-username>",
    "auth_password": "<credential-password>"
  }
}

SIP Call Flow
- Backend calls Agent Studio API with call_type: "sip" and phone_number
- API creates LiveKit room and dispatches worker
- Worker receives job with phone number in metadata
- Worker calls CreateSIPParticipantRequest with:
  - sip_trunk_id: Outbound trunk ID
  - sip_call_to: Phone number (E.164 format)
  - room_name: LiveKit room name
- LiveKit SIP server initiates outbound call via Twilio
- When user answers, audio flows through LiveKit room
- Agent interacts with user normally
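The worker-side steps of this flow can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's actual worker code: the job metadata is assumed to be a JSON object with a `phone_number` key, `build_sip_participant_args` and `dial` are hypothetical helpers, and the `participant_identity` format is an arbitrary choice. The API call itself uses the `livekit-api` package, imported lazily so the pure helper can be exercised without it.

```python
import json


def build_sip_participant_args(job_metadata: str, room_name: str, trunk_id: str) -> dict:
    """Map job metadata (assumed JSON with "phone_number") to request fields."""
    phone_number = json.loads(job_metadata).get("phone_number")
    if not phone_number:
        raise ValueError("job metadata has no phone_number")
    return {
        "sip_trunk_id": trunk_id,        # outbound trunk ID
        "sip_call_to": phone_number,     # E.164 number to dial
        "room_name": room_name,          # room the callee joins
        # Identity of the callee inside the room; an illustrative convention.
        "participant_identity": f"sip-{phone_number}",
    }


async def dial(job_metadata: str, room_name: str, trunk_id: str) -> None:
    """Place the outbound call; needs livekit-api installed and LIVEKIT_* env vars set."""
    from livekit import api  # lazy import keeps the helper above dependency-free

    lkapi = api.LiveKitAPI()  # reads LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET
    try:
        await lkapi.sip.create_sip_participant(
            api.CreateSIPParticipantRequest(
                **build_sip_participant_args(job_metadata, room_name, trunk_id)
            )
        )
    finally:
        await lkapi.aclose()
```

Keeping the field mapping in a plain function makes it easy to unit-test without a running LiveKit server; only `dial` touches the network.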
Consequences
Positive
- Full control: Own infrastructure, no lock-in to a managed LiveKit provider
- Cost effective: No per-minute LiveKit Cloud charges at scale
- Low latency: Co-located services reduce network hops
- Data sovereignty: Voice data handled by LiveKit stays on your infrastructure (PSTN audio still transits Twilio)
- Customization: Can tune LiveKit/SIP configuration
Negative
- Operational complexity: Must manage LiveKit and SIP servers
- Port management: Need to open and secure multiple ports
- Scaling: Manual scaling of LiveKit infrastructure
- Monitoring: Must set up own observability
Mitigations
- Use Docker Compose for consistent deployments
- Document port requirements and firewall rules
- Monitor with Prometheus metrics from LiveKit
- Keep LiveKit Cloud as fallback option
Alternatives Considered
LiveKit Cloud
Pros:
- Zero infrastructure management
- Built-in SIP support
- Global edge network
Cons:
- Per-minute pricing at scale
- Less control over configuration
- Data leaves your infrastructure
Decision: Not chosen due to cost at scale and desire for full control.
Separate SIP Gateway (Kamailio/OpenSIPS)
Pros:
- Industry-standard SIP servers
- Highly configurable
Cons:
- Additional complexity
- Requires SIP expertise
- More components to manage
Decision: Not chosen. LiveKit SIP service provides sufficient functionality with simpler setup.