ADR-002: Provider Abstraction

Protocol-based provider system for STT/TTS/LLM/VAD

Status

Accepted

Context

Agent Studio supports multiple providers for each capability:

  • STT: Deepgram, Sarvam
  • TTS: Cartesia, Sarvam
  • LLM: Gemini, OpenAI
  • VAD: Silero

Each tenant can bring their own API keys (BYOK), and agents can override provider settings. We need a clean abstraction that:

  • Makes adding new providers easy
  • Supports runtime provider selection
  • Works with the LiveKit Agents SDK
  • Handles credential management

Options Considered

  1. Abstract Base Classes (ABC) - Traditional Python approach
  2. Protocol classes - Structural typing, more flexible
  3. Duck typing - No formal interface, fragile
  4. Plugin system - Complex, overkill for our needs
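
To make the difference between options 1 and 2 concrete, here is a minimal sketch. The class names (STTBase, STTLike, AbcBackedSTT, PlainSTT) are hypothetical and exist only for illustration; they are not part of the Agent Studio codebase.

```python
from abc import ABC, abstractmethod
from typing import Any, Protocol


class STTBase(ABC):
    """Option 1: nominal typing. Implementations must inherit from this class."""

    @abstractmethod
    def create(self, *, language: str = "en", **kwargs: Any) -> Any: ...


class STTLike(Protocol):
    """Option 2: structural typing. Any class with a matching create() conforms."""

    def create(self, *, language: str = "en", **kwargs: Any) -> Any: ...


class AbcBackedSTT(STTBase):
    def create(self, *, language: str = "en", **kwargs: Any) -> Any:
        return {"engine": "abc-backed", "language": language}


class PlainSTT:
    # No base class: this conforms to STTLike purely by its shape.
    def create(self, *, language: str = "en", **kwargs: Any) -> Any:
        return {"engine": "plain", "language": language}


def build(provider: STTLike, language: str) -> Any:
    # A static checker accepts both classes here, since both match STTLike.
    return provider.create(language=language)


abc_result = build(AbcBackedSTT(), "en")
plain_result = build(PlainSTT(), "hi")
```

With the ABC approach, PlainSTT would have to inherit from STTBase to be accepted; with a Protocol, a thin wrapper around a third-party plugin can conform without importing any base class.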

Decision

We will use Protocol classes with a Registry pattern:

# providers/base.py
from typing import Protocol, Any

class STTProvider(Protocol):
    """Speech-to-Text provider interface."""
    
    def create(self, *, language: str = "en", **kwargs) -> Any:
        """Create LiveKit-compatible STT instance."""
        ...

class TTSProvider(Protocol):
    """Text-to-Speech provider interface."""
    
    def create(self, *, voice_id: str, language: str = "en", **kwargs) -> Any:
        """Create LiveKit-compatible TTS instance."""
        ...

class LLMProvider(Protocol):
    """Large Language Model provider interface."""
    
    def create(self, *, model: str, temperature: float = 0.7, **kwargs) -> Any:
        """Create LiveKit-compatible LLM instance."""
        ...
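
# providers/registry.py
from .base import LLMProvider, STTProvider, TTSProvider


class ProviderNotFoundError(LookupError):
    """Raised when no provider is registered under the requested name."""
    # Defined here so the snippet is complete; the real module may import it.


class ProviderRegistry:
    # Class-level maps from provider name to provider class.
    _stt: dict[str, type[STTProvider]] = {}
    _tts: dict[str, type[TTSProvider]] = {}
    _llm: dict[str, type[LLMProvider]] = {}

    @classmethod
    def register_stt(cls, name: str):
        """Class decorator that registers an STT provider under `name`."""
        def decorator(provider_cls):
            cls._stt[name] = provider_cls
            return provider_cls
        return decorator

    @classmethod
    def get_stt(cls, name: str, api_key: str) -> STTProvider:
        """Instantiate the STT provider registered under `name`."""
        if name not in cls._stt:
            raise ProviderNotFoundError(f"Unknown STT provider: {name}")
        return cls._stt[name](api_key=api_key)

    # TTS and LLM registration/lookup are not shown; they are expected to mirror the STT methods.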
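
# providers/stt/deepgram.py
from typing import Any

from ..registry import ProviderRegistry


@ProviderRegistry.register_stt("deepgram")
class DeepgramSTT:
    def __init__(self, api_key: str):
        self.api_key = api_key

    def create(self, *, language: str = "en", **kwargs) -> Any:
        # Imported at call time, so the plugin is only loaded when this provider is used.
        from livekit.plugins import deepgram
        # Extra **kwargs are accepted to match STTProvider but are not forwarded.
        return deepgram.STT(api_key=self.api_key, language=language)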
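
The following self-contained sketch shows the registry flow end to end without requiring the LiveKit plugins. EchoSTT and the provider name "echo" are hypothetical stand-ins, and ProviderNotFoundError is assumed to be a LookupError subclass, since the ADR does not show its definition.

```python
from typing import Any, Protocol


class STTProvider(Protocol):
    def create(self, *, language: str = "en", **kwargs: Any) -> Any: ...


class ProviderNotFoundError(LookupError):
    """Assumed error type; the ADR does not show where it is defined."""


class ProviderRegistry:
    _stt: dict[str, type] = {}

    @classmethod
    def register_stt(cls, name: str):
        def decorator(provider_cls):
            cls._stt[name] = provider_cls
            return provider_cls
        return decorator

    @classmethod
    def get_stt(cls, name: str, api_key: str) -> STTProvider:
        if name not in cls._stt:
            raise ProviderNotFoundError(f"Unknown STT provider: {name}")
        return cls._stt[name](api_key=api_key)


@ProviderRegistry.register_stt("echo")  # hypothetical provider name
class EchoSTT:
    """Stand-in provider that returns a dict instead of a LiveKit STT object."""

    def __init__(self, api_key: str):
        self.api_key = api_key

    def create(self, *, language: str = "en", **kwargs: Any) -> Any:
        return {"provider": "echo", "language": language, "options": kwargs}


# Look the provider up by name at runtime, then build the STT object.
provider = ProviderRegistry.get_stt("echo", api_key="test-key")
stt = provider.create(language="hi", model="demo")

# Unknown names fail loudly instead of returning None.
try:
    ProviderRegistry.get_stt("missing", api_key="test-key")
except ProviderNotFoundError as exc:
    error_message = str(exc)
```

Registration happens when the provider module is imported, so in this design each provider module has to be imported somewhere before its name can be looked up.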

Provider Inheritance Chain

Platform Env → Tenant BYOK → Agent Override → Runtime

Later levels in the chain override earlier ones, so resolution starts at the most specific level:

  1. Check agent config for provider override
  2. Fall back to tenant's default for that provider type
  3. Fall back to platform environment variables
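
The three resolution steps can be sketched as a single function. Everything here is an assumption made for illustration: the ProviderChoice type, the shape of the config mappings, and the environment variable names (STT_PROVIDER, STT_API_KEY) are hypothetical, and the sketch assumes each level supplies both a provider name and its API key.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProviderChoice:
    name: str
    api_key: Optional[str]
    source: str  # which level supplied the choice: "agent", "tenant" or "platform"


def resolve_provider(
    kind: str,
    agent_config: Mapping[str, Mapping[str, str]],
    tenant_defaults: Mapping[str, Mapping[str, str]],
    env: Mapping[str, str],
) -> ProviderChoice:
    """Pick a provider for `kind` ("stt", "tts", "llm"), most specific level first."""
    # Steps 1 and 2: agent override, then tenant default.
    for source, layer in (("agent", agent_config), ("tenant", tenant_defaults)):
        entry = layer.get(kind)
        if entry and entry.get("provider"):
            return ProviderChoice(entry["provider"], entry.get("api_key"), source)

    # Step 3: platform environment variables (hypothetical naming scheme).
    prefix = kind.upper()
    name = env.get(f"{prefix}_PROVIDER")
    if name is None:
        raise LookupError(f"No {kind} provider configured at any level")
    return ProviderChoice(name, env.get(f"{prefix}_API_KEY"), "platform")


platform_env = {"STT_PROVIDER": "deepgram", "STT_API_KEY": "platform-key"}
tenant = {"stt": {"provider": "sarvam", "api_key": "tenant-key"}}
agent = {"stt": {"provider": "deepgram", "api_key": "agent-key"}}

from_agent = resolve_provider("stt", agent, tenant, platform_env)
from_tenant = resolve_provider("stt", {}, tenant, platform_env)
from_platform = resolve_provider("stt", {}, {}, platform_env)
```

The resolved name and key could then be passed to ProviderRegistry.get_stt. In practice `env` would typically be os.environ; passing it in as a mapping keeps the function easy to test.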

Consequences

Positive

  • Clean separation of interface and implementation
  • Easy to add new providers (just implement Protocol)
  • No inheritance hierarchy to maintain
  • Works with mypy static type checking
  • Runtime provider selection based on configuration

Negative

  • typing.Protocol requires Python 3.8+ (not an issue, since we target 3.11+)
  • Less IDE support compared to ABC
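  • No runtime interface checking by default (only static); @runtime_checkable adds isinstance() support, but it verifies only that methods exist, not their signatures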
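
The limit on runtime checking can be seen in a short sketch. It uses typing.runtime_checkable from the standard library; the three example classes are hypothetical.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class STTProvider(Protocol):
    def create(self, *, language: str = "en", **kwargs: Any) -> Any: ...


class GoodSTT:
    def create(self, *, language: str = "en", **kwargs: Any) -> Any:
        return f"stt:{language}"


class WrongSignatureSTT:
    # Has a create() method, but with an incompatible signature.
    def create(self) -> Any:
        return "stt"


class NotAProvider:
    pass


# isinstance() only checks that a `create` attribute exists.
good_ok = isinstance(GoodSTT(), STTProvider)                 # True
wrong_sig_ok = isinstance(WrongSignatureSTT(), STTProvider)  # True: signature is not checked
missing_ok = isinstance(NotAProvider(), STTProvider)         # False
```

A static checker such as mypy would reject WrongSignatureSTT wherever an STTProvider is expected, which is why static checking remains the main safeguard for this design.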

Neutral

  • Registry is a process-wide singleton with class-level state (simple, but tests must reset or patch it)
  • Provider instances are created per-call (stateless)
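
Because the registry state lives in class-level dicts, a provider registered in one test stays visible to every later test. One possible way to contain this, sketched below, is unittest.mock.patch.dict from the standard library, which restores the dict when the block exits. The ProviderRegistry here is a trimmed stand-in for the registry above (it omits the unknown-name check), and FakeSTT is hypothetical.

```python
from typing import Any
from unittest.mock import patch


class ProviderRegistry:
    """Trimmed stand-in for the registry above: class-level, shared state."""

    _stt: dict[str, type] = {}

    @classmethod
    def get_stt(cls, name: str, api_key: str):
        return cls._stt[name](api_key=api_key)


class FakeSTT:
    def __init__(self, api_key: str):
        self.api_key = api_key

    def create(self, *, language: str = "en", **kwargs: Any) -> Any:
        return f"fake-stt:{language}"


def run_with_fake_provider() -> str:
    # clear=True hides any real registrations for the duration of the block;
    # patch.dict restores the original contents of _stt on exit.
    with patch.dict(ProviderRegistry._stt, {"fake": FakeSTT}, clear=True):
        provider = ProviderRegistry.get_stt("fake", api_key="test-key")
        return provider.create(language="en")


result = run_with_fake_provider()
leaked = "fake" in ProviderRegistry._stt  # False: the fake did not leak out
```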