Convonet Voice AI Productivity System

Google Cloud Run · FastAPI microservices · WebRTC/WebSocket voice (no LiveKit) · Multi-LLM (Claude, Gemini, OpenAI) · Domain Agents (Productivity, Mortgage, Healthcare, Restaurant/Hanok) · MCP Tools · Agent Monitor · Call Transfer (Twilio/FusionPBX) · Sentry

FastAPI · Google Cloud Run · LangGraph · MCP (38 Tools) · LangChain · Claude · Gemini · OpenAI · WebSocket Voice · Google Calendar · Twilio Voice · Agent Monitor · Team Collaboration · FusionPBX · Deepgram STT/TTS · ElevenLabs TTS · Cartesia TTS · Sentry · Redis · Composio (opt)

Technical Architecture

System Architecture Overview

Complete System Flow Diagrams

System Architecture Diagram

Complete system flow overview with all components and their relationships

Sequence Diagram

Step-by-step flow (52 steps) showing interactions between all components

Voice Flow (GCP): Browser → voice-gateway WebSocket (/webrtc/ws) → PIN Auth (PostgreSQL or env) → Batch STT (Deepgram) → agent-llm-service HTTP (/agent/process) → LangGraph · Multi-LLM · MCP Tools → TTS (Deepgram) → transcript_final, agent_final, audio_chunk → User. No LiveKit on GCP.
Transfer Flow: User request → agent-llm (transfer intent) → voice-gateway (Twilio) → FusionPBX (Extension 2001) → Agent Dashboard (JsSIP at /call-center) → Live conversation

Google Cloud Run Microservices

Convonet runs as five FastAPI microservices on Google Cloud Run, exposed under a single domain v2.convonetai.com (plus optional Hanok hostname) with path-based routing.

  • voice-gateway-service: WebSocket /webrtc/ws, Twilio webhooks; STT → agent-llm HTTP → TTS (no LiveKit).
  • agent-llm-service: POST /agent/process, LangGraph, Multi-LLM, MCP tools, intent routing (Todo/Mortgage/Healthcare/Hanok).
  • call-center-service: Landing, /call-center, /voice_assistant, /mortgage_dashboard, /agent-monitor, /tool-execution, doc pages.
  • crm-integration-service: SuiteCRM (patient, meeting, case, note).
  • hanok-table-service: Restaurant reservation APIs/webhooks (/api/reservations/*, /webhooks/hanok_table/*).

Path-Based Routing (v2.convonetai.com)

/ , /call-center, /voice_assistant, /agent-monitor, ... → call-center-service
/webrtc/* , /twilio/* → voice-gateway-service
/agent/* , /convonet_todo/* → agent-llm-service
/patient/* , /meeting/create, ... → crm-integration-service
hanok.convonetai.com (or /hanok/*) → hanok-table-service
Single domain · no CORS
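
The routing table above can be read as a prefix lookup. The following is a minimal sketch of that mapping in plain Python, for illustration only: the function name resolve_service, the fallback to call-center-service for unmatched paths, and the exact prefix strings are assumptions, and the real routing is done by the Cloud Run domain/path configuration rather than by application code.

```python
# Illustrative prefix table; order matters only if prefixes overlap.
# Note the trailing slash on "/agent/": without it, "/agent-monitor"
# would be sent to agent-llm-service by mistake.
ROUTES = [
    ("/webrtc/", "voice-gateway-service"),
    ("/twilio/", "voice-gateway-service"),
    ("/agent/", "agent-llm-service"),
    ("/convonet_todo/", "agent-llm-service"),
    ("/patient/", "crm-integration-service"),
    ("/meeting/create", "crm-integration-service"),
    ("/hanok/", "hanok-table-service"),
]

HANOK_HOST = "hanok.convonetai.com"


def resolve_service(path: str, host: str = "v2.convonetai.com") -> str:
    """Return the service name that would handle the given host and path."""
    # The optional Hanok hostname maps every path to hanok-table-service.
    if host == HANOK_HOST:
        return "hanok-table-service"
    for prefix, service in ROUTES:
        if path.startswith(prefix):
            return service
    # Assumption: landing page, dashboards, and doc pages are the default.
    return "call-center-service"
```

For example, resolve_service("/agent/process") selects agent-llm-service, while resolve_service("/agent-monitor") falls through to call-center-service.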
Monitoring: All Operations → Sentry Monitoring → Real-time Alerts & Performance Tracking

Architecture Flow: Five FastAPI microservices on Google Cloud Run (v2.convonetai.com + hanok.convonetai.com). Voice via WebSocket to voice-gateway (no LiveKit): PIN auth (PostgreSQL or env), batch STT/TTS (Deepgram), HTTP to agent-llm for LangGraph and multi-LLM (Claude, Gemini, OpenAI), domain agents (Productivity, Mortgage, Healthcare, Hanok), MCP tools, and Hanok reservation APIs on hanok-table-service. Agent Monitor (tool calls, voice_timing) and call transfer (Twilio → FusionPBX). Sentry for monitoring.

Overview

The Convonet Voice AI Productivity System is an enterprise-grade platform running on Google Cloud Run as five FastAPI microservices. It combines multi-LLM AI (Claude, Gemini, OpenAI), WebSocket-based voice (no LiveKit on GCP), domain-specific agents (Productivity, Mortgage, Healthcare, Restaurant/Hanok), and intelligent call center integration. Voice flows through the voice-gateway WebSocket at /webrtc/ws: PIN authentication (PostgreSQL or env), batch STT/TTS (Deepgram), and HTTP to agent-llm for LangGraph and MCP tools.

The system enables teams to manage todos, mortgage workflows, healthcare intakes, and Hanok restaurant reservations via voice and web dashboards, with seamless transfer to human agents (Twilio → FusionPBX). Agent Monitor shows tool calls and voice response timing. Services are exposed under v2.convonetai.com (and optionally hanok.convonetai.com) with path-based routing/custom-domain mapping; Redis and PostgreSQL (e.g. Render) back session and application data.

Core Technologies

  • FastAPI microservices on Google Cloud Run
  • LangGraph for agent orchestration; Multi-LLM: Claude, Gemini, OpenAI
  • Model Context Protocol (MCP): domain tools (todo, mortgage, healthcare, hanok reservations, transfer)
  • WebSocket voice: voice-gateway at /webrtc/ws (no LiveKit on GCP)
  • Deepgram batch STT and TTS; optional ElevenLabs, Cartesia
  • PostgreSQL (e.g. Render): users_anthropic, todos, mortgage
  • PIN authentication (users_anthropic.voice_pin or VOICE_PIN env)
  • Agent Monitor: tool calls and voice_timing (Redis-backed)
  • Twilio and FusionPBX for call transfer; JsSIP at /call-center
  • Single domain v2.convonetai.com with path-based routing
  • Sentry error monitoring

Key Features

  • Domain agents: Productivity (Todo), Mortgage, Healthcare, Restaurant/Hanok (intent-based routing)
  • WebSocket voice pipeline: STT → agent-llm HTTP → TTS; metadata (voice_timing) for Agent Monitor
  • Agent Monitor: real-time tool calls and voice response timing (call-center-service)
  • PIN-based voice authentication (DB or env)
  • Call transfer: AI → Twilio → FusionPBX Extension 2001 → JsSIP dashboard
  • 38 MCP tools; mortgage dashboard and tool-execution dashboard
  • Call center UI at /call-center (SIP config via SIP_DOMAIN, SIP_WSS_PORT)
  • CRM integration service for SuiteCRM (patient, meeting, case, note)
  • Production: Cloud Run with scale-to-zero; Redis and PostgreSQL external

Recent Updates & Improvements (February 2026)

GCP Cloud Run Microservices

  • ✓ Five FastAPI services: voice-gateway, agent-llm, call-center, crm-integration, hanok-table
  • ✓ Single domain v2.convonetai.com with path-based routing
  • ✓ Voice via WebSocket /webrtc/ws (no LiveKit on GCP)
  • ✓ Agent-llm: /agent/process, LangGraph, MCP tools
  • ✓ Scale-to-zero; Redis and PostgreSQL (e.g. Render) external
Files: voice_gateway_service.py, agent_llm_service.py, call_center_service.py, cloudbuild.yaml

Call Transfer to FusionPBX

  • ✓ Seamless AI → Human agent transfer
  • ✓ FusionPBX extension 2001 integration
  • ✓ SIP/WSS connectivity (Google Cloud VM)
  • ✓ Transfer detection via phrases or tool
  • ✓ Department routing (support, sales, etc.)
Files: call_transfer.py, CALL_TRANSFER_GUIDE.md

WebSocket Voice + Agent Monitor

  • ✓ Voice-gateway WebSocket /webrtc/ws (no LiveKit on GCP)
  • ✓ Batch STT (Deepgram) → agent-llm HTTP → TTS (Deepgram)
  • ✓ Metadata (voice_timing, source) sent to agent for Agent Monitor
  • ✓ Domain agents: Productivity, Mortgage, Healthcare, Restaurant/Hanok (intent routing)
  • ✓ Agent Monitor: tool calls and voice response timing (Redis)
Files: voice_gateway_service.py, agent_monitor.py, routes.py, deepgram/

Composio (optional)

  • Optional external tools (Slack, GitHub, Gmail, Notion, etc.)
  • Not required for GCP core voice/agent/call-center flows

Sentry Integration

  • ✓ Real-time error tracking & alerts
  • ✓ Performance monitoring (agent processing time)
  • ✓ User context & session tracking
  • ✓ Timeout & thread reset tracking
  • ✓ Production-grade observability
Integration: FastAPI (agent-llm, voice-gateway) + Logging

Timeout Optimization

  • ✓ Tool timeout: 8s (from 20s)
  • ✓ Agent timeout: 10s (from 25s)
  • ✓ Webhook timeout: 12s (from 30s)
  • ✓ Stays under Twilio's 15s HTTP limit
  • ✓ Thread reset on timeout prevents errors
Result: 95%+ operations complete successfully
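
The nested budgets above (tool inside agent inside webhook, all under Twilio's 15s limit) could be sketched with asyncio.wait_for as follows. This is a sketch under stated assumptions, not the code in routes.py: the Budgets class, the function names, and the fallback strings are hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass(frozen=True)
class Budgets:
    """Timeout budgets in seconds; defaults mirror the values listed above."""
    tool: float = 8.0
    agent: float = 10.0
    webhook: float = 12.0
    twilio_limit: float = 15.0

    def validate(self) -> None:
        # Each inner budget must expire before the one that wraps it.
        if not (self.tool < self.agent < self.webhook < self.twilio_limit):
            raise ValueError("budgets must satisfy tool < agent < webhook < Twilio limit")


TOOL_FALLBACK = "That tool took too long to respond."        # hypothetical wording
AGENT_FALLBACK = "Sorry, I could not finish that in time."   # hypothetical wording


async def call_tool(tool: Callable[[], Awaitable[Any]], budgets: Budgets) -> Any:
    """Run one tool call; a slow tool yields a fallback instead of an exception."""
    try:
        return await asyncio.wait_for(tool(), timeout=budgets.tool)
    except asyncio.TimeoutError:
        return TOOL_FALLBACK


async def handle_webhook(tool: Callable[[], Awaitable[Any]], budgets: Budgets = Budgets()) -> Any:
    """Wrap the agent turn in the agent budget, then in the webhook budget."""
    budgets.validate()

    async def agent_turn() -> Any:
        return await asyncio.wait_for(call_tool(tool, budgets), timeout=budgets.agent)

    try:
        return await asyncio.wait_for(agent_turn(), timeout=budgets.webhook)
    except asyncio.TimeoutError:
        return AGENT_FALLBACK
```

Because the tool budget is the smallest, a stalled tool is cut off first and the caller still receives a spoken fallback well before the outer limits are reached.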

WebRTC Call Center

  • ✓ JsSIP v3.10.1 browser softphone
  • ✓ WebSocket Secure (WSS) on port 7443
  • ✓ Agent dashboard with SIP registration
  • ✓ Call control (answer, hold, transfer, hangup)
  • ✓ Google Cloud firewall configured
Platform: FusionPBX 34.26.59.14 (GCP VM)

Automatic Error Recovery

  • ✓ Thread reset with timestamped IDs
  • ✓ BrokenResourceError handling
  • ✓ tool_call_id incomplete error recovery
  • ✓ In-memory reset tracking (_reset_threads)
  • ✓ No cascading failures
Benefit: Self-healing conversation threads
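
As an illustration of thread reset with timestamped IDs, the sketch below keeps an in-memory map from a base thread ID to its current replacement. Only the name _reset_threads comes from the list above; the ID format, the helper names, and the idea of resolving the active ID on every turn are assumptions.

```python
import time
from typing import Dict, Optional

# In-memory reset tracking: base thread id -> replacement id now in use.
# Being in-memory, this map is lost whenever the service instance restarts.
_reset_threads: Dict[str, str] = {}


def active_thread_id(base_id: str) -> str:
    """Return the thread id to use for the next agent turn."""
    return _reset_threads.get(base_id, base_id)


def reset_thread(base_id: str, now: Optional[float] = None) -> str:
    """Abandon the current thread and start a fresh one with a timestamped id.

    Intended to be called after a timeout or a corrupted-history error, so
    the next turn does not replay the broken conversation state.
    """
    stamp = int(time.time() if now is None else now)
    new_id = f"{base_id}-reset-{stamp}"  # hypothetical id format
    _reset_threads[base_id] = new_id
    return new_id
```

A caller would pass active_thread_id(user_thread) as the conversation thread on each turn and call reset_thread(user_thread) in its timeout or error handler.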

Performance Optimization

  • ✓ Removed Google Calendar sync delay
  • ✓ Simplified JSON responses (no MCP breaks)
  • ✓ Agent processing time measurement
  • ✓ Transaction tracking per voice call
  • ✓ Custom Sentry metrics & measurements
Result: Sub-5s response times
At a glance: FusionPBX extension 2001 · STT API: Deepgram · 38 MCP tools · 12s max response time · 100% Sentry trace rate

GCP WebSocket Voice Architecture

On Google Cloud Run, voice uses FastAPI WebSocket only—no LiveKit. The browser connects to voice-gateway-service at /webrtc/ws. After PIN auth (PostgreSQL or env), the pipeline runs: batch STT (Deepgram) → HTTP POST to agent-llm-service → TTS (Deepgram) → transcript_final, agent_final, and audio_chunk back to the client. Domain agents (Productivity, Mortgage, Healthcare, Hanok) and Agent Monitor (tool calls, voice_timing) are unchanged.
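
One turn of the pipeline described above can be sketched as a single coroutine that takes the three stages as injected callables. The event names transcript_final, agent_final, and audio_chunk come from this document; the message field names (type, text, audio), the chunk size, and the function name are assumptions, and the stubs stand in for Deepgram and the HTTP call to agent-llm-service.

```python
import asyncio
import base64
from typing import Awaitable, Callable, Dict, List

SpeechToText = Callable[[bytes], Awaitable[str]]
Agent = Callable[[str], Awaitable[str]]
TextToSpeech = Callable[[str], Awaitable[bytes]]


async def process_utterance(
    audio: bytes,
    stt: SpeechToText,
    agent: Agent,
    tts: TextToSpeech,
    chunk_size: int = 32_000,
) -> List[Dict[str, str]]:
    """Run batch STT -> agent -> TTS and return the messages for the client."""
    transcript = await stt(audio)                      # batch STT
    messages = [{"type": "transcript_final", "text": transcript}]

    reply = await agent(transcript)                    # HTTP to agent-llm in production
    messages.append({"type": "agent_final", "text": reply})

    speech = await tts(reply)                          # TTS audio bytes
    for start in range(0, len(speech), chunk_size):
        chunk = speech[start:start + chunk_size]
        # Audio is sent as Base64 text so it fits in JSON WebSocket frames.
        messages.append({"type": "audio_chunk", "audio": base64.b64encode(chunk).decode("ascii")})
    return messages
```

In a WebSocket handler, each returned message would be sent to the browser in order; keeping the stages injectable makes the flow testable without network access.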

/voice_assistant  · WebSocket: /webrtc/ws (voice-gateway-service)

Voice Assistant Architecture (GCP)

WebSocket Voice Processing Flow
Browser (getUserMedia) → Voice Gateway (WebSocket /webrtc/ws) → PIN Auth (PostgreSQL / env) → Deepgram (Batch STT) → Agent LLM (HTTP /agent/process) → Deepgram (TTS) → Browser (Playback)

Flow: Browser → voice-gateway WebSocket → PIN Auth (PostgreSQL or env) → Batch STT (Deepgram) → agent-llm-service HTTP → LangGraph · Multi-LLM · MCP Tools → TTS (Deepgram) → transcript_final, agent_final, audio_chunk → User. No LiveKit on GCP.

Phase 1: Authentication

WebSocket connect, PIN auth (PostgreSQL or VOICE_PIN env)

Phase 2: Conversation Loop

Record → Batch STT → agent-llm HTTP → TTS → playback

Phase 3: Transfer Request

User requests transfer; agent-llm returns transfer_marker

Phase 4: Twilio Transfer

Voice-gateway → Twilio → FusionPBX → Agent Dashboard (JsSIP)

GCP architecture: Voice uses FastAPI WebSocket to voice-gateway-service only. No LiveKit, no Socket.IO. PIN (PostgreSQL or env), batch STT/TTS (Deepgram), HTTP to agent-llm for LangGraph and MCP tools. Agent Monitor (tool calls, voice_timing) and transfer to FusionPBX via Twilio unchanged.
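
The PIN check (PostgreSQL or env) might look like the sketch below. It assumes the database is consulted first and the VOICE_PIN environment variable acts as a fallback; this document does not state the order. The lookup callable stands in for a query against users_anthropic.voice_pin, and the "env-user" identity is hypothetical.

```python
import hmac
import os
from typing import Callable, Mapping, Optional


def authenticate_pin(
    pin: str,
    lookup_user_id: Callable[[str], Optional[str]],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return a user id for a valid PIN, or None if authentication fails."""
    pin = pin.strip()
    if not pin:
        return None

    # 1. Database path: a stand-in for a lookup on users_anthropic.voice_pin.
    user_id = lookup_user_id(pin)
    if user_id is not None:
        return user_id

    # 2. Env path: constant-time comparison against VOICE_PIN, if it is set.
    expected = environ.get("VOICE_PIN", "")
    if expected and hmac.compare_digest(pin.encode("utf-8"), expected.encode("utf-8")):
        return "env-user"  # hypothetical identity for env-based logins
    return None
```

Using hmac.compare_digest for the env comparison avoids leaking how many leading characters matched through response timing.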

Component Interaction (GCP)
Component | Input From | Output To | Purpose
User Browser | User voice | Voice Gateway (WS) | Captures audio (MediaRecorder), displays UI
Voice Gateway | Browser, Deepgram, Agent LLM | Browser, Deepgram, Agent LLM | WebSocket /webrtc/ws, STT→agent→TTS pipeline
PIN Auth | Voice Gateway | PostgreSQL (or env) | Validates PIN (users_anthropic or VOICE_PIN)
Deepgram STT/TTS | Voice Gateway | Voice Gateway | Batch transcription and TTS
Agent LLM Service | Voice Gateway (HTTP) | Voice Gateway, Redis | LangGraph, Multi-LLM, MCP tools, Agent Monitor
Twilio / FusionPBX | Voice Gateway | Agent Dashboard | Call transfer to JsSIP
Agent Dashboard | FusionPBX | User | JsSIP at /call-center

WebSocket Voice Interface

  • ✓ Browser: getUserMedia + MediaRecorder (WebM)
  • ✓ FastAPI WebSocket at /webrtc/ws
  • ✓ Base64 audio chunks; transcript_final, agent_final, audio_chunk
Technology: FastAPI WebSocket (no LiveKit on GCP)

Session & Agent Monitor

  • ✓ In-memory session state (voice-gateway)
  • ✓ Redis: agent-llm writes interactions (tool_calls, voice_timing)
  • ✓ Call-center serves /agent-monitor APIs from Redis
Integration: Redis for Agent Monitor (call-center reads)
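
The write side (agent-llm) and read side (call-center) of Agent Monitor could be sketched as a capped Redis list. The Redis key, the record fields, and the cap of 200 entries are assumptions; the client argument is any object exposing redis-py style lpush, ltrim, and lrange methods.

```python
import json
import time
from typing import Any, Dict, List, Optional

INTERACTIONS_KEY = "agent_monitor:interactions"  # hypothetical key name
MAX_INTERACTIONS = 200                           # hypothetical retention cap


def build_interaction(
    user_text: str,
    agent_text: str,
    tool_calls: List[Dict[str, Any]],
    voice_timing: Dict[str, float],
    now: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble one Agent Monitor record (field names are illustrative)."""
    return {
        "timestamp": time.time() if now is None else now,
        "user_text": user_text,
        "agent_text": agent_text,
        "tool_calls": tool_calls,
        "voice_timing": voice_timing,
    }


def record_interaction(client: Any, interaction: Dict[str, Any]) -> None:
    """Written by agent-llm: push newest first and trim to the cap."""
    client.lpush(INTERACTIONS_KEY, json.dumps(interaction))
    client.ltrim(INTERACTIONS_KEY, 0, MAX_INTERACTIONS - 1)


def recent_interactions(client: Any, limit: int = 20) -> List[Dict[str, Any]]:
    """Read by call-center: return the newest records for the dashboard."""
    return [json.loads(raw) for raw in client.lrange(INTERACTIONS_KEY, 0, limit - 1)]
```

Trimming on every write keeps the list bounded, so the dashboard read stays cheap regardless of call volume.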

Composio (optional)

  • Optional external tools; not required for GCP core flows

Module Structure (GCP Cloud Run)

Deployment & Services (v2.convonetai.com + hanok.convonetai.com)

Project Root/
├── cloudbuild.yaml                 # Build & deploy all 5 services to Cloud Run
├── cloudbuild-hanok-table.yaml     # Deploy only hanok-table-service
├── cloudbuild-callcenter.yaml      # Deploy only call-center-service
├── docker/                         # Per-service Dockerfiles
│   ├── voice-gateway.Dockerfile
│   ├── agent-llm.Dockerfile
│   ├── call-center.Dockerfile
│   ├── crm-integration.Dockerfile
│   └── hanok-table.Dockerfile
├── hanok_table/                    # Hanok reservation FastAPI package
├── templates/                      # Served by call-center-service
│   ├── index.html, voice_assistant.html, call_center.html
│   ├── agent_monitor_dashboard.html, mortgage_dashboard.html
│   ├── convonet_tech_spec.html, convonet_system_architecture.html, convonet_sequence_diagram.html
│   └── ...
└── convonet/
    ├── voice_gateway_service.py    # FastAPI: /webrtc/ws, /twilio/* (STT→agent-llm→TTS, transfer)
    ├── agent_llm_service.py        # FastAPI: /agent/process, /convonet_todo/* (LangGraph, MCP, mortgage)
    ├── call_center_service.py      # FastAPI: /, /call-center, /voice_assistant, /agent-monitor, docs
    ├── routes.py                   # Agent logic, LangGraph, tool execution (used by agent_llm_service)
    ├── agent_monitor.py            # Redis-backed interactions (tool_calls, voice_timing)
    ├── redis_manager.py            # Redis session, Agent Monitor, transfer context cache
    ├── deepgram/                   # STT/TTS (used by voice_gateway_service)
    ├── mcps/local_servers/          # MCP tools (db_todo, db_mortgage, db_hanok_table, call_transfer, etc.)
    ├── models/                     # SQLAlchemy (user_models: users_anthropic, voice_pin)
    └── ...                         # Legacy Flask/Socket.IO/LiveKit code not used on GCP

Path-based routing on v2.convonetai.com (and/or custom host mapping for hanok.convonetai.com) directs requests to the appropriate Cloud Run service.

Database (PostgreSQL)

Used by agent-llm-service (e.g. Render.com PostgreSQL). Connection via DB_URI; optional RENDER_POSTGRES_HOST_SUFFIX for Render host normalization.

users_anthropic

Voice PIN auth; voice_pin column for PIN validation.

todos_convonet

Todo CRUD via MCP/db_todo.

reminders_convonet

Reminders via MCP.

Mortgage / Healthcare

Applications, patients, etc. (see mcps/local_servers).

Full schema and migrations are in convonet/models/ and migrations/.

LangGraph Agent Architecture

LangGraph Workflow Diagram

LangGraph Workflow: The agent can either continue to use tools or end the conversation based on user input and context.

Agent Components

  • TodoAgent Class: Main agent orchestrator with lazy initialization
  • StateGraph: Manages conversation flow and state
  • Assistant Node: LLM reasoning and response generation (Claude, Gemini, or OpenAI)
  • Tool Node: Executes 38 MCP tools
  • Conditional Edges: Routes between nodes based on tool calls
  • InMemorySaver: Checkpointer for state persistence
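
The conditional edge between the Assistant Node and the Tool Node reduces to one decision: if the last assistant message contains tool calls, run them and loop back; otherwise end the turn. The sketch below shows that loop in plain Python as a stand-in for the LangGraph StateGraph, not as LangGraph API; the message shape, function names, and the max_steps guard are assumptions.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
State = Dict[str, List[Message]]


def route_after_assistant(state: State) -> str:
    """Return "tools" if the last message requests tool calls, else "end"."""
    messages = state.get("messages", [])
    if not messages:
        return "end"
    return "tools" if messages[-1].get("tool_calls") else "end"


def run_turn(
    state: State,
    assistant: Callable[[State], Message],
    run_tool: Callable[[Dict[str, Any]], str],
    max_steps: int = 5,
) -> State:
    """Alternate assistant and tool steps until the assistant stops calling tools."""
    for _ in range(max_steps):  # guard against an assistant that never stops
        state["messages"].append(assistant(state))
        if route_after_assistant(state) == "end":
            break
        for call in state["messages"][-1]["tool_calls"]:
            # Each tool result is tied back to the call that requested it.
            state["messages"].append(
                {"role": "tool", "tool_call_id": call["id"], "content": run_tool(call)}
            )
    return state
```

Every tool call must receive a matching tool message before the assistant runs again; an interrupted turn that leaves a call unanswered is the kind of incomplete tool_call_id state that the thread reset mechanism recovers from.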

State Management

  • AgentState: Conversation state with message history
  • Message History: Maintains context across turns
  • Customer ID: User identification for multi-tenant
  • Thread ID: Conversation thread tracking
  • Lazy Loading: Prevents circular imports
  • ExceptionGroup Handling: Robust error recovery

Model Context Protocol (MCP) Integration

MCP provides a standardized way for AI agents to interact with external tools and services. On GCP, agent-llm-service uses MCP to expose domain tools: database operations (todos, reminders, calendar), team management, mortgage/healthcare, Hanok reservation operations (via hanok-table-service), call transfer to FusionPBX, and Google Calendar. STT/TTS are handled by voice-gateway (Deepgram), not MCP.

MCP Server Configuration

{
  "mcpServers": {
    "db": {
      "command": "python",
      "args": ["./convonet/mcps/local_servers/db_todo.py"],
      "transport": "stdio",
      "env": {
        "DB_URI": "${DB_URI}",
        "GOOGLE_OAUTH2_TOKEN_B64": "${GOOGLE_OAUTH2_TOKEN_B64}",
        "GOOGLE_CLIENT_ID": "${GOOGLE_CLIENT_ID}",
        "GOOGLE_CLIENT_SECRET": "${GOOGLE_CLIENT_SECRET}"
      }
    }
  }
}
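
The ${DB_URI}-style values in the configuration above are placeholders that must be filled from the environment before the MCP server process is started. How the project's loader does this is not shown here; the following is a minimal sketch of one way to expand them, with the function name and the fail-fast behavior on missing variables as assumptions.

```python
import re
from typing import Any, Mapping

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value: Any, environ: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} in strings with environ[NAME].

    Raises KeyError for an unset variable so a misconfigured deployment
    fails at startup rather than passing an empty DB_URI to a tool server.
    """
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda match: environ[match.group(1)], value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, environ) for item in value]
    return value  # numbers, booleans, None pass through unchanged
```

Typical use would be expand_placeholders(json.load(config_file), os.environ), after which the "env" mapping can be handed to the process that launches the stdio server.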

Available MCP Tools (38)

The groups below itemize 25 of the 38 tools; domain tools such as mortgage, healthcare, and Hanok reservations (see mcps/local_servers) are not itemized here.

Todo Management (5)
  • create_todo
  • get_todos
  • complete_todo
  • update_todo
  • delete_todo
Team Tools (8)
  • create_team
  • get_teams
  • get_team_members
  • create_team_todo
  • add_team_member
  • remove_team_member
  • change_member_role
  • search_users
Reminders (4)
  • create_reminder
  • get_reminders
  • update_reminder
  • delete_reminder
Calendar (6)
  • create_calendar_event
  • get_calendar_events
  • update_calendar_event
  • delete_calendar_event
  • sync_google_calendar_events
  • test_google_calendar
Call Transfer (2)
  • transfer_to_agent
  • get_available_departments

Enhanced LangGraph Tool Calls

The LangGraph implementation provides intelligent tool calling capabilities with dynamic tool selection and error handling. The agent automatically chooses appropriate tools based on user intent and maintains conversation context for seamless interactions.

Tool Calls Flow Diagram

Tool Calls Flow: LangGraph implementation showing dynamic tool selection and intelligent orchestration of MCP tools.

Tool Calls Features

MCP Integration
  • Database operations via MCP servers
  • Google Calendar synchronization
  • Team collaboration tools
  • Real-time tool discovery (38 tools)
  • Secure tool communication via stdio
  • Lazy loading for performance
Error Handling
  • Graceful tool failure recovery
  • ExceptionGroup unwrapping
  • 8s timeout per tool (reduced from 20s)
  • 10s overall agent timeout (reduced from 25s); 12s webhook timeout
  • Fallback strategies
  • User-friendly error messages

Tool Features: Intelligent tool calling system with error recovery, timeout management, and seamless MCP integration.

Core Tool Calling Capabilities

  • Dynamic Tool Selection: LLM intelligently chooses appropriate tools based on user intent
  • Error Recovery: Graceful handling of tool failures with fallback strategies
  • Context Awareness: Tools access conversation history and maintain state
  • Streaming Responses: Real-time tool execution updates for better user experience
  • Async Execution: Non-blocking tool calls with proper timeout management
  • ExceptionGroup Handling: Unwraps and logs complex async exceptions

JWT Authentication System

Authentication Flow

1. User Registration
POST /api/auth/register → Bcrypt hash → JWT tokens
2. User Login
POST /api/auth/login → Verify password → Generate tokens
3. API Request
Bearer token → JWT validation → @require_auth
4. Token Refresh
POST /api/auth/refresh → New access token

Security Features

  • Password Hashing: Bcrypt with automatic salt
  • JWT Tokens: HS256 algorithm with secret key
  • Token Expiry: 30 min access, 7 day refresh
  • Authorization: @require_auth decorator
  • Role Validation: @require_role decorator
  • Team Membership: @require_team_member decorator
  • Auto Logout: Frontend handles expired tokens

JWT Token Structure

{
  "user_id": "uuid",
  "email": "user@example.com",
  "roles": ["user"],
  "team_id": "uuid",
  "type": "access",
  "exp": 1728589200,
  "iat": 1728587400
}

Here iat is the issued-at timestamp and exp is 30 minutes (1800 seconds) after it.
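
To make the HS256 signing and expiry check concrete, here is a standard-library sketch that issues and verifies a token with the claims shown above. It illustrates the mechanism only and is not the project's auth code, which may well use a JWT library; the function names are hypothetical, and production code should prefer a maintained library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def encode_token(claims: Dict[str, Any], secret: str) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_token(token: str, secret: str, now: Optional[float] = None) -> Dict[str, Any]:
    """Verify algorithm, signature, and expiry; raise ValueError on any failure."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


def issue_access_token(user_id: str, email: str, secret: str,
                       now: Optional[float] = None, ttl_seconds: int = 30 * 60) -> str:
    """Issue an access token that expires 30 minutes after it is issued."""
    issued = int(time.time() if now is None else now)
    claims = {"user_id": user_id, "email": email, "roles": ["user"],
              "type": "access", "iat": issued, "exp": issued + ttl_seconds}
    return encode_token(claims, secret)
```

A decorator such as @require_auth would call the equivalent of decode_token on the Bearer token and reject the request when it raises.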

Redis (GCP)

Redis is used by voice-gateway-service and agent-llm-service (e.g. Redis Cloud). Configure via REDIS_HOST, REDIS_PORT, REDIS_PASSWORD.

  • Agent Monitor: agent-llm writes interactions (tool_calls, voice_timing, metadata) for the /agent-monitor dashboard.
  • Transfer context: On transfer to human agent, voice-gateway caches conversation_history and customer profile (e.g. callcenter:customer:{extension}:{call_sid}) for call-center UI.
  • Session/audio: Optional session and audio buffer storage for voice pipeline.
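
The transfer-context cache could be written as below. The key pattern callcenter:customer:{extension}:{call_sid} and the conversation_history field come from this document; the "customer" field name, the one-hour expiry, and the function names are assumptions. The client is any object with a redis-py style set(name, value, ex=seconds) method.

```python
import json
from typing import Any, Dict, List


def transfer_context_key(extension: str, call_sid: str) -> str:
    """Key under which the call-center UI looks up the transferred caller."""
    return f"callcenter:customer:{extension}:{call_sid}"


def cache_transfer_context(
    client: Any,
    extension: str,
    call_sid: str,
    conversation_history: List[Dict[str, str]],
    customer: Dict[str, Any],
    ttl_seconds: int = 3600,  # hypothetical expiry so stale calls age out
) -> str:
    """Store the context as JSON with an expiry and return the key used."""
    key = transfer_context_key(extension, call_sid)
    payload = json.dumps(
        {"conversation_history": conversation_history, "customer": customer}
    )
    client.set(key, payload, ex=ttl_seconds)
    return key
```

Keying by both extension and call SID lets the agent dashboard fetch the context for exactly the call that is ringing on that extension.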

API Endpoints (v2.convonetai.com)

Paths are served via path-based routing on v2.convonetai.com, with optional custom-domain mapping for hanok-table-service (hanok.convonetai.com).

call-center-service

  • GET / Landing
  • GET /voice_assistant, /call-center, /agent-monitor, /tool-execution, /mortgage_dashboard
  • GET /convonet_tech_spec, /convonet_system_architecture, /convonet_sequence_diagram
  • GET/POST /call-center/api/* (agent status, login, call actions, customer data from Redis)
  • GET /agent-monitor/api/stats, /agent-monitor/api/interactions

voice-gateway-service

  • WebSocket /webrtc/ws (PIN auth, STT→agent-llm→TTS, greeting, processing status, barge-in)
  • POST /twilio/process_audio, /twilio/transfer_callback

agent-llm-service

  • POST /agent/process (LangGraph, multi-LLM, MCP tools, metadata.voice_timing, transfer_context)
  • GET /convonet_todo/api/mortgage/applications, /convonet_todo/api/mortgage/applications/{id}

crm-integration-service

  • /patient/*, /meeting/create, etc. (SuiteCRM)

hanok-table-service

  • GET/POST /health health check
  • /api/reservations/* reservation create/read/update/cancel
  • /webhooks/hanok_table/* webhook variables/call-control
  • /reservation/status, /reserve-online UI endpoints

Call Center Agent Dashboard

A complete browser-based SIP phone client with ACD (Automatic Call Distribution) capabilities, providing enterprise-grade call center management features for handling voice assistant transfers and customer support calls.

Agent Management

  • ✓ Secure agent authentication
  • ✓ SIP credential management
  • ✓ Session management
  • ✓ Agent state tracking (Ready/Not Ready/On Call/Wrap Up)
  • ✓ Time-in-state tracking
  • ✓ Activity logging

Call Handling

  • ✓ Incoming call notifications
  • ✓ Caller ID display
  • ✓ Answer/Reject controls
  • ✓ Call hold/unhold
  • ✓ Call transfer (blind & attended)
  • ✓ Outbound dialing
  • ✓ Call duration tracking

Customer Data Popup

  • ✓ Automatic customer info display
  • ✓ Customer ID & contact info
  • ✓ Account status & tier
  • ✓ Last contact date
  • ✓ Open tickets/cases
  • ✓ Lifetime value
  • ✓ Agent notes

SIP Integration

  • ✓ Browser-based SIP client (JsSIP)
  • ✓ WebRTC audio support
  • ✓ WebSocket Secure (WSS)
  • ✓ RFC 3261 compliant
  • ✓ Multiple codec support (G.711, Opus, G.722)
  • ✓ NAT traversal (STUN/TURN)

Dashboard Interface

  • ✓ Agent status panel
  • ✓ Call control panel
  • ✓ 12-key dialpad
  • ✓ Call history display
  • ✓ Real-time status updates
  • ✓ Responsive design (desktop/tablet/mobile)

Monitoring & Reporting

  • ✓ Agent metrics (calls handled, duration)
  • ✓ Call metrics (answer rate, wait time)
  • ✓ Real-time monitoring
  • ✓ Activity timeline
  • ✓ Availability percentage

Agent States

Logged Out: Agent not active
Logged In: Not ready for calls
Ready: Available for calls
Not Ready: On break/unavailable
On Call: Currently on a call

Voice Assistant Transfer Integration

When a user requests to speak with a human agent during a WebRTC voice assistant session, the system automatically transfers the call to the Call Center Agent Dashboard:

Transfer Request: LangGraph detects transfer intent and triggers Twilio API call
Twilio Bridge: Twilio bridges the WebRTC call to FusionPBX SIP server
FusionPBX Routing: Call routed to agent extension (e.g., 2001)
Agent Dashboard: Agent dashboard receives incoming call with customer info popup
Call Answer: Agent answers and continues conversation with customer

API Endpoints

Agent Management
  • POST /call-center/api/agent/login
  • POST /call-center/api/agent/logout
  • POST /call-center/api/agent/ready
  • POST /call-center/api/agent/not-ready
  • GET /call-center/api/agent/status
Call Handling
  • POST /call-center/api/call/ringing
  • POST /call-center/api/call/answer
  • POST /call-center/api/call/drop
  • POST /call-center/api/call/hold
  • POST /call-center/api/call/transfer
Customer Data
  • GET /call-center/api/customer/{id}

Access: /call-center/

Browser-based SIP client requiring no installation. Compatible with Chrome, Firefox, Edge, and Safari. Integrates with FusionPBX for call routing and transfer from WebRTC voice assistant sessions.

Twilio & Voice (GCP)

Browser voice: Use /voice_assistant → WebSocket /webrtc/ws (voice-gateway-service). PIN auth (PostgreSQL or env), Deepgram STT/TTS, greeting after login, “please wait” + elapsed timer during processing, barge-in (Start stops playback and starts recording).

Phone (Twilio): Twilio webhooks target voice-gateway-service at POST /twilio/process_audio. STT/TTS are Deepgram (not Twilio/Polly). On transfer intent, agent-llm returns transfer_marker; voice-gateway returns TwiML <Dial><Sip> to FusionPBX (e.g. extension 2001). Config: VOICE_GATEWAY_PUBLIC_URL, FUSIONPBX_SIP_DOMAIN, TRANSFER_TIMEOUT. Callback /twilio/transfer_callback handles dial status.

Transfer context (conversation_history, user_id) is cached in Redis for the call-center UI. See docs/CLOUD_RUN_ENV.md for Twilio/FusionPBX env vars.
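
The TwiML returned on transfer intent can be sketched with the standard library XML module. The <Dial><Sip> structure, extension 2001, and the /twilio/transfer_callback path come from this document; the function name and the 30-second default are assumptions, and in production the values would come from FUSIONPBX_SIP_DOMAIN, TRANSFER_TIMEOUT, and VOICE_GATEWAY_PUBLIC_URL.

```python
from xml.etree import ElementTree as ET


def build_transfer_twiml(
    extension: str,
    sip_domain: str,
    callback_url: str,
    timeout_seconds: int = 30,  # placeholder; TRANSFER_TIMEOUT in production
) -> str:
    """Return TwiML that dials a FusionPBX extension over SIP.

    The Dial "action" URL receives the dial status once the transfer
    attempt finishes (answered, busy, no answer, and so on).
    """
    response = ET.Element("Response")
    dial = ET.SubElement(
        response,
        "Dial",
        {"timeout": str(timeout_seconds), "action": callback_url},
    )
    sip = ET.SubElement(dial, "Sip")
    sip.text = f"sip:{extension}@{sip_domain}"
    # ElementTree escapes special characters in attributes and text.
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body
```

For example, build_transfer_twiml("2001", "pbx.example.com", "https://voice.example.com/twilio/transfer_callback") yields a response the /twilio/process_audio handler could return with an XML content type; both hostnames here are placeholders.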

Validation

Validate the GCP deployment via /voice_assistant (WebSocket voice, PIN, greeting, barge-in), /agent-monitor (tool calls and voice timing), /call-center (SIP dashboard and customer popup from Redis), and /mortgage_dashboard. For phone flows, configure Twilio webhooks to VOICE_GATEWAY_PUBLIC_URL/twilio/process_audio and test transfer to FusionPBX.