Convonet System Architecture Diagram
FastAPI · Google Cloud Run · WebRTC/WebSocket · Domain Agents · Agent Monitor
GCP Cloud Run Microservices (v2.convonetai.com)
Client (Browser) · v2.convonetai.com
/voice_assistant · /call-center · /mortgage_dashboard · /agent-monitor · /tool-execution
Path-based routing (load balancer / API gateway)
↓
call-center-service
FastAPI
/, /call-center, /voice_assistant, /mortgage_dashboard, /agent-monitor, /tool-execution, static & doc pages
voice-gateway-service
FastAPI
/webrtc/ws, /twilio/*
- WebSocket + Twilio webhooks
- STT → Agent → TTS pipeline
agent-llm-service
FastAPI
/agent/process, /convonet_todo/*
- LangGraph, multi-LLM, MCP tools
crm-integration-service
FastAPI
/patient/*, /meeting/create, /case/create
- SuiteCRM · patient / case / note
hanok-table-service
FastAPI
/health, /api/reservations/*, /webhooks/hanok_table/*
- Restaurant reservations · Hanok MCP-backed API
Redis
Session, buffer, pub/sub
Deepgram
STT / TTS
PostgreSQL (Render)
users_anthropic, todos, mortgage
Twilio · FusionPBX
Transfer to agent
Request & data flow (v2)
How traffic moves across Cloud Run services for voice, dashboards, LLM tools, and CRM—replacing the old single-service diagram.
Color-Coded Components
User / Client
Browser (Voice Assistant, Call Center, Dashboards)
FastAPI Services
Cloud Run: voice-gateway, agent-llm, call-center, crm-integration, hanok-table
AI/ML
LangGraph, Claude/Gemini/OpenAI, Deepgram STT/TTS
Tools/APIs
MCP tools, PostgreSQL, Google APIs, SuiteCRM
Transfer
Twilio, FusionPBX (SIP)
Agent Dashboard
JsSIP client, Call Center UI
Agent Monitor
Tool calls, voice timing (call-center UI)
Storage
Redis, PostgreSQL
System Components
User Browser (Voice Assistant UI)
Browser-based voice interface at /voice_assistant (or /voice-assistant). Uses FastAPI WebSocket to voice-gateway-service at /webrtc/ws. No LiveKit on GCP—audio is captured via getUserMedia + MediaRecorder, sent as base64 chunks, and TTS is played back from server-sent audio.
- WebSocket to voice-gateway (wss://v2.convonetai.com/webrtc/ws)
- PIN authentication (validated by voice-gateway against PostgreSQL or env)
- Domain agents (Productivity, Mortgage, Healthcare, Hanok/Restaurant) selected by agent-llm-service via intent
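The base64 chunk upload described in this section can be pictured with a short Python sketch. The message type names (start, audio_chunk, stop_recording) are the ones listed for the voice-gateway WebSocket. The JSON field names (`type`, `seq`, `data`) and all helper names are assumptions made for illustration, not the confirmed wire format. The real client runs in the browser, so Python is used here only to show the framing.

```python
import base64
import json


def encode_audio_chunk(raw: bytes, seq: int) -> str:
    """Wrap one recorded audio chunk as a JSON text frame (field names are assumed)."""
    return json.dumps({
        "type": "audio_chunk",
        "seq": seq,
        "data": base64.b64encode(raw).decode("ascii"),
    })


def decode_audio_chunk(frame: str) -> bytes:
    """Server-side inverse: parse the frame and recover the raw audio bytes."""
    message = json.loads(frame)
    if message.get("type") != "audio_chunk":
        raise ValueError(f"unexpected message type: {message.get('type')!r}")
    return base64.b64decode(message["data"], validate=True)


def frames_for_recording(chunks: list) -> list:
    """Build the frame sequence for one utterance: start, chunks, stop_recording."""
    frames = [json.dumps({"type": "start"})]
    frames += [encode_audio_chunk(chunk, seq) for seq, chunk in enumerate(chunks)]
    frames.append(json.dumps({"type": "stop_recording"}))
    return frames
```

Base64 keeps binary audio inside JSON text frames at the cost of roughly one third more bytes on the wire.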
Voice Gateway Service (FastAPI)
Cloud Run service: convonet/voice_gateway_service.py. Handles WebSocket /webrtc/ws and Twilio webhooks (/twilio/*).
- WebSocket: Session state, PIN auth (PostgreSQL users_anthropic or VOICE_PIN env), start/audio_chunk/stop_recording
- Pipeline: Batch STT (Deepgram) → HTTP POST to agent-llm-service /agent/process with metadata (source: voice, voice_timing, stt_provider) for Agent Monitor → TTS (Deepgram) → send transcript_final, agent_final, audio_chunk to client
- Twilio: call, verify_pin, process_audio; transfer bridge to FusionPBX
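A minimal sketch of the pipeline bullet, assuming the three stages can be treated as plain callables. The event names and the metadata keys (source, voice_timing, stt_provider) come from the description above. The function name `run_voice_turn`, the timing key names (`stt_ms`, `agent_ms`, `tts_ms`), and the event field names are hypothetical. The real service calls Deepgram and agent-llm-service over the network; the callables here stand in for those calls.

```python
import base64
import time
from typing import Callable


def run_voice_turn(
    audio: bytes,
    stt: Callable[[bytes], str],
    agent: Callable[[dict], str],
    tts: Callable[[str], bytes],
    user_id: str,
    session_id: str,
    stt_provider: str = "deepgram",
) -> list:
    """Run one STT -> agent -> TTS turn and return the events for the client."""
    timing = {}

    def timed(label, func, arg):
        # Record the wall-clock duration of one stage in milliseconds.
        started = time.perf_counter()
        result = func(arg)
        timing[label] = round((time.perf_counter() - started) * 1000, 1)
        return result

    transcript = timed("stt_ms", stt, audio)

    # Body for POST /agent/process. Only the STT timing is known when the
    # request is sent; the metadata block is what Agent Monitor would read.
    payload = {
        "prompt": transcript,
        "user_id": user_id,
        "session_id": session_id,
        "metadata": {
            "source": "voice",
            "voice_timing": timing,
            "stt_provider": stt_provider,
        },
    }
    reply = timed("agent_ms", agent, payload)
    speech = timed("tts_ms", tts, reply)

    return [
        {"type": "transcript_final", "text": transcript},
        {"type": "agent_final", "text": reply},
        {"type": "audio_chunk", "data": base64.b64encode(speech).decode("ascii")},
    ]
```

Passing the stages in as callables keeps the ordering logic testable without network access, for example with `lambda` stubs in place of the Deepgram and HTTP clients.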
Agent LLM Service (FastAPI)
Cloud Run service: convonet/agent_llm_service.py. HTTP API for agent/LLM processing.
- POST /agent/process: prompt, user_id, session_id → LangGraph, Multi-LLM (Claude/Gemini/OpenAI), MCP tools
- Intent-based routing: Mortgage, Healthcare, Todo (productivity) agents
- /convonet_todo/api/llm-providers, STT/TTS provider APIs (Redis-backed)
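To make the request shape and the intent-based routing concrete, here is a small sketch. The fields prompt, user_id, and session_id are taken from the endpoint description. The keyword lists, the `AgentRequest` and `select_agent` names, and the keyword-matching approach itself are assumptions: the source does not say how intent is detected, and the real service may well use the LLM for this step.

```python
from dataclasses import dataclass, field

# Hypothetical keyword lists, for illustration only.
INTENT_KEYWORDS = {
    "mortgage": ("mortgage", "loan", "pre-approval"),
    "healthcare": ("patient", "appointment", "doctor"),
}
DEFAULT_AGENT = "todo"  # productivity agent as the fallback (assumed)


@dataclass
class AgentRequest:
    """Body of POST /agent/process as described above, plus optional metadata."""
    prompt: str
    user_id: str
    session_id: str
    metadata: dict = field(default_factory=dict)


def select_agent(request: AgentRequest) -> str:
    """Pick a domain agent by simple keyword matching on the prompt."""
    text = request.prompt.lower()
    for agent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return DEFAULT_AGENT
```

Under these assumptions, `select_agent(AgentRequest("What is my mortgage status?", "u1", "s1"))` returns `"mortgage"`, and a prompt with no matching keyword falls through to the Todo agent.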
Call Center Service (FastAPI)
Cloud Run service: convonet/call_center_service.py. Serves all UI and doc pages under v2.convonetai.com.
- Landing (/), Call Center (/call-center), Voice Assistant UI (/voice_assistant, /voice-assistant), Mortgage Dashboard (/mortgage_dashboard)
- Agent Monitor (/agent-monitor), Tool Execution (/tool-execution), doc routes (Tech Spec, System Architecture, Sequence Diagram)
- Static assets and stub APIs for dashboards
CRM Integration Service (FastAPI)
Cloud Run service for SuiteCRM: patient search/create, meeting/case/note creation. Routes: /patient/*, /meeting/create, /case/create, /note/create.
Hanok Table Service (FastAPI)
Cloud Run service: hanok_table/app.py. Restaurant reservation API + webhooks for Hanok tools and UI.
- Reservation API: /api/reservations/*, /reservation/status
- Voice/webhooks: /webhooks/hanok_table/*
- Health + MCP mount support: /health, optional mounted MCP HTTP app
Redis (Session & Buffer)
Used by agent-llm-service (and optionally voice-gateway) for session state, provider config, and buffering. Redis Cloud or GCP Redis.
Speech (Deepgram)
Voice-gateway uses Deepgram for batch STT and TTS. API keys via environment variables on voice-gateway-service.
AI Orchestration (Agent LLM Service)
LangGraph in convonet/routes.py and assistant_graph_todo.py: state machine, tool conditions, intent detection. Multi-LLM and MCP tools (todos, mortgage, healthcare, hanok reservations, transfer).
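The state machine with a tool condition can be illustrated in plain Python. This is not the LangGraph API and not the code in convonet/routes.py; it is a stand-in that shows the loop such a graph typically encodes: the assistant node runs, a condition checks whether it requested tools, the tools run, and control returns to the assistant. The message layout and all names (`route_after_assistant`, `run_graph`) are hypothetical.

```python
def route_after_assistant(state: dict) -> str:
    """Go to the tools node when the last assistant message requests a tool call."""
    last = state["messages"][-1]
    return "tools" if last.get("tool_calls") else "end"


def run_graph(state: dict, assistant, tools: dict, max_steps: int = 5) -> dict:
    """Alternate assistant and tool steps until the assistant stops calling tools."""
    for _ in range(max_steps):
        state["messages"].append(assistant(state))
        if route_after_assistant(state) == "end":
            return state
        # Execute every requested tool and feed the results back as tool messages.
        for call in state["messages"][-1]["tool_calls"]:
            result = tools[call["name"]](**call["args"])
            state["messages"].append(
                {"role": "tool", "name": call["name"], "content": result}
            )
    raise RuntimeError("tool loop did not finish within max_steps")
```

The `max_steps` bound is a design choice in this sketch: it prevents a model that keeps requesting tools from looping forever.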
Agent Monitor
UI at /agent-monitor (call-center-service). Reads from Redis: agent-llm-service writes interactions (tool_calls, metadata.voice_timing) via AgentMonitor.track_interaction(); call-center-service exposes /agent-monitor/api/stats and /agent-monitor/api/interactions.
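The write and read sides of this flow might look like the sketch below. Only the method name track_interaction and the recorded fields (tool_calls, metadata.voice_timing) come from the description. The signature, the Redis key name, the capped-list storage layout, and the `stats` shape are assumptions. `InMemoryStore` is a tiny stand-in that mimics the Redis list commands LPUSH, LTRIM, and LRANGE, so the example runs without a Redis server.

```python
import json
import time


class InMemoryStore:
    """Stand-in for the Redis list commands used below (LPUSH, LTRIM, LRANGE)."""

    def __init__(self):
        self.lists = {}

    def lpush(self, key, value):
        self.lists.setdefault(key, []).insert(0, value)

    def ltrim(self, key, start, stop):
        self.lists[key] = self.lists.get(key, [])[start:stop + 1]

    def lrange(self, key, start, stop):
        end = None if stop == -1 else stop + 1
        return self.lists.get(key, [])[start:end]


class AgentMonitorSketch:
    KEY = "agent_monitor:interactions"  # hypothetical key name

    def __init__(self, store, max_items: int = 100):
        self.store = store
        self.max_items = max_items

    def track_interaction(self, session_id: str, tool_calls: list, metadata: dict):
        """Writer side (agent-llm-service): push newest first, keep a capped list."""
        record = {
            "session_id": session_id,
            "tool_calls": tool_calls,
            "metadata": metadata,
            "timestamp": time.time(),
        }
        self.store.lpush(self.KEY, json.dumps(record))
        self.store.ltrim(self.KEY, 0, self.max_items - 1)

    def interactions(self) -> list:
        """Reader side (call-center-service): newest interactions first."""
        return [json.loads(item) for item in self.store.lrange(self.KEY, 0, -1)]

    def stats(self) -> dict:
        """Aggregate counts such as an /agent-monitor/api/stats endpoint might return."""
        records = self.interactions()
        counts = {}
        for record in records:
            for name in record["tool_calls"]:
                counts[name] = counts.get(name, 0) + 1
        return {"interactions": len(records), "tool_calls": counts}
```

Because the two services share only the Redis key, neither needs to call the other directly.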
Tools & External APIs
MCP tools (e.g. db_todo.py, db_mortgage.py) run inside agent-llm-service. PostgreSQL (Render): todos, users_anthropic, mortgage data. Google Calendar/OAuth, FusionPBX metadata where applicable.
Transfer System
Twilio API (voice-gateway) bridges to FusionPBX (SIP). Agent dashboard (JsSIP) registers with FusionPBX; receives call at extension (e.g. 2001).
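The source does not say how the bridge is implemented. One mechanism Twilio offers for reaching a SIP endpoint is a TwiML response containing a Dial verb with a nested Sip noun, and the sketch below builds such a document with the standard library. The function name and the SIP domain `pbx.example.com` are placeholders, not the actual FusionPBX address; extension 2001 is the example from the text.

```python
import xml.etree.ElementTree as ET


def build_transfer_twiml(extension: str, sip_domain: str) -> str:
    """Build TwiML that dials a SIP URI (one possible way to bridge to a PBX)."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    sip = ET.SubElement(dial, "Sip")
    sip.text = f"sip:{extension}@{sip_domain}"
    return ET.tostring(response, encoding="unicode")


# Placeholder domain; the real FusionPBX host is not given in the source.
twiml = build_transfer_twiml("2001", "pbx.example.com")
# twiml == "<Response><Dial><Sip>sip:2001@pbx.example.com</Sip></Dial></Response>"
```

In this picture the JsSIP client registered at that extension receives the call as an ordinary incoming SIP INVITE from the PBX.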
Agent Dashboard
Call Center UI at /call-center. JsSIP client for SIP registration and incoming calls from transfer.
Data Flow Summary
Voice Flow (WebSocket, no LiveKit)
Browser → WebSocket voice-gateway (/webrtc/ws) → PIN auth (PostgreSQL or env) →
STT (Deepgram) → HTTP agent-llm-service /agent/process → LangGraph / Multi-LLM / MCP →
TTS (Deepgram) → voice-gateway → Browser (transcript_final, agent_final, audio_chunk)
Transfer Flow
User request → agent-llm (transfer intent) → voice-gateway (Twilio) → FusionPBX (SIP) → Agent Dashboard (JsSIP) → User info / live conversation
Single Domain
All traffic at https://v2.convonetai.com; path-based routing to call-center, voice-gateway, agent-llm, crm-integration, and hanok-table. No CORS needed for same-origin UI.
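As a rough illustration of path-based routing, the table below maps the path prefixes listed in this document to their services, with call-center-service as the default for UI and doc pages. The function and table names are hypothetical, and the real load balancer or API gateway configuration is not shown in the source. /health is left out because the document lists it only for hanok-table-service and does not say how it is exposed on the shared domain.

```python
# Prefix table assembled from the routes listed in this document.
ROUTES = [
    ("/webrtc/ws", "voice-gateway-service"),
    ("/twilio", "voice-gateway-service"),
    ("/agent/process", "agent-llm-service"),
    ("/convonet_todo", "agent-llm-service"),
    ("/patient", "crm-integration-service"),
    ("/meeting/create", "crm-integration-service"),
    ("/case/create", "crm-integration-service"),
    ("/note/create", "crm-integration-service"),
    ("/api/reservations", "hanok-table-service"),
    ("/reservation/status", "hanok-table-service"),
    ("/webhooks/hanok_table", "hanok-table-service"),
]
DEFAULT_SERVICE = "call-center-service"  # UI, dashboards, static and doc pages


def resolve_service(path: str) -> str:
    """Return the backend for a request path, matching on whole path segments."""
    for prefix, service in ROUTES:
        if path == prefix or path.startswith(prefix + "/"):
            return service
    return DEFAULT_SERVICE
```

Matching on whole segments matters here: `/agent-monitor` must fall through to call-center-service rather than being captured by an `/agent` prefix.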
Architecture Overview
Convonet runs as five FastAPI microservices on Google Cloud Run, exposed under a single domain (v2.convonetai.com). The voice-gateway-service handles WebSocket voice at /webrtc/ws (no LiveKit on GCP): the browser sends audio over the WebSocket; the gateway runs batch STT (Deepgram), calls agent-llm-service over HTTP for LLM and MCP tool processing, runs TTS (Deepgram), and streams the results back. PINs are validated against PostgreSQL users_anthropic or a fallback PIN from the environment.
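The PIN check with its environment fallback could be sketched as follows. The users_anthropic name and the VOICE_PIN variable come from this document. The column names (`id`, `voice_pin`), the function name, and the return convention are assumptions, and sqlite3 stands in for PostgreSQL so the example is self-contained. A production service would more likely store a hash than a plain PIN.

```python
import hmac
import os
import sqlite3


def verify_pin(pin: str, conn=None, env=os.environ):
    """Return the matching user id, "env" for the fallback PIN, or None."""
    if conn is not None:
        # Assumed schema: users_anthropic(id, voice_pin).
        row = conn.execute(
            "SELECT id FROM users_anthropic WHERE voice_pin = ?", (pin,)
        ).fetchone()
        if row is not None:
            return str(row[0])
    fallback = env.get("VOICE_PIN")
    # Constant-time comparison for the environment fallback PIN.
    if fallback and hmac.compare_digest(pin.encode("utf-8"), fallback.encode("utf-8")):
        return "env"
    return None
```

Checking the database first and the environment second means a per-user PIN always identifies a specific user, while the fallback only grants generic access.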
The agent-llm-service hosts LangGraph, multi-LLM (Claude, Gemini, OpenAI), and domain agents (Productivity/Todo, Mortgage, Healthcare, Hanok/Restaurant). The call-center-service serves the landing page, voice assistant UI, call center, mortgage dashboard, agent monitor, tool execution dashboard, and doc pages. The crm-integration-service handles SuiteCRM operations, and hanok-table-service handles reservation APIs/webhooks used by Hanok MCP tools. Twilio and FusionPBX provide call transfer to the agent dashboard (JsSIP). Redis and PostgreSQL (e.g. Render) are used for session, provider config, and application data.