Convonet System Architecture Diagram

FastAPI · Google Cloud Run · WebRTC/WebSocket · Domain Agents · Agent Monitor

GCP Cloud Run Microservices (v2.convonetai.com)

Client (Browser) · v2.convonetai.com

/voice_assistant · /call-center · /mortgage_dashboard · /agent-monitor · /tool-execution

Path-based routing (load balancer / API gateway)

call-center-service

FastAPI

  • /, /call-center, /voice_assistant
  • /mortgage_dashboard, /agent-monitor
  • /tool-execution, static & doc pages

voice-gateway-service

FastAPI

  • /webrtc/ws, /twilio/*
  • WebSocket + Twilio webhooks
  • STT → Agent → TTS pipeline

agent-llm-service

FastAPI

  • /agent/process
  • /convonet_todo/*
  • LangGraph, multi-LLM, MCP tools

crm-integration-service

FastAPI

  • /patient/*
  • /meeting/create, /case/create, /note/create
  • SuiteCRM · patient / case / note

hanok-table-service

FastAPI

  • /health, /api/reservations/*
  • /webhooks/hanok_table/*
  • Restaurant reservations · Hanok MCP-backed API

voice-gateway → POST /agent/process → agent-llm

Redis

Session, buffer, pub/sub

Deepgram

STT / TTS

PostgreSQL (Render)

users_anthropic, todos, mortgage

Twilio · FusionPBX

Transfer to agent

Request & data flow (v2)

How traffic moves across Cloud Run services for voice, dashboards, LLM tools, and CRM. This diagram replaces the old single-service diagram.

Figure: Convonet v2 data flow (browser to voice-gateway, agent-llm, Hanok, and backends)

  • Browser · v2.convonetai.com: Voice UI · Call Center · Dashboards · Docs (path routing)
  • voice-gateway-service: WebSocket /webrtc/ws · Twilio · PIN · batch STT/TTS · POST → agent-llm /agent/process
  • call-center-service: HTML · static · Jinja pages · Agent Monitor API → Redis · Browser fetch → agent-llm APIs
  • Deepgram: STT/TTS
  • Twilio: transfer
  • agent-llm-service: LangGraph · Claude / Gemini / OpenAI · Intent routing · Todo / Mortgage / Healthcare / Hanok · MCP subprocesses (stdio)
  • crm-integration: SuiteCRM REST · patient · meeting · case (HTTP)
  • hanok-table-service: reservations · webhooks · /health (HTTP)
  • Redis: sessions · Agent Monitor
  • PostgreSQL: users · todos · mortgage
  • External APIs: MCP tools · calendars · search
  • SuiteCRM (external): via crm-integration only

Legend: Violet arrows: HTTP to agent-llm · Green: optional MCP→crm-integration path (or direct SuiteCRM from MCP)

Voice path: bidirectional WebSocket; one turn = STT → LLM → TTS streamed back to the browser

Color-Coded Components

User / Client

Browser (Voice Assistant, Call Center, Dashboards)

FastAPI Services

Cloud Run: voice-gateway, agent-llm, call-center, crm-integration, hanok-table

AI/ML

LangGraph, Claude/Gemini/OpenAI, Deepgram STT/TTS

Tools/APIs

MCP tools, PostgreSQL, Google APIs, SuiteCRM

Transfer

Twilio, FusionPBX (SIP)

Agent Dashboard

JsSIP client, Call Center UI

Agent Monitor

Tool calls, voice timing (call-center UI)

Storage

Redis, PostgreSQL

System Components

User Browser (Voice Assistant UI)

Browser-based voice interface at /voice_assistant (or /voice-assistant). It connects over a WebSocket to voice-gateway-service at /webrtc/ws. LiveKit is not used on GCP: audio is captured via getUserMedia + MediaRecorder and sent as base64 chunks, and the TTS audio sent back by the server is played in the browser.

  • WebSocket to voice-gateway (wss://v2.convonetai.com/webrtc/ws)
  • PIN authentication (validated by voice-gateway against PostgreSQL or env)
  • Domain agents (Productivity, Mortgage, Healthcare, Hanok/Restaurant) selected by agent-llm-service via intent
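
The chunked, base64-encoded audio transport described above can be sketched as follows. This is a minimal illustration, not the actual client or gateway code: the JSON field names (type, session_id, audio), the exact spelling of the start/stop message types, and the 4096-byte chunk size are assumptions. Only the base64 encoding and the audio_chunk message name come from this page.

```python
import base64
import json

CHUNK_BYTES = 4096  # assumed chunk size; the page does not specify one


def audio_chunk_messages(audio: bytes, session_id: str, chunk_bytes: int = CHUNK_BYTES):
    """Yield JSON text frames: one start frame, one audio_chunk per slice, one stop frame."""
    yield json.dumps({"type": "start_recording", "session_id": session_id})
    for offset in range(0, len(audio), chunk_bytes):
        piece = audio[offset:offset + chunk_bytes]
        yield json.dumps({
            "type": "audio_chunk",
            "session_id": session_id,
            # Binary audio is base64-encoded so it can travel in a JSON text frame.
            "audio": base64.b64encode(piece).decode("ascii"),
        })
    yield json.dumps({"type": "stop_recording", "session_id": session_id})


def decode_audio(frames) -> bytes:
    """Gateway-side inverse: reassemble the raw audio from the audio_chunk frames."""
    buffer = bytearray()
    for frame in frames:
        message = json.loads(frame)
        if message["type"] == "audio_chunk":
            buffer.extend(base64.b64decode(message["audio"]))
    return bytes(buffer)
```

Buffering every chunk until the stop frame arrives matches the batch STT model described on this page: the gateway transcribes the whole utterance at once rather than streaming partial results.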

Voice Gateway Service (FastAPI)

Cloud Run service: convonet/voice_gateway_service.py. Handles WebSocket /webrtc/ws and Twilio webhooks (/twilio/*).

  • WebSocket: Session state, PIN auth (PostgreSQL users_anthropic or VOICE_PIN env), start/audio_chunk/stop_recording
  • Pipeline: Batch STT (Deepgram) → HTTP POST to agent-llm-service /agent/process with metadata (source: voice, voice_timing, stt_provider) for Agent Monitor → TTS (Deepgram) → send transcript_final, agent_final, audio_chunk to client
  • Twilio: call, verify_pin, process_audio; transfer bridge to FusionPBX
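
The PIN check with its environment fallback might look like the sketch below. The lookup_user callable is a hypothetical stand-in for a query against users_anthropic, and the "env-user" identity returned for the fallback PIN is likewise an assumption. Only the lookup order (database first, then VOICE_PIN) is taken from the description above.

```python
import hmac
import os
from typing import Callable, Mapping, Optional


def validate_pin(pin: str,
                 lookup_user: Callable[[str], Optional[str]],
                 env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a user id for a valid PIN, or None if the PIN is rejected.

    lookup_user is a hypothetical helper that returns the user id stored
    for a PIN in users_anthropic, or None when no row matches.
    """
    env = os.environ if env is None else env
    user_id = lookup_user(pin)
    if user_id is not None:
        return user_id
    fallback = env.get("VOICE_PIN")
    # compare_digest avoids leaking how many leading characters matched.
    if fallback and hmac.compare_digest(pin.encode(), fallback.encode()):
        return "env-user"  # hypothetical placeholder identity for the env PIN
    return None
```

Passing the environment as a parameter keeps the function testable without mutating os.environ.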

Agent LLM Service (FastAPI)

Cloud Run service: convonet/agent_llm_service.py. HTTP API for agent/LLM processing.

  • POST /agent/process: prompt, user_id, session_id → LangGraph, Multi-LLM (Claude/Gemini/OpenAI), MCP tools
  • Intent-based routing: Mortgage, Healthcare, Hanok/Restaurant, Todo (productivity) agents
  • /convonet_todo/api/llm-providers, STT/TTS provider APIs (Redis-backed)
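
To make the idea of intent-based routing concrete, here is a deliberately simple keyword-matching sketch. It is not how agent-llm-service detects intent (the page attributes that to LangGraph and the LLMs). The keyword lists, the agent labels, and the choice of the Todo agent as the default are all assumptions made for illustration.

```python
import re

# Hypothetical keyword sets; the real service uses LangGraph/LLM intent detection.
AGENT_KEYWORDS = {
    "mortgage": {"mortgage", "loan", "refinance"},
    "healthcare": {"patient", "appointment", "doctor"},
    "hanok": {"reservation", "table", "restaurant"},
}
DEFAULT_AGENT = "todo"  # assumed default: the productivity agent


def route_intent(prompt: str) -> str:
    """Return the label of the first domain agent whose keywords appear in the prompt."""
    tokens = set(re.findall(r"[a-z]+", prompt.lower()))
    for agent, keywords in AGENT_KEYWORDS.items():
        if tokens & keywords:
            return agent
    return DEFAULT_AGENT
```

For example, route_intent("Book a table for two") selects the Hanok agent, while a prompt with no domain keyword falls through to the Todo agent.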

Call Center Service (FastAPI)

Cloud Run service: convonet/call_center_service.py. Serves all UI and doc pages under v2.convonetai.com.

  • Landing (/), Call Center (/call-center), Voice Assistant UI (/voice_assistant, /voice-assistant), Mortgage Dashboard (/mortgage_dashboard)
  • Agent Monitor (/agent-monitor), Tool Execution (/tool-execution), doc routes (Tech Spec, System Architecture, Sequence Diagram)
  • Static assets and stub APIs for dashboards

CRM Integration Service (FastAPI)

Cloud Run service for SuiteCRM: patient search/create, meeting/case/note creation. Routes: /patient/*, /meeting/create, /case/create, /note/create.

Hanok Table Service (FastAPI)

Cloud Run service: hanok_table/app.py. Restaurant reservation API + webhooks for Hanok tools and UI.

  • Reservation API: /api/reservations/*, /reservation/status
  • Voice/webhooks: /webhooks/hanok_table/*
  • Health + MCP mount support: /health, optional mounted MCP HTTP app

Redis (Session & Buffer)

Used by agent-llm-service (and optionally voice-gateway) for session state, provider config, and buffering. Redis Cloud or GCP Redis.

Speech (Deepgram)

Voice-gateway uses Deepgram for batch STT and TTS. API keys via environment variables on voice-gateway-service.

AI Orchestration (Agent LLM Service)

LangGraph in convonet/routes.py and assistant_graph_todo.py: state machine, tool conditions, intent detection. Multi-LLM and MCP tools (todos, mortgage, healthcare, hanok reservations, transfer).

Agent Monitor

UI at /agent-monitor (call-center-service). Reads from Redis: agent-llm-service writes interactions (tool_calls, metadata.voice_timing) via AgentMonitor.track_interaction(); call-center-service exposes /agent-monitor/api/stats and /agent-monitor/api/interactions.
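
A stats endpoint such as /agent-monitor/api/stats could aggregate the recorded interactions roughly as sketched here. A plain list stands in for the Redis-backed store, and the record layout is assumed: the tool_calls and metadata.voice_timing keys appear on this page, but the name and stt_ms fields and the tool names in the usage example are hypothetical.

```python
from collections import Counter
from statistics import mean


def summarize(interactions):
    """Aggregate interaction records into a small stats dictionary."""
    tool_counts = Counter(
        call["name"]
        for item in interactions
        for call in item.get("tool_calls", [])
    )
    # Only voice interactions carry voice_timing; others are skipped.
    stt_ms = [
        item["metadata"]["voice_timing"]["stt_ms"]
        for item in interactions
        if "voice_timing" in item.get("metadata", {})
    ]
    return {
        "total": len(interactions),
        "tool_calls": dict(tool_counts),
        "avg_stt_ms": mean(stt_ms) if stt_ms else None,
    }


sample = [
    {"tool_calls": [{"name": "create_todo"}],
     "metadata": {"source": "voice", "voice_timing": {"stt_ms": 120}}},
    {"tool_calls": [{"name": "create_todo"}, {"name": "get_todos"}],
     "metadata": {"source": "voice", "voice_timing": {"stt_ms": 80}}},
    {"tool_calls": [], "metadata": {}},
]
stats = summarize(sample)
```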

Tools & External APIs

MCP tools (e.g. db_todo.py, db_mortgage.py) run inside agent-llm-service. PostgreSQL (Render): todos, users_anthropic, mortgage data. Google Calendar/OAuth, FusionPBX metadata where applicable.

Transfer System

Twilio API (voice-gateway) bridges to FusionPBX (SIP). The agent dashboard (JsSIP) registers with FusionPBX and receives the transferred call at its extension (e.g. 2001).
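
One common way to bridge a Twilio call to a SIP endpoint is to return TwiML containing a Dial verb with a Sip noun. This page does not say whether voice-gateway uses TwiML or another Twilio API for the bridge, so the sketch below is an assumption, and the SIP domain is a hypothetical placeholder.

```python
import xml.etree.ElementTree as ET


def transfer_twiml(extension: str, sip_domain: str) -> str:
    """Build TwiML that dials a SIP URI for the given extension."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    sip = ET.SubElement(dial, "Sip")
    sip.text = f"sip:{extension}@{sip_domain}"
    return ET.tostring(response, encoding="unicode")


# pbx.example.com is a placeholder, not the real FusionPBX host.
twiml = transfer_twiml("2001", "pbx.example.com")
```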

Agent Dashboard

Call Center UI at /call-center. JsSIP client for SIP registration and incoming calls from transfer.

Data Flow Summary

Voice Flow (WebSocket, no LiveKit)

Browser → WebSocket voice-gateway (/webrtc/ws) → PIN auth (PostgreSQL or env) → STT (Deepgram) → HTTP agent-llm-service /agent/process → LangGraph / Multi-LLM / MCP → TTS (Deepgram) → voice-gateway → Browser (transcript_final, agent_final, audio_chunk)
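
One voice turn of the flow above can be sketched as a function with the three external calls injected as callables. The stt, call_agent, and tts parameters are hypothetical stand-ins for the Deepgram requests and the HTTP POST to /agent/process. The stt_ms timing field and the event field names (text, audio) are also assumptions; the payload keys and the event type names come from this page.

```python
import base64
import time
from typing import Callable, Dict, List


def run_voice_turn(audio: bytes, user_id: str, session_id: str,
                   stt: Callable[[bytes], str],
                   call_agent: Callable[[dict], str],
                   tts: Callable[[str], bytes]) -> List[Dict[str, str]]:
    """Run STT -> agent -> TTS once and return the events to send to the browser."""
    started = time.perf_counter()
    transcript = stt(audio)
    stt_ms = round((time.perf_counter() - started) * 1000)

    # Body for POST /agent/process; the metadata feeds the Agent Monitor.
    payload = {
        "prompt": transcript,
        "user_id": user_id,
        "session_id": session_id,
        "metadata": {
            "source": "voice",
            "stt_provider": "deepgram",
            "voice_timing": {"stt_ms": stt_ms},  # assumed field name
        },
    }
    reply = call_agent(payload)
    speech = tts(reply)

    return [
        {"type": "transcript_final", "text": transcript},
        {"type": "agent_final", "text": reply},
        {"type": "audio_chunk", "audio": base64.b64encode(speech).decode("ascii")},
    ]
```

Injecting the callables keeps the turn logic testable without network access, and a different STT or TTS provider can be swapped in without touching the orchestration.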

Transfer Flow

User request → agent-llm (transfer intent) → voice-gateway (Twilio) → FusionPBX (SIP) → Agent Dashboard (JsSIP) → User info / live conversation

Single Domain

All traffic at https://v2.convonetai.com; path-based routing to call-center, voice-gateway, agent-llm, crm-integration, and hanok-table. No CORS needed for same-origin UI.
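
The path-based routing can be pictured as a prefix table, sketched below. The prefixes are taken from the service descriptions on this page, but the table itself and the fallback to call-center-service are assumptions. The real mapping lives in the load balancer or API gateway configuration, which is not shown here. /health is left out because the page lists it only for hanok-table-service and does not say how it is routed.

```python
# (prefix, service) pairs; an illustrative table, not the deployed gateway config.
ROUTES = [
    ("/webrtc/ws", "voice-gateway-service"),
    ("/twilio/", "voice-gateway-service"),
    ("/agent/", "agent-llm-service"),
    ("/convonet_todo/", "agent-llm-service"),
    ("/patient/", "crm-integration-service"),
    ("/meeting/", "crm-integration-service"),
    ("/case/", "crm-integration-service"),
    ("/note/", "crm-integration-service"),
    ("/api/reservations/", "hanok-table-service"),
    ("/webhooks/hanok_table/", "hanok-table-service"),
]
DEFAULT_SERVICE = "call-center-service"  # assumed fallback: UI and doc pages


def resolve_service(path: str) -> str:
    """Return the Cloud Run service that would handle the given URL path."""
    for prefix, service in ROUTES:
        if path.startswith(prefix):
            return service
    return DEFAULT_SERVICE
```

The trailing slash on "/agent/" matters: it keeps /agent-monitor from matching the agent-llm prefix, so that page still falls through to call-center-service.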

Architecture Overview

Convonet runs as five FastAPI microservices on Google Cloud Run, exposed under a single domain (v2.convonetai.com). The voice-gateway-service handles WebSocket voice at /webrtc/ws (no LiveKit on GCP): the browser sends audio over the WebSocket; the gateway runs batch STT (Deepgram), calls agent-llm-service over HTTP for LLM and MCP tools, runs TTS (Deepgram), and streams the results back. PIN authentication is validated against PostgreSQL users_anthropic or a fallback env PIN.

The agent-llm-service hosts LangGraph, multi-LLM (Claude, Gemini, OpenAI), and domain agents (Productivity/Todo, Mortgage, Healthcare, Hanok/Restaurant). The call-center-service serves the landing page, voice assistant UI, call center, mortgage dashboard, agent monitor, tool execution dashboard, and doc pages. The crm-integration-service handles SuiteCRM operations, and hanok-table-service handles reservation APIs/webhooks used by Hanok MCP tools. Twilio and FusionPBX provide call transfer to the agent dashboard (JsSIP). Redis and PostgreSQL (e.g. Render) are used for session, provider config, and application data.