Callstack AI — Multi-Tenant Voice & Chat Architecture

Self-hosted AI contact center. One LangChain ReAct agent behind every channel.
Twilio PSTN · Azure Voice Live · CopilotKit chat · embeddable widgets.

LangChain
LangGraph
Twilio
CopilotKit
OpenAI
Claude
Twilio Inbound
POST /voice/twilio/incoming
Global webhook auto-resolves tenant from dialed Twilio number. TwiML response opens Media Stream.
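The inbound flow can be sketched as a lookup from the dialed number to a tenant, followed by a TwiML reply that opens a Media Stream. This is a minimal sketch under stated assumptions, not the project's code: `TENANTS_BY_NUMBER`, `resolve_tenant`, `build_twiml`, `handle_incoming`, and the host `example.invalid` are hypothetical. Only the `To` form field, the `<Connect><Stream>` TwiML shape, and the stream route listed elsewhere on this page are taken as given.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical in-memory mapping; a real deployment would presumably query MongoDB.
TENANTS_BY_NUMBER = {"+15550100": "acme", "+15550101": "globex"}


def resolve_tenant(dialed_number: str) -> Optional[str]:
    """Return the tenant slug that owns the dialed number, or None."""
    return TENANTS_BY_NUMBER.get(dialed_number)


def build_twiml(tenant: str, host: str = "example.invalid") -> str:
    """Build a TwiML document that connects the call to a Media Stream."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    ET.SubElement(connect, "Stream", url=f"wss://{host}/voice/twilio/stream/{tenant}")
    return ET.tostring(response, encoding="unicode")


def handle_incoming(form: dict) -> str:
    """Twilio sends the dialed number in the `To` form field of the webhook."""
    tenant = resolve_tenant(form.get("To", ""))
    if tenant is None:
        # No tenant owns this number, so reject the call.
        response = ET.Element("Response")
        ET.SubElement(response, "Reject")
        return ET.tostring(response, encoding="unicode")
    return build_twiml(tenant)
```

Because the tenant is derived from the dialed number, every Twilio number can point at the same webhook URL.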
Twilio Outbound
POST /voice/twilio/outbound
REST call originates PSTN from tenant number. Per-tenant concurrency cap enforced.
Browser Voice
WS /voice/ws/{client_id}
Live Azure Voice Live session over WebSocket. Bidirectional PCM16 24 kHz audio.
Web Chat
POST /agents/agenticChatbot
AG-UI streaming endpoint via CopilotKit runtime proxy. Server-Sent Events.
Embed Widget
/embed/loader.js
Vite-built React widget. Popup bubble, sidebar, or full-page chat via tenant key.
Outbound Calls on Demand
Trigger AI-driven Twilio outbound calls via a single REST endpoint — perfect for CRM integration, reminders, surveys, and proactive outreach.
Inbound Auto-Routing
One global webhook auto-resolves tenant from the dialed Twilio number. Add a tenant, assign a number, done — no per-tenant webhook config.
Concurrent Call Capacity
Per-tenant concurrency caps. Run 1, 10, or 100 AI calls in parallel — configurable per client, throttled automatically.
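A per-tenant cap of this kind can be illustrated with a small counter guarded by a lock. This is a hedged sketch: `TenantCallLimiter` and its methods are hypothetical names, and an in-process counter only works for a single worker. A multi-worker deployment would need the count in a shared store.

```python
import threading


class TenantCallLimiter:
    """Counts active calls per tenant and refuses new ones above the cap."""

    def __init__(self) -> None:
        self._active: dict = {}
        self._lock = threading.Lock()

    def try_acquire(self, tenant_id: str, max_concurrent_calls: int) -> bool:
        """Reserve a call slot; return False when the tenant is at its cap."""
        with self._lock:
            current = self._active.get(tenant_id, 0)
            if current >= max_concurrent_calls:
                return False
            self._active[tenant_id] = current + 1
            return True

    def release(self, tenant_id: str) -> None:
        """Free a slot when a call ends; never drops below zero."""
        with self._lock:
            current = self._active.get(tenant_id, 0)
            if current > 0:
                self._active[tenant_id] = current - 1
```

A caller would invoke `try_acquire` before placing or accepting a call and `release` from the call-completed path, so a failed reservation can be turned into a throttling response.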
Live Web Search
Tavily integration with per-tenant domain allow/blocklists. Answers stay grounded, on-brand, and current — never stale training data.
Multi-Tenant Isolation
Each tenant gets its own FAISS vector index, document registry, phone number, system prompt, and access key. Access keys are stored as SHA-256 hashes, never in plaintext.
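Storing only a SHA-256 digest of each access key can be sketched with the standard library. The function names below are hypothetical, and the key length and encoding are assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def hash_access_key(plaintext: str) -> str:
    """Return the SHA-256 hex digest that would be persisted for a key."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def generate_access_key() -> tuple:
    """Return (plaintext_key, stored_hash). Only the hash is meant to be stored."""
    plaintext = secrets.token_urlsafe(32)
    return plaintext, hash_access_key(plaintext)


def verify_access_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare digests in constant time."""
    return hmac.compare_digest(hash_access_key(presented), stored_hash)
```

The plaintext key is shown to the tenant once at creation time. After that, only the digest is needed to authenticate requests.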
Self-Hosted — Pay Only for LLM
Deploy on your own Docker infrastructure. No per-seat fees, no usage markup. You pay Azure OpenAI directly. Software is yours forever.
RAG Over Your Documents
Ingest PDF, DOCX, TXT. Incremental re-indexing via document registry — only changed files are re-embedded. Source citations in every response.
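Incremental re-indexing amounts to comparing each file's current hash with the hash recorded at the last ingest. The sketch below assumes SHA-256 as the file hash and a registry shaped as a filename-to-hash mapping. Both are illustrative choices, and `plan_reindex` is a hypothetical helper.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Hash file bytes; SHA-256 is an assumption, not a confirmed detail."""
    return hashlib.sha256(content).hexdigest()


def plan_reindex(registry: dict, files: dict) -> dict:
    """Decide which files to embed, skip, or remove.

    registry maps filename -> hash recorded at the last ingest.
    files maps filename -> current file bytes.
    """
    plan = {"embed": [], "skip": [], "remove": []}
    for name, content in files.items():
        if registry.get(name) == file_digest(content):
            plan["skip"].append(name)    # unchanged since last ingest
        else:
            plan["embed"].append(name)   # new or modified
    # Files that were indexed before but no longer exist.
    plan["remove"] = [name for name in registry if name not in files]
    return plan
```

Only the names under `embed` would be sent to the embedding model, which is where the cost saving comes from.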
Voice ⇄ Chat Sync
Every voice call is written back to the tenant's chat history — one unified conversation log across every channel, reviewable by your team.
Encrypted Vault
All API keys encrypted (Fernet AES) in MongoDB. Rotate keys through the browser UI — no SSH, no redeploy, no downtime.
Per-Tenant System Prompts
Company context, industry terminology, competitor awareness, and tone — all injected into every response for brand-safe output.
Feedback Loop
Thumbs up/down + categorized labels (factually incorrect, unsafe, lazy). Stored per tenant for fine-tuning, QA, and continuous improvement.
Admin Console
Manage tenants, upload documents, view call history, read transcripts, and rotate keys, all from a single React admin UI.
Streaming Responses
Server-Sent Events over the AG-UI protocol. Tool calls, source citations, and tokens stream live to the UI — no waiting for full completions.
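Streaming over Server-Sent Events means each event is written as a `data:` line followed by a blank line. The sketch below shows only that framing. The event type names and fields are illustrative, modeled on AG-UI-style text message events, and should be checked against the AG-UI specification. `sse_frame` and `stream_tokens` are hypothetical helpers.

```python
import json
from typing import Iterable, Iterator


def sse_frame(event: dict) -> str:
    """Serialize one event as a Server-Sent Events frame."""
    return f"data: {json.dumps(event, separators=(',', ':'))}\n\n"


def stream_tokens(message_id: str, tokens: Iterable[str]) -> Iterator[str]:
    """Yield a start frame, one frame per token, then an end frame."""
    # Event names are assumptions for illustration only.
    yield sse_frame({"type": "TEXT_MESSAGE_START", "messageId": message_id})
    for token in tokens:
        yield sse_frame(
            {"type": "TEXT_MESSAGE_CONTENT", "messageId": message_id, "delta": token}
        )
    yield sse_frame({"type": "TEXT_MESSAGE_END", "messageId": message_id})
```

A web framework would return this generator as a response with the `text/event-stream` content type, so the UI can render each token as it arrives.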
Thread Persistence
MongoDB checkpointing via LangGraph. Conversations resume across sessions, devices, and channels — voice or chat, same memory.
GDPR & Data Residency
User data stays in your MongoDB, on your servers, in your jurisdiction. Vault stores infrastructure secrets only — no user PII leaves your stack.
frontend/
React 18 · nginx · :8000
Admin dashboard + multi-tenant chat UI. React + TypeScript + Tailwind + Radix UI + @copilotkit/react-core 1.54. Manages tenants, documents, vault, calls, and feedback.
frontend-runtime/
Node + Express · :3000
@copilotkit/runtime 1.54 proxy. Mediates between React frontend and FastAPI /agents/agenticChatbot. Forwards AG-UI streaming events.
frontend-embed/
Vite · static @ /embed
Embeddable widget bundle. React 18 + Vite + Tailwind. Three surfaces: popup bubble, sidebar, full-page chat. Initialized with tenant access key.
Agent Runtime
Flat create_agent() ReAct loop — no StateGraph nodes
AzureChatOpenAI · temperature=0.2 · name: agenticChatbot
LangChain · LangGraph · CopilotKit · AG-UI
retrieve_documents_tool
RAG
FAISS MMR · k=5 · fetch_k=20
Async tool. Reads tenant_id + rag_top_k from runtime state. Loads per-tenant FAISS index and retrieves chunks via MMR (lambda_mult=0.7). Sources surfaced to UI.
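Maximal marginal relevance (MMR) trades query relevance against redundancy with chunks already chosen. The pure-Python sketch below illustrates the selection rule with the parameters named above (k=5, fetch_k=20, lambda_mult=0.7). It is a teaching sketch with hypothetical function names, not the FAISS or LangChain implementation.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, candidates, k=5, fetch_k=20, lambda_mult=0.7):
    """Return indices into `candidates` chosen by MMR."""
    relevance = [cosine(query, c) for c in candidates]
    # Step 1: keep the fetch_k candidates most similar to the query.
    pool = sorted(range(len(candidates)), key=lambda i: relevance[i], reverse=True)
    pool = pool[:fetch_k]
    selected = []
    # Step 2: greedily add the candidate with the best relevance/redundancy trade-off.
    while pool and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected
```

A `lambda_mult` close to 1 favors relevance and tolerates near-duplicates, while lower values push the selection toward diverse chunks.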
search_web_tool
Tavily · tenant domain filters
Async tool. POSTs to api.tavily.com/search scoped to tenant's search_include_domains allowlist and search_exclude_domains blocklist.
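The domain scoping can be pictured as building the search request from the tenant's two lists. The sketch below only assembles the request and performs no network call. The field names (`include_domains`, `exclude_domains`, `max_results`) and the bearer-token header reflect Tavily's public search API as commonly documented and should be verified against the current Tavily reference. `build_tavily_request` is a hypothetical helper.

```python
def build_tavily_request(
    query: str,
    api_key: str,
    search_include_domains: list,
    search_exclude_domains: list,
    max_results: int = 5,
) -> dict:
    """Assemble URL, headers, and JSON body for a tenant-scoped web search."""
    payload = {"query": query, "max_results": max_results}
    # Only send the filters a tenant has actually configured.
    if search_include_domains:
        payload["include_domains"] = list(search_include_domains)
    if search_exclude_domains:
        payload["exclude_domains"] = list(search_exclude_domains)
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return {"url": "https://api.tavily.com/search", "headers": headers, "json": payload}
```

The returned dictionary can be passed to any async HTTP client, for example as keyword arguments to a POST call.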
AgentRuntimeState
Extends AgentState with sources: list[dict]. Tenant system prompt injected per-invocation as context.
multi_chatbot_core/agent_runtime.py
CopilotKitMiddleware
Emits AG-UI streaming events — tool calls, intermediate tokens, and source citations — to the CopilotKit runtime proxy.
copilotkit package
MongoDBSaver Checkpointer
Persistent state. Threads resume across sessions and channels. A separate MemorySaver instance powers the suggestions agent.
checkpoints · checkpoint_writes · TTL 24h
VoiceSessionHandler
azure.ai.voicelive.aio
Full session lifecycle: config, audio relay, transcription, barge-in, function calls. Replays up to 10 prior thread messages on connect. Persists turns to LangGraph checkpoint.
Function Tools
retrieve_documents search_web
TwilioMediaAdapter
μ-law 8kHz ⇄ PCM16 24kHz
Bidirectional codec bridge via audioop. Installed as send_message callback on VoiceSessionHandler. Uses ulaw2lin (Twilio → Voice Live) and lin2ulaw (Voice Live → Twilio), with ratecv resampling in both directions.
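The Twilio-to-Voice-Live half of the bridge decodes 8 kHz μ-law bytes to 16-bit PCM and resamples to 24 kHz. The sketch below is a pure-Python stand-in written for illustration. It does not use `audioop`, it applies the standard G.711 μ-law expansion, and it resamples by simple linear interpolation, which is a different and cruder method than `audioop.ratecv`. All function names are hypothetical.

```python
import struct


def ulaw_byte_to_pcm16(u: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit sample."""
    u = ~u & 0xFF                      # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def upsample_3x(samples: list) -> list:
    """8 kHz -> 24 kHz by linear interpolation between neighbouring samples."""
    out = []
    for i, current in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else current
        for step in range(3):
            out.append(current + (nxt - current) * step // 3)
    return out


def twilio_to_voice_live(ulaw_payload: bytes) -> bytes:
    """Convert a mu-law 8 kHz frame to little-endian PCM16 at 24 kHz."""
    pcm_8k = [ulaw_byte_to_pcm16(b) for b in ulaw_payload]
    pcm_24k = upsample_3x(pcm_8k)
    return struct.pack(f"<{len(pcm_24k)}h", *pcm_24k)
```

Each input byte becomes three 16-bit samples, so the output is six times the size of the μ-law payload.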
Route
WS /voice/twilio/stream/{tenant}
voice_metrics
ring buffer · maxlen=500
Tracks end-of-speech → first-audio TTFT per tenant. In-memory deque. Exposed via GET /voice/stats.
Methods
record_turn() snapshot()
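A bounded in-memory latency tracker with these two methods can be sketched with `collections.deque`. The method names `record_turn` and `snapshot` come from the listing above. The argument names, the single shared buffer, and the shape of the snapshot are assumptions made for illustration.

```python
from collections import deque
from statistics import mean


class VoiceMetrics:
    """Keeps the most recent turn latencies in a fixed-size ring buffer."""

    def __init__(self, maxlen: int = 500) -> None:
        # A deque with maxlen silently discards the oldest entry when full.
        self._turns = deque(maxlen=maxlen)

    def record_turn(self, tenant_id: str, speech_end_s: float, first_audio_s: float) -> None:
        """Store the gap between end of user speech and first agent audio, in ms."""
        self._turns.append((tenant_id, (first_audio_s - speech_end_s) * 1000.0))

    def snapshot(self) -> dict:
        """Aggregate the buffered latencies per tenant."""
        per_tenant: dict = {}
        for tenant_id, latency_ms in self._turns:
            per_tenant.setdefault(tenant_id, []).append(latency_ms)
        return {
            tenant: {"turns": len(v), "avg_ttft_ms": mean(v), "max_ttft_ms": max(v)}
            for tenant, v in per_tenant.items()
        }
```

Because the buffer lives in process memory, the statistics reset on restart and describe only the most recent turns.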
webhook: POST /voice/twilio/incoming
status: POST /voice/twilio/status
hangup: POST /voice/twilio/calls/{sid}/hangup
browser: WS /voice/ws/{client_id}
/vectorstore RAG
POST refill refill-selected wipe status documents validate upload
FAISS index management per tenant.
/threads
list {id} {id}/state conversation agui_messages
Thread CRUD + hydration for voice.
/feedbacks
POST GET {id} {id}
Up/down votes with category labels.
/tenants-manage admin
auth POST {id} {id} regenerate-key vault-keys
Tenant CRUD + vault mgmt (master-JWT gated).
/admin
login validate
Master key → short-lived adm.* JWT.
/agent-runtime
eval
Non-streaming invocation for eval/testing.
/voice voice
WS ws/{id} WS twilio/stream twilio/incoming twilio/outbound twilio/status stats
Voice Live + Twilio PSTN endpoints.
/agents no auth
agenticChatbot suggestionsProvider
AG-UI streaming entry for CopilotKit.
State & Memory
checkpoints checkpoint_writes
LangGraph thread state · TTL 24h
Tenancy & Secrets
tenants vault
SHA-256 access keys · Fernet AES secrets
Voice & Telephony
phone_calls twilio_events
Call lifecycle + webhook idempotency (TTL 24h)
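Webhook idempotency means a repeated delivery of the same Twilio event is recognized and ignored. The sketch below is an in-memory stand-in for the `twilio_events` collection. In MongoDB the equivalent would typically be a unique key plus a TTL index, which is an assumption about the design. `EventDeduper` and the `CallSid:CallStatus` key format are hypothetical.

```python
import time


class EventDeduper:
    """Remembers event keys for a limited time and flags repeats."""

    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.monotonic) -> None:
        self._expiry_by_key: dict = {}
        self._ttl = ttl_seconds
        self._clock = clock

    def first_time(self, event_key: str) -> bool:
        """Return True the first time a key is seen within the TTL window."""
        now = self._clock()
        # Drop entries whose TTL has elapsed.
        self._expiry_by_key = {
            key: expiry for key, expiry in self._expiry_by_key.items() if expiry > now
        }
        if event_key in self._expiry_by_key:
            return False
        self._expiry_by_key[event_key] = now + self._ttl
        return True
```

A status webhook handler would build a key such as `f"{call_sid}:{call_status}"` and process the event only when `first_time` returns True.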
RAG & Feedback
document_registry feedbacks
File hash tracking for incremental loader · up/down votes
Identity & Auth
id slug name access_key (hashed)
Agent & Retrieval
max_parallel_search_queries max_intention_clarify_attempts search_include_domains search_exclude_domains web_search_enabled
Voice & Telephony
voice_enabled voice_id voice_model voice_greeting voice_disclosure_enabled twilio_phone_number max_concurrent_calls
Brand & Prompt — company_context
company_name company_industry primary_product services_list domain_topics terminology_examples competitor_context assistant_role
Azure OpenAI Chat
GPT-4.1 · AzureChatOpenAI
Azure OpenAI Embeddings
FAISS vectors
Azure Voice Live
WebSocket · PCM16 24kHz
Twilio REST
Programmable Voice
Twilio Media Streams
μ-law 8kHz audio
Tavily Search
web tool · allowlist
mongodb
mongo:6.0
Persistent state. Port 27017 (host-configurable).
app
FastAPI · :8001
Backend API + agent runtime + voice bridge. Internal port.
frontend
nginx · :8000
React admin SPA + chat UI. Reverse-proxies to app and frontend-runtime.
frontend-runtime
Node + Express · :3000
CopilotKit Runtime proxy. Bridges React → FastAPI /agents/agenticChatbot.
Self-Hosted · Any Cloud
uv + Python 3.12
Docker Compose
GDPR · Data Residency
5 entry channels
8 API endpoint groups
2 agent tools
8 MongoDB collections
4 Docker services