Abstract
Concierge is a multi-tenant AI chat surface that can be embedded in any site in the iSM network. Each deployment carries its own knowledge base, persona, and conversion protocol, all served from a shared infrastructure layer with per-tenant S3 storage and session isolation.
Problem
Generic chat widgets share a single system prompt and have no concept of which site they're on or what audience they serve. A dental practice, a media company, and a SaaS product all need different voices, different knowledge, and different conversion goals from the same underlying chat infrastructure.
Off-the-shelf tools force a choice between customization and maintainability. A shared system with strong tenant isolation lets one infrastructure serve many surface-level personalities without coupling their knowledge bases together.
Approach
Tenant isolation
The central API route resolves the incoming tenant from the `x-tenant-domain` header, then loads `knowledge.json` and `persona.json` from a per-tenant S3 prefix. The tenant-specific system prompt is assembled server-side before the Bedrock call, so no tenant data crosses a shared context window.
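The resolution and prompt-assembly steps above can be sketched as follows. This is a minimal illustration, not the production route: the domain entries, the `tenants/<id>/` key prefix, and the `Persona` and `Knowledge` shapes are all assumptions, since the source does not show the real schemas or bucket layout.

```typescript
// Hypothetical file shapes; the real knowledge.json / persona.json schemas are not shown.
interface Persona {
  name: string;
  voice: string;
}
interface Knowledge {
  facts: string[];
}

// Stand-in for the hardcoded domain map; these domains and tenant ids are invented.
const DOMAIN_MAP = new Map<string, string>([
  ["dental.example.com", "dental"],
  ["media.example.com", "media"],
]);

// Resolve a tenant id from the x-tenant-domain header value, or null if unknown.
function resolveTenant(headerValue: string | null | undefined): string | null {
  if (!headerValue) return null;
  const domain = headerValue.trim().toLowerCase();
  return DOMAIN_MAP.get(domain) ?? null;
}

// Build the two S3 object keys under an assumed per-tenant prefix.
function tenantKeys(tenantId: string): { knowledge: string; persona: string } {
  const prefix = `tenants/${tenantId}/`;
  return {
    knowledge: `${prefix}knowledge.json`,
    persona: `${prefix}persona.json`,
  };
}

// Assemble one tenant's system prompt server-side from that tenant's files only.
function buildSystemPrompt(persona: Persona, knowledge: Knowledge): string {
  const facts = knowledge.facts.map((fact) => `- ${fact}`).join("\n");
  return `You are ${persona.name}. Voice: ${persona.voice}.\nKnown facts:\n${facts}`;
}
```

Because the prompt is built from a single tenant's files per request, an unknown domain can be rejected before any S3 read or model call happens.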
Both files are cached in memory for five minutes. Responses stream via SSE to the client.
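A five-minute in-memory cache of this kind might look like the sketch below. The `TtlCache` class, the `loadCached` helper, and the injectable clock are illustrative names rather than the project's actual code; the fetcher is passed in so that no particular S3 client call is assumed.

```typescript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Minimal time-based cache; the clock is injectable so expiry can be tested.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return hit.value;
  }

  // Store a value that expires ttlMs after the current clock reading.
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Read through the cache; `fetcher` stands in for the S3 read and JSON parse.
async function loadCached<V>(
  cache: TtlCache<V>,
  key: string,
  fetcher: (key: string) => Promise<V>,
): Promise<V> {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = await fetcher(key);
  cache.set(key, fresh);
  return fresh;
}
```

One consequence of this design is that an edited `knowledge.json` or `persona.json` can take up to five minutes to appear in responses on a warm instance.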
Session and usage controls
Sessions are stored in DynamoDB with a 30-minute TTL; a per-visitor daily cap and a per-tenant monthly cap guard against runaway usage. The client-side widget is a single embeddable component that accepts endpoint, tenant, and optional styling props.
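The session expiry and cap checks could be expressed roughly as below. The daily visitor cap value and the counter key formats are assumptions made for illustration; the source states only the 30-minute TTL and the 1,000-message monthly cap. DynamoDB TTL attributes hold Unix epoch time in seconds, which is why the expiry is converted from milliseconds.

```typescript
const SESSION_TTL_SECONDS = 30 * 60;
const MONTHLY_TENANT_CAP = 1000; // stated in the Status section
const DAILY_VISITOR_CAP = 25; // assumed value; the source does not give one

// Compute the epoch-seconds value to store in the session item's TTL attribute.
function sessionExpiresAt(nowMs: number): number {
  return Math.floor(nowMs / 1000) + SESSION_TTL_SECONDS;
}

type CapDecision = "allow" | "visitor_daily_cap" | "tenant_monthly_cap";

// Decide whether a new message may proceed, given the counts recorded so far.
function checkCaps(visitorToday: number, tenantThisMonth: number): CapDecision {
  if (tenantThisMonth >= MONTHLY_TENANT_CAP) return "tenant_monthly_cap";
  if (visitorToday >= DAILY_VISITOR_CAP) return "visitor_daily_cap";
  return "allow";
}

// Hypothetical counter keys, bucketed by UTC day and UTC month.
function dailyKey(tenantId: string, visitorId: string, date: Date): string {
  return `${tenantId}#${visitorId}#${date.toISOString().slice(0, 10)}`;
}

function monthlyKey(tenantId: string, date: Date): string {
  return `${tenantId}#${date.toISOString().slice(0, 7)}`;
}
```

Checking the tenant cap first means a tenant that is out of quota reports that state regardless of which visitor is asking.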
Guardrails
Guardrails run in layers: an input classifier (20 regex patterns) screens for injection attempts before the message reaches Bedrock; a schema validator checks `knowledge.json` and `persona.json` on load for suspicious patterns, length violations, and URI scheme anomalies. A `
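The input-classifier and schema-validator layers could be sketched as follows. The three patterns, the length limit, and the https-only scheme allowlist are all illustrative assumptions; the source mentions 20 regex patterns but does not list them or the validator's actual rules.

```typescript
// Three illustrative patterns only; the real list is not shown in the source.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore\s+(all\s+)?(previous|prior|above)\s+instructions/i,
  /reveal\s+(your\s+)?(system\s+)?prompt/i,
  /you\s+are\s+now\s+(in\s+)?developer\s+mode/i,
];

// Layer 1: screen a visitor message before it reaches the model.
function looksLikeInjection(message: string): boolean {
  return INJECTION_PATTERNS.some((pattern) => pattern.test(message));
}

const MAX_FIELD_LENGTH = 2000; // assumed limit
const ALLOWED_SCHEMES = ["https"]; // assumed allowlist

// Layer 2: check one string field from knowledge.json or persona.json on load.
// Returns a list of human-readable problems; an empty list means the field passed.
function validateField(name: string, value: string): string[] {
  const problems: string[] = [];
  if (value.length > MAX_FIELD_LENGTH) {
    problems.push(`${name}: exceeds ${MAX_FIELD_LENGTH} characters`);
  }
  if (looksLikeInjection(value)) {
    problems.push(`${name}: matches an injection pattern`);
  }
  // Flag any "scheme://" occurrence whose scheme is not on the allowlist.
  const schemePattern = /\b([a-z][a-z0-9+.-]*):\/\//gi;
  let match: RegExpExecArray | null;
  while ((match = schemePattern.exec(value)) !== null) {
    const scheme = (match[1] ?? "").toLowerCase();
    if (ALLOWED_SCHEMES.indexOf(scheme) === -1) {
      problems.push(`${name}: disallowed URI scheme "${scheme}"`);
    }
  }
  return problems;
}
```

Regex screening of this kind catches only known phrasings, which is one reason to layer it with load-time validation rather than rely on either check alone.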
Status
- Live on multiple iSM-network sites with per-tenant knowledge and persona files.
- The admin dashboard shows conversation history with search, date filters, and pagination.
- Monthly cap set at 1,000 messages per tenant; Slack alerts fire at 50%, 75%, and 90% of cap.
- Inline lead-capture form surfaces when the monthly cap is hit, preserving the conversion path.
- Planned: admin UI for editing knowledge and persona without S3 uploads; DynamoDB-backed domain map to replace the hardcoded DOMAIN_MAP.
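
The alert thresholds and the cap-hit fallback listed above can be illustrated with a small sketch. The function names are hypothetical, and the actual Slack call is left out; the point is that an alert fires only when a message moves usage across a threshold, not on every message above it.

```typescript
// Alert thresholds as whole percentages, so comparisons use integer arithmetic.
const ALERT_THRESHOLD_PERCENTS = [50, 75, 90];

// Return the thresholds crossed when usage moves from `before` to `after` messages.
function crossedThresholds(before: number, after: number, cap: number): number[] {
  return ALERT_THRESHOLD_PERCENTS.filter(
    (percent) => before * 100 < percent * cap && after * 100 >= percent * cap,
  );
}

// Once the cap is reached, the widget would show the lead-capture form instead of chat.
function shouldShowLeadForm(used: number, cap: number): boolean {
  return used >= cap;
}
```

With a cap of 1,000, `crossedThresholds(499, 500, 1000)` reports the 50% threshold, while the next message, `crossedThresholds(500, 501, 1000)`, reports nothing.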