How does tenant isolation work in the Concierge?

The central API route resolves the incoming tenant server-side and loads its knowledge and persona files from a tenant-scoped store. The system prompt is assembled before the Bedrock call, so a dental practice and a media client never share context. Both files are cached briefly in memory to keep latency low.

What stack is the Concierge built on?

Next.js App Router for the API surface, Bedrock for model inference, S3 for per-tenant knowledge and persona files, and DynamoDB for ephemeral session state. Responses stream to the client over SSE. The embeddable widget is a single component that accepts endpoint, tenant, and optional styling props.

What happens when a tenant hits the monthly cap?

Monthly per-tenant message caps are enforced at the ingest layer, with Slack alerts firing as a tenant approaches its limit. When the cap is hit, an inline lead capture form surfaces to preserve the conversion path. A per-visitor daily cap also guards against runaway usage from a single session.

AI Concierge — iSimplifyMe Labs

Q: What guardrails protect against prompt injection?

Guardrails run in layers. An input classifier with twenty regex patterns screens messages before they reach Bedrock. A schema validator checks knowledge.json and persona.json on load for suspicious patterns, length violations, and URI scheme anomalies. An injection_resistance block is prepended to every tenant system prompt.

Abstract

Concierge is a multi-tenant AI chat surface that embeds across any site in the iSM network. Each deployment carries its own knowledge base, persona, and conversion protocol — served from a shared infrastructure layer with per-tenant S3 storage and session isolation.

Problem

Generic chat widgets share a single system prompt and have no concept of which site they're on or what audience they serve. A dental practice, a media company, and a SaaS product all need different voices, different knowledge, and different conversion goals from the same underlying chat infrastructure.

Off-the-shelf tools force a choice between customization and maintainability. A shared system with strong tenant isolation lets one infrastructure serve many surface-level personalities without coupling their knowledge bases together.

Approach

Tenant isolation

The central API route resolves the incoming tenant from the x-tenant-domain header, then loads knowledge.json and persona.json from a per-tenant S3 prefix. The tenant-specific system prompt is assembled server-side before the Bedrock call, so no tenant data crosses a shared context window.

Both files are cached in memory for five minutes. Responses stream via SSE to the client.

Session and usage controls

Sessions are stored in DynamoDB with a 30-minute TTL; a per-visitor daily cap and a per-tenant monthly cap guard against runaway usage. The client-side widget is a single embeddable component that accepts endpoint, tenant, and optional styling props.

Guardrails

Guardrails run in layers: an input classifier (20 regex patterns) screens for injection attempts before the message reaches Bedrock; a schema validator checks knowledge.json and persona.json on load for suspicious patterns, length violations, and URI scheme anomalies. A <injection_resistance> block is prepended to every tenant system prompt.

Status

Live on multiple iSM-network sites with per-tenant knowledge and persona files.
The admin dashboard shows conversation history with search, date filters, and pagination.
Monthly cap set at 1,000 messages per tenant; Slack alerts fire at 50%, 75%, and 90% of cap.
Inline lead capture form surfaces when the monthly cap is hit, preserving conversion path.
Planned: admin UI for editing knowledge and persona without S3 uploads; DynamoDB-backed domain map to replace the hardcoded DOMAIN_MAP.

AI Concierge

What is the iSimplifyMe AI Concierge?