SERVICE

Validator Gap Audit

A focused 4-hour engagement that walks every AI surface in your stack, classifies it deterministic-required vs bounded-probabilistic, and delivers a written gap report with the validator gates each surface needs.

HQ: Chicago, IL
APAC: Melbourne, AU
Stack: AWS · Next.js · Nexus
Category: AI Risk & Governance

A focused 4-hour engagement for organizations that have shipped AI to production — or are about to — and need an engineering-grade answer to the question Mark Cuban put on the public record on May 4, 2026: *can you make sure everyone gets the same answer to the same question, every time?* Fixed price $2,500. One audit per week.

What is the Validator Gap Audit?

The Validator Gap Audit is a 4-hour focused engagement that walks every AI surface in a prospect's stack, classifies each as either deterministic-required (medical records, contract clauses, regulated lookups) or bounded-probabilistic (chat tone, ideation, creative variants), and delivers a written gap report naming the validator gate each deterministic surface requires. The five validator primitives evaluated are schema gates, substring/pattern gates, structural rule evaluators, threshold scorers, and bounded LLM critics. Fixed price $2,500; deliverable in five business days.

The audit is built around a single architectural premise: not every AI surface needs to be deterministic, but every deterministic-required surface needs a named gate. The premise comes from real production systems — when iSM's own concierge widget receives the phrase *"if this is a medical emergency,"* the response is not generated by a language model. A precommitted substring match fires deterministic emergency-routing UI before the LLM gets a turn. The substring is the gate; the LLM is the conversation.

That architectural pattern repeats across every well-designed enterprise AI deployment we have audited. Every poorly-designed one fails to draw the line. The audit's job is to find every place the line should be drawn in your stack and tell you, on paper, what to do about it.

Why this exists in 2026

The determinism gap — the difference between traditional software (X always equals Y) and large language models (X usually equals Y) — has emerged as the central enterprise AI procurement question. Buyers increasingly evaluate vendors not on model selection but on the validator architecture beneath the AI surface. The Validator Gap Audit produces the artifact that answers that question for any AI deployment: an inventory of every surface, a classification per surface, and a named gate per deterministic surface.

For most of 2024 and 2025 the enterprise AI conversation was dominated by model selection — *which Claude model, which OpenAI version, which Bedrock configuration.* By the second half of 2025 the conversation moved to prompt engineering and tool use.

By spring 2026 the conversation has moved again, this time to a question that has been quietly visible to operators of production AI systems for the past 18 months: which AI output is allowed to take effect, and what gate did it pass through?

That question is now in the public discourse. Cuban's May 4 tweet — *"Judgement and the ability to challenge AI output is becoming increasingly necessary, and valuable"* — names the buyer-side concern. The supply-side answer has existed for those 18 months in shipping production code at boutique firms. iSM's substrate-A response post on the determinism gap lays out the full architectural taxonomy.

The Validator Gap Audit is the diagnostic engagement that produces the buyer-side answer for a specific deployment. It is not a strategic deck. It is not a workshop. It is an engineering audit that produces a written artifact a CIO, CTO, or Chief Risk Officer can hand to their auditor.

What you get

  • Live audit session, 4 hours. Joseph W. Elstner walks every AI surface in the prospect's stack with the buyer's technical lead. Surfaces are catalogued, classified deterministic-required or bounded-probabilistic, and queued for gate evaluation.
  • Gap report, 6 to 12 pages. Written deliverable: surface inventory, per-surface classification, named validator-gate recommendations for each deterministic surface, ordered remediation list by liability exposure, and a "what's missing today" summary readable by both engineering and risk leadership.
  • 60-minute strategic briefing. Live walkthrough of the gap report with Joseph W. Elstner. Q&A. We do not leave the room with the report unread.
  • Audit credit toward a Validator Architecture build. If the buyer engages iSM to build the validator infrastructure within 60 days, the $2,500 audit fee credits 100% against the build engagement.
The deliverable is engineering-grade. It uses the same per-surface classification matrix iSM uses internally to evaluate every regulated-industry build. It does not pad with strategy slides. It does not propose a transformation roadmap. It tells you which AI surfaces in your production stack are exposed and what to install per surface to close the exposure.

The five validator primitives

The five validator primitives the audit evaluates are: schema gates (JSON validators that reject malformed AI output at API boundaries), substring/pattern gates (precommitted match rules that fire deterministic routing before the LLM responds), structural rule evaluators (versioned deterministic checks against output structure such as required fields and formatting), threshold scorers (numeric quality scores with per-tenant gates), and bounded LLM critics (a second model in critic role grounded against an immutable ground-truth source). Each surface in the gap report is mapped to one or more of these primitives.

Schema gates. A JSON-schema validator that rejects malformed AI output at the API boundary. The model produces JSON, the schema rejects payloads that fail validation, and the system either retries with the error context or hands off to a human queue. This is the cheapest, highest-leverage validator pattern. Every enterprise AI deployment that touches a system of record should have this layer. The audit identifies which surfaces lack one and where the schema needs to live in the request path.
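A minimal sketch of the pattern in TypeScript, using Zod as an illustrative validator library; the `LeadSchema` fields are hypothetical, and `callModel` and `enqueueForHuman` stand in for whatever LLM client and review queue your deployment already has:

```typescript
import { z } from "zod";

// Hypothetical schema for a lead-form write; the real schema lives at the API boundary.
const LeadSchema = z.object({
  name: z.string().min(1),
  email: z.string().email(),
});
type Lead = z.infer<typeof LeadSchema>;

// The gate: model output either parses and validates, or it never reaches the system of record.
async function extractLead(
  prompt: string,
  callModel: (p: string) => Promise<string>,                        // your LLM call
  enqueueForHuman: (raw: string, reason: string) => Promise<void>,  // your review queue
  maxRetries = 1,
): Promise<Lead | null> {
  let lastError = "";
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const raw = await callModel(
      lastError ? `${prompt}\n\nPrevious output failed validation: ${lastError}` : prompt,
    );
    try {
      return LeadSchema.parse(JSON.parse(raw)); // throws on malformed JSON or a schema miss
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  await enqueueForHuman(prompt, lastError);     // gate failed: human queue, not the database
  return null;
}
```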

Substring and pattern gates. Precommitted match rules that fire deterministic routing before the language model gets a turn. iSM's concierge widget across multiple regulated-industry tenants uses this pattern for emergency-routing — a phrase like *"if this is a medical emergency"* triggers a deterministic UI before any LLM response is generated. The audit identifies surfaces where life-safety, compliance, or financial-limit logic is currently delegated to the LLM and should be moved upstream of it.
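A sketch of the same gate shape; the patterns listed are illustrative, not iSM's production rules. The point is that the match list is precommitted, versioned code, and the model is never consulted when a pattern fires:

```typescript
// Precommitted patterns; reviewed and versioned like any other deterministic code.
const EMERGENCY_PATTERNS: RegExp[] = [
  /medical emergency/i,
  /call 911/i,
  /chest pain/i,
];

type Route =
  | { kind: "deterministic"; action: "emergency-routing-ui" }
  | { kind: "llm"; message: string };

// The gate runs before any model call: a match short-circuits to deterministic UI.
function routeMessage(message: string): Route {
  if (EMERGENCY_PATTERNS.some((p) => p.test(message))) {
    return { kind: "deterministic", action: "emergency-routing-ui" };
  }
  return { kind: "llm", message }; // only non-matching messages ever reach the model
}
```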

Structural rule evaluators. Versioned deterministic checks against output structure — required fields present, formatting boundaries respected, hierarchy correct. iSM's content pipeline uses this pattern: every AI-generated draft passes through a quality-check Lambda that evaluates atomic-answer presence, FAQ count, SEO field length, word count, and H2 hierarchy before the draft can advance to deployment. The audit identifies surfaces where output structure is currently checked by humans and should be machine-checked.
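A hedged sketch of what such an evaluator can look like; the draft fields, rule names, and thresholds are illustrative, not the ones iSM's pipeline uses:

```typescript
// Hypothetical draft shape; a real pipeline's fields will differ.
interface Draft {
  markdown: string;
  seoTitle: string;
  faqs: { q: string; a: string }[];
}

interface RuleResult {
  rule: string;
  pass: boolean;
  detail: string;
}

// Versioned so the report can say which ruleset a draft was evaluated against.
const RULESET_VERSION = "2026-05-01";

function evaluateDraft(draft: Draft): { version: string; results: RuleResult[]; pass: boolean } {
  const words = draft.markdown.split(/\s+/).filter(Boolean).length;
  const h2Count = (draft.markdown.match(/^## /gm) ?? []).length;
  const results: RuleResult[] = [
    { rule: "word-count", pass: words >= 800, detail: `${words} words` },
    { rule: "h2-hierarchy", pass: h2Count >= 3, detail: `${h2Count} H2 headings` },
    { rule: "faq-count", pass: draft.faqs.length >= 4, detail: `${draft.faqs.length} FAQs` },
    { rule: "seo-title-length", pass: draft.seoTitle.length <= 60, detail: `${draft.seoTitle.length} chars` },
  ];
  return { version: RULESET_VERSION, results, pass: results.every((r) => r.pass) };
}
```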

Threshold scorers. Numeric quality scores with per-tenant gates. Drafts above the threshold advance; drafts below are held for review. The threshold is a per-tenant knob, not a hardcoded global. iSM's content pipeline uses an aeoThreshold field in the per-tenant PIPELINE_CONFIG record. The audit identifies surfaces where a numeric score could be computed and gated, replacing subjective human review at scale.
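A sketch of the gate itself; only the `aeoThreshold` field name comes from the pipeline described above, and the rest of the config shape is illustrative:

```typescript
// Per-tenant config record; only aeoThreshold is named in the text, the rest is illustrative.
interface PipelineConfig {
  tenantId: string;
  aeoThreshold: number; // a per-tenant knob, not a hardcoded global
}

type Verdict = { advance: true } | { advance: false; reason: string };

function gateDraft(score: number, config: PipelineConfig): Verdict {
  if (score >= config.aeoThreshold) {
    return { advance: true }; // above threshold: advance toward deployment
  }
  return {
    advance: false, // below threshold: held for human review
    reason: `score ${score} below tenant ${config.tenantId} threshold ${config.aeoThreshold}`,
  };
}

// The same score can pass for one tenant and be held for another.
gateDraft(82, { tenantId: "clinic-a", aeoThreshold: 75 }); // { advance: true }
gateDraft(82, { tenantId: "firm-b", aeoThreshold: 90 });   // { advance: false, ... }
```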

Bounded LLM critics. A second model in critic role, but only when grounded against an immutable ground-truth source — a schema, a fact database, a reference set. The trap is the hallucination echo chamber: if the generator and the critic share weights and training data, they often agree on plausible-sounding nonsense. The audit identifies surfaces where a critic-only-with-ground-truth pattern fits and where it would just be a co-conspirator.
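A sketch of the bounding itself, with the critic call abstracted away; the verdict shape is hypothetical. What matters is that the verdict is discarded unless every citation resolves to a fact in the supplied ground truth:

```typescript
// Ground truth is immutable and supplied by the caller; the critic never invents it.
interface GroundTruth {
  readonly facts: ReadonlyArray<{ id: string; statement: string }>;
}

interface CriticVerdict {
  supported: boolean;
  citedFactIds: string[]; // every claim must cite a fact id from the ground truth
}

// The critic call is a placeholder for a second-model invocation. Its verdict is
// rejected unless every cited fact id actually exists in the ground-truth set.
async function boundedCritique(
  draft: string,
  truth: GroundTruth,
  critic: (draft: string, truth: GroundTruth) => Promise<CriticVerdict>,
): Promise<boolean> {
  const verdict = await critic(draft, truth);
  const knownIds = new Set(truth.facts.map((f) => f.id));
  const grounded =
    verdict.citedFactIds.length > 0 && verdict.citedFactIds.every((id) => knownIds.has(id));
  return verdict.supported && grounded; // a verdict with no verifiable citations is a failure
}
```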

Methodology

The audit follows a fixed protocol so the deliverable is consistent across engagements.

Step 1 — Surface inventory. Walk the buyer's deployed AI surfaces. Customer-facing surfaces (chat, search, recommendation), internal surfaces (drafting tools, summarization, data extraction), and machine-to-machine surfaces (tool calls, automated decisions, system-of-record writes). Catalogue every surface where AI output reaches a customer, a database, an external API, or a system of record.

Step 2 — Classification. Each surface gets classified deterministic-required or bounded-probabilistic. The test is not technical preference; it is liability exposure. Medical records, drug dosing, contract clauses, regulated lookups, financial-limit decisions, lead-form schema submissions — these are deterministic-required. Persona-driven chat tone, topic ideation, marketing copy variants — these are bounded-probabilistic. The same surface can have both a deterministic component and a probabilistic component; the audit will name the split.

Step 3 — Gate mapping. For each deterministic surface, identify which of the five validator primitives fits and what its scope is. Some surfaces need one primitive (a schema gate alone). Some need a stack (schema gate + structural rule evaluator + threshold scorer). The audit names the stack.
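A sketch of the per-surface record that the classification and gate-mapping steps produce; the field names and example rows are illustrative, not the audit's actual matrix format:

```typescript
// The five primitives named in the audit.
type Primitive =
  | "schema-gate"
  | "pattern-gate"
  | "structural-rule-evaluator"
  | "threshold-scorer"
  | "bounded-llm-critic";

// One row of the per-surface classification matrix (field names are illustrative).
interface SurfaceRow {
  surface: string;
  classification: "deterministic-required" | "bounded-probabilistic";
  gateStack: Primitive[]; // empty for bounded-probabilistic surfaces
  liability: "high" | "medium" | "low";
}

const inventory: SurfaceRow[] = [
  {
    surface: "lead-form schema submission",
    classification: "deterministic-required",
    gateStack: ["schema-gate", "structural-rule-evaluator"],
    liability: "high",
  },
  {
    surface: "marketing copy variants",
    classification: "bounded-probabilistic",
    gateStack: [],
    liability: "low",
  },
];
```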

Step 4 — Liability ordering. Surfaces are ordered by liability exposure. The deterministic-required surface with no gate AND high liability is at the top. The deterministic surface with a partial gate AND lower liability is below. The probabilistic surface with no gate is below that, with a note explaining why a gate isn't necessary.
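Continuing the `SurfaceRow` sketch above, the ordering itself is a deterministic sort: ungated deterministic-required surfaces first, then by liability:

```typescript
const liabilityRank = { high: 0, medium: 1, low: 2 } as const;

// Remediation order: ungated deterministic-required surfaces with the highest liability first.
function remediationOrder(rows: SurfaceRow[]): SurfaceRow[] {
  return [...rows].sort((a, b) => {
    const aUngated =
      a.classification === "deterministic-required" && a.gateStack.length === 0 ? 0 : 1;
    const bUngated =
      b.classification === "deterministic-required" && b.gateStack.length === 0 ? 0 : 1;
    if (aUngated !== bUngated) return aUngated - bUngated;
    return liabilityRank[a.liability] - liabilityRank[b.liability];
  });
}
```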

Step 5 — Written deliverable + live briefing. The gap report is written within 3 business days of the audit session. The live briefing walks through it page by page. Buyer takes the report. iSM does not leave the room with it unread.

Who this is for

  • Mid-market and enterprise organizations in regulated industries. Healthcare, legal services, financial services, insurance, manufacturing, professional services with compliance exposure. iSM does not work with SMB; the audit is priced and scoped for organizations with at least one full-time technical leader responsible for AI deployments.
  • Buyers with shipped AI in production OR within 90 days of shipping. The audit produces concrete remediation, not strategy. If AI is more than 90 days from production, you are not yet ready for a Validator Gap Audit; we recommend the AEO Authority Audit or a generative AI infrastructure consultation instead.
  • Decision-makers in CIO, CTO, Chief Risk Officer, VP of Engineering, or Head of AI roles. The audit's output is engineering-grade. Buyers without technical authority will struggle to translate the deliverable into action.
If your AI deployment touches customer data, regulated workflows, or systems of record — and your organization will face an audit, a regulatory inquiry, or a compliance review at any point in the next 24 months — the Validator Gap Audit is the cheapest insurance policy you can buy on that exposure.

What this is NOT

  • Not a model-selection consultation. We don't tell you which Claude version to use or how to write a system prompt. The audit assumes the model is upstream and validators are downstream.
  • Not a transformation roadmap. No 90-day plan, no 12-month strategy, no executive workshop deck. Engineering audit, written deliverable, live briefing. That's it.
  • Not a compliance attestation. The gap report is not a SOC 2 report, not a HIPAA assessment, not a regulatory filing. It is an engineering's-eye-view diagnostic that informs what compliance work needs to follow.
  • Not a vendor pitch. iSM does not sell the audit as a thinly-disguised funnel into our own products. The deliverable stands on its own — most clients implement the recommendations with their internal team. The 60-day audit-credit-toward-build is real but optional.

Next step

If you have AI in production and want to know exactly where the validator gaps are, book the Validator Gap Audit. We do one per week. Audit fee is $2,500 paid upfront, gap report delivered within 5 business days, and the live briefing scheduled at the buyer's convenience within 10 business days of payment.

For organizations that want context before committing, the substrate-A post on the determinism gap lays out the full architectural taxonomy and walks through real iSM-deployed examples. Read that first; book the audit when you can name three surfaces in your own stack that you're not sure how to classify.
