Abstract
Bot Analytics is a premium Apex module that detects, classifies, and scores automated crawler traffic across iSM-network sites. It surfaces AI engine visibility scores, crawl budget distribution, and weekly email digests in a 19-component dashboard backed by real-time DynamoDB aggregation.
Problem
Standard web analytics tools treat bot traffic as noise to be filtered out. But for sites competing for visibility in AI answer engines, the bots matter as much as the humans: which AI engines are crawling, how often, which pages they index, and whether robots.txt is blocking the ones you want.
Standard analytics offer no signal on AI crawler behavior, no visibility score, and no way to correlate crawl patterns with content freshness or schema coverage.
Approach
Ingestion
Next.js middleware on each site intercepts every request and pattern-matches against 53 bot definitions across 6 categories (AI, Search, SEO, Monitoring, Social, Other). Matched hits are posted via `NextFetchEvent.waitUntil` to a webhook on the Apex portal, which resolves the sending domain to a tenant and performs an atomic `ADD` into a daily DynamoDB record.
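The matching step can be sketched as a simple lookup over bot definitions. The definitions below are illustrative examples, not the module's actual 53-entry list:

```typescript
// A minimal sketch of the middleware's user-agent matching step.
// These five definitions are illustrative; the real list has 53 entries.
type BotCategory = "AI" | "Search" | "SEO" | "Monitoring" | "Social" | "Other";

interface BotDefinition {
  name: string;
  category: BotCategory;
  pattern: RegExp; // tested against the request's User-Agent header
}

const BOT_DEFINITIONS: BotDefinition[] = [
  { name: "GPTBot", category: "AI", pattern: /GPTBot/i },
  { name: "ClaudeBot", category: "AI", pattern: /ClaudeBot/i },
  { name: "PerplexityBot", category: "AI", pattern: /PerplexityBot/i },
  { name: "Googlebot", category: "Search", pattern: /Googlebot/i },
  { name: "AhrefsBot", category: "SEO", pattern: /AhrefsBot/i },
];

// Returns the first matching definition, or null for ordinary traffic.
function matchBot(userAgent: string): BotDefinition | null {
  return BOT_DEFINITIONS.find((b) => b.pattern.test(userAgent)) ?? null;
}
```

In the middleware itself, a non-null match would trigger `event.waitUntil(fetch(webhookUrl, ...))` so the hit is posted without delaying the response to the visitor.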
Each record tracks total hits, per-bot counts, per-type counts, per-page counts, status code distribution, bot-page matrix, and hourly hit map.
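One way to express the atomic increment is a single DynamoDB `UpdateExpression` using the `ADD` action. Since `ADD` only works on top-level number attributes, the sketch below flattens the per-bot, per-page, and other counters into prefixed attribute names; the key schema and attribute names are assumptions, not the production schema:

```typescript
// Hypothetical sketch: build parameters for one atomic ADD into a daily
// BOTHIT record. Key shape and attribute names are illustrative.
interface BotHit {
  tenantId: string;
  botName: string;
  botType: string;
  page: string;
  statusCode: number;
  hour: number; // 0-23, feeds the hourly hit map
}

function buildAddUpdate(hit: BotHit, date: string) {
  return {
    Key: { pk: `TENANT#${hit.tenantId}`, sk: `BOTHIT#${date}` },
    // DynamoDB's ADD action increments number attributes atomically,
    // creating them if absent, so no read-before-write is needed.
    UpdateExpression:
      "ADD totalHits :one, #bot :one, #type :one, #page :one, #code :one, #hour :one",
    ExpressionAttributeNames: {
      "#bot": `bot#${hit.botName}`,
      "#type": `type#${hit.botType}`,
      "#page": `page#${hit.page}`,
      "#code": `status#${hit.statusCode}`,
      "#hour": `hour#${hit.hour}`,
    },
    ExpressionAttributeValues: { ":one": 1 },
  };
}
```

These parameters would be passed to an `UpdateCommand` from `@aws-sdk/lib-dynamodb`; because every counter is a single `ADD`, concurrent webhook deliveries never clobber each other.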
Scoring
The realtime analytics engine queries BOTHIT records for the selected period, computes deltas against the previous period, and assembles the full `BotAnalyticsData` response. An AI Visibility Score (0–100) weights unique AI engines detected (40 points), page coverage (35 points), and crawl frequency (25 points).
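The 40/35/25 weighting can be sketched as three capped sub-scores. The normalization caps below (8 engines, full page coverage, 50 AI hits/day) are illustrative assumptions, not the module's actual constants:

```typescript
// Sketch of the 0-100 AI Visibility Score. The caps used to normalize
// each component (8 engines, 50 hits/day) are assumed for illustration.
interface ScoreInputs {
  uniqueAiEngines: number; // distinct AI bots seen in the period
  pagesCrawled: number;    // pages hit by at least one AI bot
  totalPages: number;      // pages on the site
  aiHitsPerDay: number;    // average AI crawl frequency
}

function aiVisibilityScore(s: ScoreInputs): number {
  // Unique AI engines detected: up to 40 points.
  const engineScore = Math.min(s.uniqueAiEngines / 8, 1) * 40;
  // Page coverage: up to 35 points.
  const coverageScore =
    (s.totalPages > 0 ? Math.min(s.pagesCrawled / s.totalPages, 1) : 0) * 35;
  // Crawl frequency: up to 25 points.
  const frequencyScore = Math.min(s.aiHitsPerDay / 50, 1) * 25;
  return Math.round(engineScore + coverageScore + frequencyScore);
}
```

Capping each component keeps a single outlier (say, one engine crawling thousands of times a day) from saturating the overall score.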
Dashboard
The dashboard renders across four tabs: Overview, AI Intelligence, Crawl Analysis, and All Bots. An Executive Summary component writes a plain-English paragraph from the aggregated data using a typewriter animation.
An Alert Banner fires on anomalies: Googlebot traffic drops, newly seen bots, traffic spikes, and periods with no AI crawlers detected.
Status
- Live end-to-end: middleware deployed on all active iSM-network sites, BOTHIT records flowing into DynamoDB.
- AI Visibility Score, citation likelihood, crawl heatmap, and bot-page matrix are all functional.
- A weekly email digest (Monday 9am CT) summarizes the prior week's AI visibility and top crawlers.
- robots.txt analysis fetches and parses each tenant's robots.txt live and flags blocked AI bots.
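The blocked-bot check reduces to parsing user-agent groups and flagging AI bots whose group carries `Disallow: /`. A simplified sketch (exact-name matching only; wildcard groups and partial path rules are omitted relative to the live parser):

```typescript
// Simplified robots.txt check: which AI bots are fully disallowed?
// The AI bot list is illustrative; wildcard ("*") groups are ignored here.
const AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"];

function blockedAiBots(robotsTxt: string): string[] {
  const blocked = new Set<string>();
  let agents: string[] = [];      // user-agents of the current group
  let inGroupBody = false;        // true once we've seen a rule line
  for (const raw of robotsTxt.split(/\r?\n/)) {
    const line = raw.split("#")[0].trim(); // strip comments
    if (!line) continue;
    const idx = line.indexOf(":");
    if (idx < 0) continue;
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group; a rule line ends it.
      if (inGroupBody) { agents = []; inGroupBody = false; }
      agents.push(value);
    } else if (field === "disallow") {
      inGroupBody = true;
      if (value === "/") {
        for (const bot of AI_BOTS) {
          if (agents.some((a) => a.toLowerCase() === bot.toLowerCase())) {
            blocked.add(bot);
          }
        }
      }
    } else {
      inGroupBody = true;
    }
  }
  return AI_BOTS.filter((b) => blocked.has(b));
}
```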