web_audit()
Read the company's site like a human would. Pricing page, careers page, changelog, blog, status page.
Inputs and outputs
In. Canonical company record from discover().
Out. Site map, last-modified dates, page-type tags, raw text excerpts capped at 12k tokens per page.
Current version
v2.1.0 Forkable through the prompt library.
Model defaults
Headless fetch through our crawler pool. No LLM at this stage. The classifier that tags page type is a small fine-tune we host ourselves. You can override per stage in your routing config.
How it fails
Cloudflare bot pages, JavaScript-heavy SPAs that do not server-render, and paywalls. We retry once with a real browser and then mark the lead web_audit_partial. Partial leads can still pass downstream stages but they get a lower confidence score.
Evals
Page-type classification accuracy: 93.4% across 18 page types. Pricing-page detection specifically: 98.2%. The full eval set ships with the prompt and is editable. Eval format.