
Pipeline · Stage 05

The pack picks
the signals.

Pain signals are not a single global list. Every vertical pack ships its own taxonomy — built by an operator who has actually sold in that industry and knows which observations correlate with a reply. The pipeline can only fire pitch angles from signals the pack defines. The model can't compliment a lead on a strength they don't have.

Sample taxonomies — three packs, three vocabularies

Different industry, different signals.
Same pipeline.

The pack is the source of truth for what counts as a pain signal. An operator who has shipped outbound in the industry decides which observations actually correlate with a reply. Below, signals from three live packs.

Devtools pack tuned by ex-Datadog SRE

kubernetes_sprawl observability_gap on_call_burnout missing_oss_repo no_changelog_visible stale_runbooks + 18 more

Clinical Ops pack tuned by clinic admin

legacy_emr_lock manual_intake_forms no_patient_portal no_telehealth staffing_burnout prior_auth_backlog + 14 more

Fintech pack tuned by ex-Stripe lead

manual_kyc single_payment_gateway no_audit_trail pci_self_assessment reconciliation_manual no_chargeback_workflow + 16 more

A signal in one pack can be meaningless in another. no_chargeback_workflow is a sales hook for fintech and noise everywhere else. no_dealer_locator matters for industrial manufacturers and is irrelevant for SaaS. The pack draws the line. You can fork a pack, rename signals, add your own — renames propagate through every pitch angle that referenced them.

In / Out

What goes in, what comes out.

Inputs

Audit + Taxonomy

lead.webAudit — Perplexity structured extraction (channels, traffic, ecommerce maturity, content recency)
lead.htmlSnapshot — homepage + key page captures
lead.socialResearch — follower counts, post recency
taxonomy.signals[] — controlled vocabulary, per-vertical extensions
taxonomy.derivationRules[] — declarative rules (e.g. no_dealer_locator if HTML lacks "locator")

Outputs

PainSignal[]

painSignal.name — must match taxonomy entry
painSignal.confidence — 0–1, with derivation source
painSignal.evidence — quoted text or URL anchor
painSignal.derivedFrom — rule | llm | manual
painSignal.frequencyCap — max signals retained per lead
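
The output fields listed above map onto a small record type. The sketch below is one hypothetical way to model a single signal in Python; the class name, the snake_case field names, and the validation choices are assumptions for illustration, not the product's actual schema. It leaves out frequencyCap and treats that, as a further assumption, as a per-lead limit rather than a per-signal value.

```python
from dataclasses import dataclass
from typing import Literal

# Provenance values taken from the painSignal.derivedFrom description.
DerivedFrom = Literal["rule", "llm", "manual"]


@dataclass(frozen=True)
class PainSignal:
    name: str              # must match a taxonomy entry (checked elsewhere)
    confidence: float      # 0..1
    evidence: str          # quoted text or URL anchor
    derived_from: DerivedFrom

    def __post_init__(self) -> None:
        # Reject records that break the documented field constraints.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")
        if not self.evidence.strip():
            raise ValueError("evidence is required")
        if self.derived_from not in ("rule", "llm", "manual"):
            raise ValueError(f"unknown provenance: {self.derived_from}")
```

Validating at construction time means a malformed signal fails loudly where it is created instead of surfacing later in a pitch draft.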

How it works

Rules first.
LLM second.
Operator last.

Signal extraction is a three-pass operation. The first pass runs declarative rules over the structured audit and HTML snapshot. no_dealer_locator fires if the HTML snapshot has no anchor or section labelled "locator", "find a dealer", or an equivalent, and the audit does not report one either. slow_load_>3s fires from the Lighthouse-style timing in the audit. These rules are deterministic, free, and explain themselves.
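
A minimal sketch of what a rule pass like this could look like. Everything here is hypothetical: the Lead and Rule types, the load_seconds field, the regex, and the fixed confidence of 1.0 for rule hits are assumptions made for the example, and the locator check only inspects the HTML snapshot.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Lead:
    html_snapshot: str
    load_seconds: Optional[float] = None  # hypothetical timing value from the audit


@dataclass
class Rule:
    signal: str
    # Returns an evidence string when the rule fires, otherwise None.
    check: Callable[[Lead], Optional[str]]


LOCATOR = re.compile(r"locator|find a dealer", re.IGNORECASE)


def no_dealer_locator(lead: Lead) -> Optional[str]:
    if LOCATOR.search(lead.html_snapshot):
        return None
    return "no 'locator' or 'find a dealer' text in HTML snapshot"


def slow_load(lead: Lead) -> Optional[str]:
    if lead.load_seconds is not None and lead.load_seconds > 3.0:
        return f"audit load time {lead.load_seconds:.1f}s exceeds 3s"
    return None


RULES = [Rule("no_dealer_locator", no_dealer_locator), Rule("slow_load_>3s", slow_load)]


def run_rules(lead: Lead, rules: List[Rule] = RULES) -> List[dict]:
    fired = []
    for rule in rules:
        evidence = rule.check(lead)
        if evidence is not None:
            # Assumption: deterministic rule hits get confidence 1.0.
            fired.append({"name": rule.signal, "confidence": 1.0,
                          "evidence": evidence, "derivedFrom": "rule"})
    return fired
```

Because each rule returns its own evidence string, every fired signal carries its explanation with it, which is what makes the rule pass self-explaining.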

The second pass is LLM-driven extraction (Perplexity Sonar with structured output) over the same inputs. It can name signals the rule pass missed — most signals about strategy and positioning fall into this bucket. The model is bound to the taxonomy: it can only emit signal names that already exist. New signal proposals go into a separate candidates table for human curation.
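
The taxonomy binding can also be enforced after the model responds. The following is a sketch under assumptions, not the product's code: bind_to_taxonomy is a hypothetical helper, and reusing the evidence string as the candidate's rationale is an illustrative choice.

```python
from typing import Dict, Iterable, List, Tuple


def bind_to_taxonomy(
    raw_signals: Iterable[Dict], taxonomy: Iterable[str]
) -> Tuple[List[Dict], List[Dict]]:
    """Split model output into accepted signals and candidates for curation."""
    allowed = set(taxonomy)
    signals: List[Dict] = []
    candidates: List[Dict] = []
    for item in raw_signals:
        if item["name"] in allowed:
            signals.append(item)
        else:
            # Unknown names never reach the pitch stage; they wait for a human.
            candidates.append({
                "proposedName": item["name"],
                "rationale": item.get("evidence", ""),
            })
    return signals, candidates
```

For example, with a taxonomy of manual_kyc and no_audit_trail, a model output that also names crypto_exposure would yield two accepted signals and one candidate.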

The third pass is the operator. Every signal carries provenance — rule, LLM, or manual — and an evidence quote. You can override a signal that fired wrong, add a signal the system missed, or downvote a signal that the model overuses. Overrides feed back into the prompt registry as few-shot examples on the next version bump. The taxonomy itself is editable: rename, deprecate, merge, or split signals, and every solution template that referenced the old name updates by reference.
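
One way to get "updates by reference" is to have templates store a stable signal ID instead of the signal name. The sketch below illustrates that idea only; the Taxonomy and SolutionTemplate classes, the ID format, and the method names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Taxonomy:
    # Stable signal ID -> current display name.
    names: Dict[str, str] = field(default_factory=dict)

    def add(self, signal_id: str, name: str) -> None:
        self.names[signal_id] = name

    def rename(self, signal_id: str, new_name: str) -> None:
        if signal_id not in self.names:
            raise KeyError(signal_id)
        self.names[signal_id] = new_name

    def name_of(self, signal_id: str) -> str:
        return self.names[signal_id]


@dataclass
class SolutionTemplate:
    title: str
    signal_ids: List[str]  # references by ID, never by name

    def signal_names(self, taxonomy: Taxonomy) -> List[str]:
        # Names are resolved at read time, so renames show up immediately.
        return [taxonomy.name_of(i) for i in self.signal_ids]
```

With this layout a rename touches one dictionary entry, and no template needs to be rewritten.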

The prompt

Bound to the vocabulary.

--- system ---
You extract pain signals from a B2B company's web presence.
You may ONLY emit signal names from the provided taxonomy. If you
identify a pain that does not match any taxonomy entry, write it to
candidates[] for human review — never invent a name in the output.

--- inputs ---
audit:        {{lead.webAudit | json}}
htmlSnapshot: {{lead.htmlSnapshot | truncate(8000)}}
social:       {{lead.socialResearch | json}}
taxonomy:     {{taxonomy.signals[] | json}}      # 28 entries

--- output schema ---
{
  "signals": [
    {
      "name":       string,  # MUST be in taxonomy
      "confidence": 0..1,
      "evidence":   string  # quoted text or URL anchor
    }
  ],
  "candidates": [
    { "proposedName": string, "rationale": string }
  ]
}

--- guards ---
- frequency_cap: max 6 signals per lead. Drop lowest confidence first.
- evidence required: signals without an evidence quote are dropped.
- on missing input: do NOT emit. Write field name to missing_required.


Failure modes & safeguards

What can break.
And what catches it.

Risk

Signal over-detection

Every lead gets the same eight signals, and pitches start to look templated.

Mitigation

Frequency cap per lead (default 6). Lowest-confidence signals drop first. Per-signal saturation alerts in the registry.
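
The cap itself is a short operation. A minimal sketch, assuming signals are plain dictionaries with a confidence key; the function name is hypothetical, and how ties are broken is not specified in the text above, so the earlier signal is kept here.

```python
from typing import Dict, List


def apply_frequency_cap(signals: List[Dict], cap: int = 6) -> List[Dict]:
    """Keep at most `cap` signals, dropping the lowest-confidence ones first."""
    # sorted() is stable, so equal-confidence signals keep their input order.
    ranked = sorted(signals, key=lambda s: s["confidence"], reverse=True)
    return ranked[:cap]
```

The result comes back ordered from highest to lowest confidence, which is also a convenient order for an operator reviewing the list.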

Risk

Hallucinated signal name

Model emits a plausible-sounding signal that no template references.

Mitigation

Schema-bound output. Names not in the taxonomy are removed from the output and routed to candidates[] for curation.

Risk

Evidence-free claims

A signal fires but the operator can't see why.

Mitigation

Every signal must carry an evidence quote or URL anchor. Signals without evidence are dropped server-side.
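
A server-side evidence check could be as small as the filter below. This is a sketch under assumptions: the function name is hypothetical, and treating whitespace-only evidence as missing is an illustrative choice.

```python
from typing import Dict, List


def drop_without_evidence(signals: List[Dict]) -> List[Dict]:
    """Keep only signals that carry a non-empty evidence string."""
    kept = []
    for signal in signals:
        evidence = signal.get("evidence")
        # Missing, non-string, or blank evidence all count as "no evidence".
        if isinstance(evidence, str) and evidence.strip():
            kept.append(signal)
    return kept
```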

Where it sits

02 Audit · 03 Social · 05 Signals · 07 Enrich · 08 Draft · 09 Review · 10 Sent · 11 Funnel

Extract pain signals on a real lead.
In about 12 seconds.

Drop a domain. Watch the rules and the LLM agree (or argue) in the sandbox.

