The Sandbox is free. No card required. It runs against a seeded dataset — real company structures, anonymized — so you see the pipeline work without uploading your own list.
We watch what people do in there. Not in a surveillance-product way. In a "we need to understand where the interface fails" way. Seven founder sessions from Q1 2026 taught us more about the product than any internal review.
Who these seven are
They are our first founder cohort, the ones who signed up for the Sandbox in Q1, and all of them found us through the Manifesto page rather than paid acquisition. This matters. They came in with a specific expectation — a research-first workflow — and evaluated whether the product matched it.
Their backgrounds: two in devtools, one in fintech SaaS, one in logistics, two in healthcare-adjacent fields (not clinical ops), and one running a boutique agency for B2B clients. All were doing sales themselves. None had hired an SDR. Four had tried at least one agency in the prior 18 months.
We observed session recordings and followed up with brief async interviews. What follows is what they actually did, not what we expected them to do.
The first five minutes: everyone goes to the same place
Every one of the seven went directly to Stage 05 — Pain Signals — within the first five minutes. Not Stage 01 (Discovery). Not Stage 04 (Qualification). Pain Signals.
The reason, when we asked: they wanted to see if the signals were real. Not impressive-sounding. Real.
Three of the seven had previously seen outbound tools that promised "AI personalization" and delivered merge fields with a company description pulled from a CrunchBase summary. Their bar for what counted as a real signal was specific: it had to be something they could not have found in 30 seconds on Google, and it had to be something a prospect would recognize as coming from actual research.
The devtools founder put it plainly: "I wanted to see if the email could have been sent to a different company. If it could, it's not a real signal."
That is the two-company test, unprompted.
What they noticed about the signals
Two founders flagged the source citations immediately. Every signal in Stage 05 output includes a source_url and a signal_date. One of the healthcare-adjacent founders said she had never seen a vendor tool cite its sources. She read three of them to check that the links actually went somewhere useful.
They did. This was not obvious to us as a differentiator before these sessions. We built source citations because Principle 6 (Transparent prompts. Versioned angles.) requires them, and because removing uncheckable observations was a quality improvement, not a feature. The founders treated it as a feature. We updated the Sandbox onboarding copy to surface it earlier.
One founder found a stale signal in the seeded dataset. The observable was a job posting with a signal_date of 91 days prior; the decay flag should have fired at 90 days. It had not. He flagged it. We fixed a threshold bug in the decay logic that had been producing off-by-one errors on exact-day boundaries. Good bug to find.
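For readers who want the mechanics, here is a minimal sketch of a signal record and its decay check, assuming a simple age-in-days comparison. The source_url and signal_date fields and the 90-day threshold come from the text above; the record layout, function name, and exact comparison are illustrative, not our production decay logic.

```python
from datetime import date, timedelta

# Illustrative signal record; only source_url and signal_date are fields named above,
# the rest of the layout is an assumption.
signal = {
    "observable": "job posting: senior platform engineer, reposted",
    "source_url": "https://example.com/careers/role-123",
    "signal_date": date.today() - timedelta(days=91),
}

STALE_AFTER_DAYS = 90  # the decay threshold discussed above

def is_stale(signal_date: date, today: date | None = None) -> bool:
    """Return True when the decay flag should fire for a signal this old."""
    today = today or date.today()
    age_days = (today - signal_date).days
    # The bug the founder caught lived at this boundary; an inclusive comparison
    # keeps the flag firing at exactly 90 days.
    return age_days >= STALE_AFTER_DAYS

print(is_stale(signal["signal_date"]))  # True: 91 days old, past the 90-day threshold
```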
The Stage 06 moment
For most of the seven, Stage 06 — the Pitch Brief — was the stage where they spent the most time. This surprised us. We expected Stage 08 (Email Drafted) to be the destination. The brief turned out to be more interesting.
The brief shows the operator exactly what the model is working with before the email is written: the pain signals selected, the contact's role, the pitch angle chosen, the evidence cited for that angle, and the competing angles that were considered and rejected. It is a readable document, not a data structure.
The fintech founder described it as "the thing I would write before I started a call." She was right. The brief is a call-prep document that happens to also produce an email.
Two founders tried to edit the brief before running Stage 08; the brief is freeform-editable, so the interface allows this. One of them added a competitor mention specific to a relationship she knew about. The resulting email reflected the edit, and she approved it at Stage 09 without modification.
This is the intended flow for accounts an operator knows well. Brief editing is the operator adding context the model does not have. Stage 07 (Contact Enrichment) and Stage 08 (Email Drafted) are downstream of that context.
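To make the brief's shape concrete, here is a minimal sketch of how its parts might be rendered into the readable document described above. The fields mirror the list in this section (signals, contact role, chosen angle, evidence, rejected angles); the function name and layout are assumptions, not the production renderer.

```python
def render_brief(brief: dict) -> str:
    """Render a Stage 06 brief dict into the readable document an operator edits."""
    lines = [
        f"Pitch Brief: {brief['company']}",
        f"Contact: {brief['contact_name']} ({brief['contact_role']})",
        "",
        "Pain signals selected:",
    ]
    for s in brief["signals"]:
        lines.append(f"  - {s['observable']} (source: {s['source_url']}, {s['signal_date']})")
    lines += [
        "",
        f"Angle chosen: {brief['angle']}",
        f"Evidence: {brief['angle_evidence']}",
        "",
        "Competing angles considered:",
    ]
    for a in brief["rejected_angles"]:
        lines.append(f"  - {a['angle']} (rejected: {a['evidence_gap']})")
    return "\n".join(lines)
```

Because the rendered brief is plain text, an operator edit like the competitor mention above is just another line in the document, and Stages 07 and 08 work from whatever the edited brief says.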
What nobody did
Nobody ran Stage 01 (Discovery) first. Not one of the seven.
This was our most important finding. The mental model most operators bring to a sales tool is: find leads, qualify them, contact them. Discovery is logically first. In practice, every founder we observed wanted to validate the research quality before worrying about discovery.
They let the seeded dataset pass through Stage 01 untouched — or just started at Stage 04 or Stage 05 — because the question they were answering was: Does this tool do research the way I would?
The product answers that question most efficiently by showing Stage 05 first. We changed the default Sandbox entry point to Stage 05 three weeks after these sessions. Time-to-first-impressed-moment (measured from session start to the first navigation beyond Stage 07) dropped from 22 minutes to 11 minutes.
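A metric like this is straightforward to compute from session event logs. A minimal sketch, assuming each event carries a timestamp and the stage being viewed; the event shape and function name are illustrative, not our analytics code.

```python
from datetime import datetime

def time_to_first_impressed(events: list[dict]) -> float | None:
    """Minutes from session start to the first navigation beyond Stage 07.

    Assumes events shaped like {"ts": datetime, "stage": int}; returns None
    if the session never went past Stage 07.
    """
    if not events:
        return None
    start = min(e["ts"] for e in events)
    beyond = [e["ts"] for e in events if e["stage"] > 7]
    if not beyond:
        return None
    return (min(beyond) - start).total_seconds() / 60
```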
The human review reaction
Every founder reached Stage 09 (Human Review). Every one of them stopped there and spent time with the keyboard shortcuts.
The interface is keyboard-first: J to reject, E to edit, Enter to approve, X to skip with a tag. At full clip, an experienced operator can triage 200 drafts per hour. Nobody hits that in their first session.
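For reference, here is the shortcut-to-action mapping described above, written out. The key names come from this section; the structure is illustrative only, the real handling lives in the product UI.

```python
# Stage 09 review shortcuts as described above (illustrative mapping).
REVIEW_KEYMAP = {
    "J": "reject",         # drop the draft
    "E": "edit",           # open the draft for inline edits
    "Enter": "approve",    # queue the draft to send
    "X": "skip_with_tag",  # set aside with a reason tag
}

# At 200 drafts per hour, an experienced operator averages 3600 / 200 = 18 seconds per draft.
```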
Three founders asked the same question in the follow-up interview: "Is this actually faster than just writing the email myself?"
It is a fair question for a founder doing 20 sends per week. For 20 sends, the pipeline is not primarily a time-save. It is a research amplifier. The time you save is not in the drafting; it is in the Tuesday you do not lose reading 10-Ks, job boards, and competitor pricing pages.
The devtools founder who had been doing all his own research understood this immediately. He estimated he was spending 6–8 hours per week on lead research. He said: "If this takes that down to 90 minutes, I get a day back."
We did not claim a number. He arrived at his own number after seeing Stage 02 and Stage 05 work on the seeded data. That is the right way for that claim to be made.
The questions nobody asked
Nobody asked about AI model selection. Nobody asked about integrations. Nobody asked about team permissions or RBAC.
Every question was about the quality bar. How does the system decide what counts as a signal? What happens if the signal is wrong? Who writes the email — the model or a human? Can I change the prompt?
These are the questions the Manifesto is written to answer. The founders who came in through the Manifesto page had read it. They were testing whether the product matched it. Mostly, they concluded it did.
One of them, near the end of his session, typed a brief note into the Sandbox feedback field. We have it on record. He wrote: "This is the first tool I've seen that does the thing it says it does."
We printed that. It is on the wall in Brooklyn.
What changed after these sessions
Three product changes came directly from the seven-session analysis.
The Sandbox entry point moved to Stage 05. Time-to-first-impressed-moment halved.
The signal decay threshold bug was fixed; it had left a small number of stale signals unflagged.
The Pitch Brief interface added an explicit "competing angles considered" section, surfacing the two angles the model evaluated and did not choose. Founders wanted to understand the selection reasoning, not just the selection. Showing the rejected angles with their evidence gap noted addressed this without adding a configuration step.
Receipts
Session data: 7 founder Sandbox sessions, Q1 2026. Observational study, small sample.
- First stage visited by every founder in the cohort (7 of 7 sessions): Stage 05 (Pain Signals)
- Founders who ran Stage 01 (Discovery) first: 0
- Average time from session start to Stage 06 (Pitch Brief): 7.3 minutes
- Average total session length: 28 minutes
- Founders who reached Stage 09 (Human Review): 7 of 7
- Sandbox-to-Core conversion rate for this cohort: 4 of 7 (57%) within 30 days
- Time-to-first-impressed-moment before Sandbox entry-point change: 22 minutes
- Time-to-first-impressed-moment after change: 11 minutes
Closing
Principle 1 — Research is the product — is a claim. The Sandbox is where that claim is tested. Our first founder cohort came in skeptical. They went to Stage 05 first because they wanted to see if the research was real.
The ones who concluded it was real converted at 57%. The ones who did not had found a legitimate gap — in one case, a signal that was stale because of a bug we subsequently fixed. That feedback loop is why we run the Sandbox the way we do.
We owe you a tool that does what it says. The Sandbox is the most honest version of that offer.
Related:
- Why We Rejected ~4-5K Drafts Last Quarter
- Reply Rates by Angle: 14 Months of Versioned Prompt Data
- Getting Started with the Sandbox — Docs
— Rosa Marin, GTM Operator
Principle 1 — Research is the product.