Kale — Prospeo Flow Effectiveness

01 The three axes

We judge the Prospeo flow on three things: what a usable lead costs, how many records survive each stage (throughput — the funnel), and how long a batch takes plus what bounds it (speed). Numbers below are measured on the Riipen Alberta full drain (45,166 sourced, confidence-1.0) and the Terraboost 5-city drain.

Cost / usable lead

$0.015–0.02

per sendable-today lead (Microsoft parked), varies by client. ~$0.005–0.008 per send-ready before the MX gate.

Throughput (per-stage)

41% → 7–10%

of sourced get an email (full waterfall); 7–10% survive to sendable. The funnel is where volume is won — most loss is email-find + the MX gate.

Speed (bound)

Firecrawl 50

crawl is the bottleneck, capped at Firecrawl's 50-concurrency. Does NOT scale with more workers — they contend for the same pool.

The #1 lever is the Microsoft send-gate — not the source or the tools.

A live DNS re-resolve of the Riipen valid set (14,946 records, 2026-06-21) found 76% on Microsoft, infrastructure we can’t send to today. The rest split 17% Google, 5% other custom, 2% gateway, so only ~21.5% is sendable. Every cost figure has two versions: send-ready (what the flow produces) and sendable-today (what we can actually mail). For Riipen the gap is ~4–5× (1 ÷ 0.215 ≈ 4.7). Unlocking Microsoft sending is worth more than any sourcing optimization.
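The gap between the two cost versions follows from the sendable share alone. A minimal sketch of that arithmetic, using the Riipen figures quoted above; the function name is illustrative and not part of the flow:

```python
def sendable_multiplier(sendable_share: float) -> float:
    """Send-ready leads needed to yield one sendable-today lead."""
    if not 0 < sendable_share <= 1:
        raise ValueError("sendable_share must be in (0, 1]")
    return 1 / sendable_share

# Riipen: ~21.5% of the verified valid set is sendable today.
riipen_multiplier = sendable_multiplier(0.215)

# ~$0.005 per send-ready lead scales to roughly $0.02 per sendable-today lead.
riipen_cost_sendable = 0.005 * riipen_multiplier
print(f"{riipen_multiplier:.2f}x -> ${riipen_cost_sendable:.4f} per sendable-today lead")
```

The same function applied to a client with a friendlier MX mix gives a smaller multiplier, which is why the sendable-today cost varies by client.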

02 The flow

Prospeo is contact-first: it finds named people by ICP cheaply, but returns the email masked. The email is recovered by a cost-ordered waterfall — cheapest tool first, paid fallback last.

Legend: source · gate / filter · drop point · enrich / transform

Prospeo search · prospeo_search · $0.0005/contact · email masked

→

AI qualify · prospeo_qualify (gpt-4o-mini) · gate on company desc · drop off-ICP

→

Email find (sync) · QuickEnrich → crawl · QE flat · crawl Firecrawl-bound

→

MX send-gate · mx_capture (free) · gate: park non-sendable (Microsoft), held, not dropped

→

Recency · contact_recency_filter 90d (free) · bucket never / aged · suppress recent

→

Verify · Reoon (bulk) · paid ~$0.0004 · runs AFTER the free gates · drop invalid

→

Category → copy · string_transform · free

→

Route & push · recency × finder × email-type · Bison (work) / Instantly (personal)

Order is cost-driven: the free gates (MX send-gate, recency) run before paid Reoon verification, so we never spend a verify credit on an email we’d only park (non-sendable MX) or suppress (recently contacted). The MX send-gate is a gate, not a drop — non-sendable (Microsoft) leads are parked for when sending infra exists, not discarded.
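The cost-driven ordering can be sketched as follows. This is an illustration only: the `Lead` fields and the `apply_free_gates` helper are hypothetical stand-ins, not the actual mx_capture or contact_recency_filter implementations.

```python
from dataclasses import dataclass

VERIFY_COST = 0.0004  # ~$ per email verified, the Reoon figure quoted in this report

@dataclass
class Lead:
    email: str
    mx_sendable: bool         # hypothetical flag: outcome of the MX send-gate
    recently_contacted: bool  # hypothetical flag: outcome of the 90-day recency filter

def apply_free_gates(leads):
    """Split leads into parked, suppressed, and to-verify without spending credits."""
    parked = [l for l in leads if not l.mx_sendable]  # held for later, not dropped
    passed_mx = [l for l in leads if l.mx_sendable]
    suppressed = [l for l in passed_mx if l.recently_contacted]
    to_verify = [l for l in passed_mx if not l.recently_contacted]
    return parked, suppressed, to_verify

leads = [
    Lead("a@example.com", mx_sendable=False, recently_contacted=False),
    Lead("b@example.com", mx_sendable=True, recently_contacted=True),
    Lead("c@example.com", mx_sendable=True, recently_contacted=False),
]
parked, suppressed, to_verify = apply_free_gates(leads)

verify_spend = len(to_verify) * VERIFY_COST    # free gates first
verify_spend_naive = len(leads) * VERIFY_COST  # verify before gating
print(len(parked), len(suppressed), len(to_verify), verify_spend, verify_spend_naive)
```

Only the leads that clear both free gates consume a verify credit; parked leads stay in their own list so they can re-enter once sending infrastructure exists.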

Why a waterfall, not one finder

No single tool finds every email. QuickEnrich (flat-fee, LinkedIn→email) catches ~25% of sourced; website crawl mops up the QE-misses and is where most of the incremental yield comes from. AI Ark is no longer in the synchronous waterfall — its bulk-export poll adds a ~300s barrier per sub-batch (in the live BC drain it returned 6 emails vs crawl’s ~960 in the same window), so it’s too slow to sit on the send-path. It moves to an async backfill instead (see §05). Bigger lever still: a prospeo_unmask step (reveal the verified masked email Prospeo already holds) would catch ~34% of QE-misses before crawl even runs.
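A minimal sketch of the cheapest-first waterfall described above. The finder functions are stubs that read pre-filled fields; they stand in for QuickEnrich and website crawl and do not reflect either tool's real interface.

```python
from typing import Callable, Optional

Finder = Callable[[dict], Optional[str]]

def run_waterfall(contact: dict, finders: list[tuple[str, Finder]]):
    """Try finders in cost order and stop at the first one that returns an email."""
    for name, finder in finders:
        email = finder(contact)
        if email:
            return email, name  # record which finder hit, for routing later
    return None, None

# Stub finders (hypothetical): real ones would call the external tools.
def quickenrich_stub(contact: dict) -> Optional[str]:
    return contact.get("qe_email")

def crawl_stub(contact: dict) -> Optional[str]:
    return contact.get("site_email")

# Order encodes cost: flat-fee finder first, metered crawl second.
FINDERS = [("quickenrich", quickenrich_stub), ("crawl", crawl_stub)]

qe_hit = run_waterfall({"qe_email": "jane@example.com", "site_email": "info@example.com"}, FINDERS)
crawl_hit = run_waterfall({"site_email": "info@example.com"}, FINDERS)
miss = run_waterfall({}, FINDERS)
print(qe_hit, crawl_hit, miss)
```

Returning the finder name alongside the email matters downstream, because routing keys on which finder produced the address. A prospeo_unmask step would slot into `FINDERS` between the two existing entries.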

03 Cost (measured)

What a usable lead costs, by tool and by client. The full per-stage funnel that drives these numbers lives in §04 Throughput — here we focus on dollars: the unit cost of each tool, and what a send-ready vs a sendable-today lead works out to for each client.

Unit economics

Tool	Role	Unit cost	Marginal
Prospeo	Source contacts (email masked)	$0.01 / credit · 25 contacts/cr	$0.0004 / contact
QuickEnrich	Primary email finder	$400/mo flat · ~9.9M cr headroom	~$0
Website crawl	Primary mop-up finder	Firecrawl ~$0.004/page	~$0 static / cheap render
AI Ark (bulk re-source)	Async backfill — targeted LinkedIn-URL lookup	$0.005/email returned	~$0.005–0.015/net recovery
Reoon	Verification	$960 / 2.5M credits (depletes)	~$0.0004 / email

AI Ark is one mode, not two. You hand it a list of specific contacts (by LinkedIn URL) and it bulk-resolves them, the same as if it ran in-waterfall, just batched. It bills for every email it returns ($0.005 each) and nothing on a miss, so cost per net-recovered email tracks its hit-rate on the contacts you submit (from ~$0.005 at 100% up to ~$0.015 at ~33%). We’ve dropped it from the synchronous send-path (too slow because of the ~300s poll barrier) and run it as a background backfill on QE+crawl misses (see §05).
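The ~$0.005 to ~$0.015 range is consistent with dividing the per-email price by a net-recovery rate. The sketch below assumes that relationship; the report states the two endpoints but not the formula, so treat `cost_per_net_recovery` as an inferred model.

```python
PRICE_PER_RETURNED_EMAIL = 0.005  # $ per email AI Ark returns, from the table above

def cost_per_net_recovery(net_recovery_rate: float) -> float:
    """Effective $ per net-recovered email (assumed model: price / rate)."""
    if not 0 < net_recovery_rate <= 1:
        raise ValueError("net_recovery_rate must be in (0, 1]")
    return PRICE_PER_RETURNED_EMAIL / net_recovery_rate

best_case = cost_per_net_recovery(1.0)     # every paid email is a net recovery
worst_case = cost_per_net_recovery(1 / 3)  # roughly one in three is
print(f"${best_case:.3f} to ${worst_case:.3f}")
```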

Cost per lead, by client

Metric	Riipen	Terraboost
Cost / send-ready lead	~$0.005	~$0.008
Cost / sendable-today lead	~$0.020	~$0.015
Primary cost driver	76% Microsoft parked	strict qualify (48%)

Why the numbers differ

The flat QuickEnrich fee means Prospeo sourcing + crawl renders are the main variable costs (Reoon adds ~$0.0004/email; AI Ark only fires on the async backfill). Riipen loses its spend on the back end: it sources cheaply and qualifies almost everything, then the MX gate removes ~78% of the verified set (76% parked on Microsoft plus ~2.5% dropped as gateway / none), so cost-per-sendable is high. Terraboost loses it on the front end: about half drop at qualify before any paid finder runs, but far more survive the friendlier MX gate, so it’s cheaper per sendable despite the lower qualify rate. Same flow, opposite loss points.

Heads-up — Terraboost’s qualify gate is being loosened. The 48% qualify rate reflects a strict kiosk-ICP gate that’s dropping too many viable records. As we relax it, Terraboost’s qualify rate rises toward Riipen’s, more contacts reach the (cheap, flat) finders, and its front-end loss shrinks — pushing cost-per-sourced up slightly but cost-per-sendable down (more survivors over the same fixed QE fee). These figures will be refreshed once the looser gate has a measured drain.

04 Throughput (measured)

The funnel — standard buckets every client passes through, with how many survive each one and how much drops at each step. Same flow for both clients; qualify-rate and MX-mix move the proportions. Counts are per 10,000 contacts sourced; drop is the loss vs the previous stage.

Stage (bucket)	Riipen /10K	drop	Terraboost /10K	drop
Sourced (Prospeo search)	10,000	—	10,000	—
Qualified (ICP gate)	~9,600	−4%	~4,800	−52%
Reach-ready (denylist + dedup)	~8,400	−13%	~4,500	−6%
Email found (QE → crawl)	~4,100	−51%	~2,800	−38%
Verified send-ready (valid+risky)	~3,400	−17%	~2,000	−29%
Sendable (passes MX gate)	~730	−79%	~1,000	−50%

The two biggest drops are different per client. Riipen: email-find (−51%) and the MX gate (−79%) — back-loaded loss. Terraboost: the qualify gate (−52%, being loosened) and email-find (−38%) — front-loaded. Terraboost ends with more sendable per 10K sourced (~1,000 vs ~730) because its surviving emails clear the friendlier MX gate.
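The drop column is each stage's loss relative to the stage before it. A short sketch that recomputes it from the per-10K counts in the table; the counts are the table's approximate values, so results agree with the printed drops only to rounding.

```python
STAGES = ["Sourced", "Qualified", "Reach-ready", "Email found",
          "Verified send-ready", "Sendable"]
RIIPEN = [10_000, 9_600, 8_400, 4_100, 3_400, 730]
TERRABOOST = [10_000, 4_800, 4_500, 2_800, 2_000, 1_000]

def stage_drops(counts):
    """Percent change versus the previous stage, one value per transition."""
    return [round(100 * (cur - prev) / prev, 1)
            for prev, cur in zip(counts, counts[1:])]

riipen_drops = stage_drops(RIIPEN)
terraboost_drops = stage_drops(TERRABOOST)

for stage, r, t in zip(STAGES[1:], riipen_drops, terraboost_drops):
    print(f"{stage:<22} Riipen {r:>6}%   Terraboost {t:>6}%")
```

Sorting the transitions by drop size reproduces the observation above: Riipen's two largest losses sit at the back of the funnel, Terraboost's at the front.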

Inside the “Email found” bucket (Riipen, measured)

Email-find is one bucket made of several tools. Here’s who contributes what, and whether the email is a named contact (person’s mailbox) or a generic role address (info@, contact@) — which drives the greeting and routing. Per 10,000 sourced, of the ~4,100 found:

Contributor	Emails / 10K	% of found	Email type
QuickEnrich (LinkedIn → email)	~2,500	61%	named contact
Website crawl (incremental mop-up)	~1,600	39%	~60% generic / ~40% named
AI Ark (async backfill)	trickle	—	named contact
— Named contact (greet by name)	~3,140	~77%	QE + crawl named
— Generic / role (greet “there”)	~960	~23%	crawl-recovered

QuickEnrich does the majority of the work and always returns a named mailbox; crawl supplies the incremental ~39% but is where the generic/role addresses come from (~60% of crawl finds), which is why crawl-recovered leads greet “there” rather than by first name. Terraboost’s Prospeo path follows the same shape but leans less on crawl — its GMaps path supplies business emails directly inside kiosk cities.
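The named vs generic distinction can be made from the local part of the address. A minimal sketch, assuming a small illustrative list of role prefixes; the flow's actual classification rules are not shown in this report.

```python
from typing import Optional

# Illustrative role prefixes only; a production list would be longer.
GENERIC_LOCAL_PARTS = {"info", "contact", "hello", "sales", "support", "admin", "office"}

def is_generic(email: str) -> bool:
    """True when the mailbox looks like a role address rather than a person."""
    local_part = email.split("@", 1)[0].lower()
    return local_part in GENERIC_LOCAL_PARTS

def greeting_for(email: str, first_name: Optional[str]) -> str:
    """Greet by first name for named contacts, fall back to 'there' otherwise."""
    if is_generic(email) or not first_name:
        return "Hi there,"
    return f"Hi {first_name},"

print(greeting_for("jane.doe@example.com", "Jane"))
print(greeting_for("info@example.com", "Jane"))
```

Falling back to “there” whenever the address is generic, even if a first name is on file, avoids addressing a shared inbox by one person's name.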

The MX send-gate split (measured)

Of the verified valid set…	Riipen	Terraboost
Microsoft (parked — can’t send today)	76%	51%
Google + other custom (sendable)	21.5%	39%
Gateway / none (dropped)	~2.5%	~10%

Riipen valid set: live DNS re-resolve, n=14,946, 2026-06-21. Terraboost: 5-city drain. This single split is why Riipen’s sendable funnel collapses (−79% at the gate) while Terraboost’s only halves.
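A split like this can be derived by bucketing each domain's MX hostnames. The sketch below classifies already-resolved hostnames by suffix; the suffix lists are a small illustrative subset and the bucket names mirror the table, but none of this is taken from the actual mx_capture step.

```python
# Assumed suffix lists: common hosted-mail and gateway MX patterns, not exhaustive.
MICROSOFT_SUFFIXES = (".mail.protection.outlook.com",)
GOOGLE_SUFFIXES = (".google.com", ".googlemail.com")
GATEWAY_SUFFIXES = (".pphosted.com", ".mimecast.com")

def classify_mx(mx_hosts: list[str]) -> str:
    """Bucket a domain by its MX hostnames (DNS resolution happens elsewhere)."""
    hosts = [h.lower().rstrip(".") for h in mx_hosts]
    if not hosts:
        return "none"
    if any(h.endswith(MICROSOFT_SUFFIXES) for h in hosts):
        return "microsoft"
    if any(h.endswith(GOOGLE_SUFFIXES) for h in hosts):
        return "google"
    if any(h.endswith(GATEWAY_SUFFIXES) for h in hosts):
        return "gateway"
    return "other_custom"

SENDABLE_TODAY = {"google", "other_custom"}

bucket = classify_mx(["example-com.mail.protection.outlook.com."])
print(bucket, bucket in SENDABLE_TODAY)
```

Because MX records change, the bucket is only as fresh as the last resolve, which is why the split should be recomputed before each send cycle.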

Email-find is the stage with the most headroom

QuickEnrich finds ~25% of sourced; crawl lifts the full path to ~41%. Of the records that still miss, ~34% carry a Prospeo-revealable verified email (recoverable with a prospeo_unmask step we haven’t added yet) — the other ~66% have no findable email anywhere (verified: re-running them yields 0%). So found-rate can realistically climb from ~41% toward ~55%, but there’s a hard ceiling well under 100%. After email-find, the MX gate is the next big drop.
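The ~55% ceiling can be reproduced from the Riipen funnel if the misses are counted against the reach-ready bucket. That base is an assumption made here to match the stated figure; the report does not say which denominator the 34% applies to.

```python
SOURCED = 10_000
REACH_READY = 8_400   # per 10K sourced, Riipen funnel
EMAIL_FOUND = 4_100   # per 10K sourced, QE + crawl
UNMASK_SHARE = 0.34   # share of misses with a Prospeo-revealable email

misses = REACH_READY - EMAIL_FOUND      # records with no email after the waterfall
recoverable = UNMASK_SHARE * misses     # what a prospeo_unmask step could add
unrecoverable = misses - recoverable    # no findable email anywhere

current_rate = EMAIL_FOUND / SOURCED
projected_rate = (EMAIL_FOUND + recoverable) / SOURCED
print(f"found-rate {current_rate:.1%} -> {projected_rate:.1%}, "
      f"{unrecoverable:,.0f} per 10K stay unfindable")
```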

05 Speed (measured)

Wall-clock and what bounds it. Sourcing is fast; the email-find waterfall governs the clock, and it’s concurrency-bound, not worker-bound.

Stage	Concurrency / limit	What bounds it
Prospeo sourcing	~1.6 req/s (100 req/min)	API rate limit · 500K contacts ≈ 3.3 hrs at 25 contacts/request
QuickEnrich	conc 8 · 900/min	rarely the bottleneck
Website crawl	conc 48 · Firecrawl 50 cap	the bottleneck — one batch saturates Firecrawl
AI Ark backfill	300s poll / sub-batch	~300s async barrier — why it’s off the send-path
Reoon verify	bulk	not a bottleneck
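The ~3.3-hour sourcing figure works out if each rate-limited request returns 25 contacts, matching the 25 contacts/credit in the unit-economics table. That per-request page size is an assumption used to reconcile the numbers, not something the table states.

```python
REQUESTS_PER_MINUTE = 100   # Prospeo rate limit from the table above
CONTACTS_PER_REQUEST = 25   # assumption: one request yields one credit's worth

def sourcing_hours(contacts: int) -> float:
    """Wall-clock hours to source `contacts` at the API rate limit."""
    contacts_per_minute = REQUESTS_PER_MINUTE * CONTACTS_PER_REQUEST
    return contacts / contacts_per_minute / 60

hours_500k = sourcing_hours(500_000)
print(f"{hours_500k:.1f} hrs")
```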

More workers don’t make crawl faster

Website crawl already runs at concurrency 48 within a single batch — one batch nearly maxes Firecrawl’s 50-concurrency ceiling. Running many parallel Phase-B workers oversubscribes the same 50-slot pool (the limiter throttles start-rate, not in-flight concurrency), causing 429 backoff — which is exactly why some chunks in the live drain ran multi-hour. To go faster: (1) raise the Firecrawl concurrency plan (50→200 ≈ 4×), or (2) cut crawl volume by lifting upstream find-rate (e.g. prospeo_unmask). Not more workers. The lever for cost is the Microsoft gate; the lever for speed is the Firecrawl ceiling.
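A toy model of why extra workers do not help: every worker draws from one shared ceiling, so peak in-flight work is the same with one worker or eight. Here an `asyncio.Semaphore` stands in for Firecrawl's 50-slot limit and a short sleep stands in for a crawl. In the real system the excess requests hit 429s rather than queueing politely, which is worse.

```python
import asyncio

FIRECRAWL_CAP = 50  # provider-side concurrency ceiling, from the report

async def crawl(pool: asyncio.Semaphore, stats: dict) -> None:
    async with pool:                        # wait for one of the 50 slots
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)           # stand-in for the real crawl call
        stats["in_flight"] -= 1

async def run(workers: int, urls_per_worker: int) -> int:
    """Return the peak number of simultaneous crawls across all workers."""
    pool = asyncio.Semaphore(FIRECRAWL_CAP)  # one pool shared by every worker
    stats = {"in_flight": 0, "peak": 0}

    async def worker() -> None:
        await asyncio.gather(*(crawl(pool, stats) for _ in range(urls_per_worker)))

    await asyncio.gather(*(worker() for _ in range(workers)))
    return stats["peak"]

peak_one_worker = asyncio.run(run(workers=1, urls_per_worker=200))
peak_eight_workers = asyncio.run(run(workers=8, urls_per_worker=200))
print(peak_one_worker, peak_eight_workers)
```

Both runs peak at the cap. The eight-worker run simply has more tasks waiting, so the only ways to finish sooner are a higher cap or fewer URLs to crawl.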

Recommended: split the send-path into fast-sync + async backfill

The two slowest tools (crawl, Firecrawl-bound; AI Ark, ~300s poll) are exactly the two that don’t need to block a drain. Proposed shape:

Synchronous send-path (fast): Prospeo → QuickEnrich → (prospeo_unmask) → verify → MX gate → route. QE is flat-fee and ~900/min; unmask is instant. A drain finishes in near-sourcing time and the fast-found leads (the majority — QE alone is ~61% of all finds, all named contacts) ship the same day.

Async backfill (supplemental): the QE/unmask misses queue to website crawl and AI Ark running in the background at their own pace. As emails come back, they re-enter at verify → MX gate → route and top up the campaigns over the following hours/days.

My take: do it — it directly removes the Firecrawl ceiling and the AI Ark poll from the critical path, so drains stop stalling and we stop discovering deadlocks the hard way. Two prerequisites: (1) the backfill must run on a restart-survivable worker, not the current in-process fire-and-forget drain (a deploy kills that mid-run); and (2) it must dedup against already-shipped contacts so backfilled emails supplement rather than re-add. Net effect: same total yield, far better wall-clock, and crawl/AI Ark cost becomes a trickle decoupled from send velocity.
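The second prerequisite, dedup against already-shipped contacts, can be sketched as below. The in-memory set is a simplification: a restart-survivable worker would keep the shipped keys in a persistent store, and the real key might be a contact ID rather than the email.

```python
def normalize(email: str) -> str:
    """Case- and whitespace-insensitive key for comparing addresses."""
    return email.strip().lower()

def new_backfill_only(backfilled: list[str], shipped: list[str]) -> list[str]:
    """Keep backfilled emails that were not shipped and are not repeated."""
    seen = {normalize(e) for e in shipped}
    fresh = []
    for email in backfilled:
        key = normalize(email)
        if key not in seen:
            seen.add(key)  # also guards against duplicates within the backfill
            fresh.append(email)
    return fresh

shipped = ["jane@example.com", "raj@example.org"]
backfilled = ["Jane@Example.com ", "info@example.net", "info@example.net"]
to_push = new_backfill_only(backfilled, shipped)
print(to_push)
```

Only the surviving addresses re-enter at verify, so a backfilled email can top up a campaign but never re-add a contact who is already in one.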

06 Projection: client buildout, monthly $ @ 30K sends/wk

A typical client runs ~30K sends/week. On a 2-step sequence that’s ~15K new mailable leads/week entering at steady state (~15K × 52 ÷ 12 ≈ 65K mailable leads/month). Because the funnels differ, the sourcing volume and cost to feed that target differ by client. Every dollar figure in this table is a monthly cost.

Per client — monthly @ 30K sends/wk	Riipen	Terraboost
Sendable / sourced (parked)	~7.3%	~10%
Contacts to source / month	~890K	~650K
Prospeo credits / month	~35,600	~26,000
Prospeo $ / month (@ $0.01/cr)	~$356	~$260
Total $ / month — Microsoft parked	~$1,400–2,100	~$1,100–1,700
Total $ / month — Microsoft unlocked	~$600–1,000	~$700–1,100

Variable lines (both clients): Prospeo sourcing + crawl renders are the main metered costs; QuickEnrich ($400/mo flat) doesn’t scale with volume; Reoon adds ~$0.0004/email (and depletes its 2.5M-credit pack). Microsoft-unlocked roughly halves Riipen (frees its 76% parked pile) but helps Terraboost less, since only 51% of its valid set was parked on Microsoft to begin with.
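The sourcing rows of the table follow from the send target and each client's sendable-per-sourced rate. A sketch of that chain; it covers only the Prospeo lines, since the total-cost ranges also include crawl and verification spend that are not broken out here.

```python
WEEKS_PER_MONTH = 52 / 12

def monthly_sourcing(sends_per_week: float, sequence_steps: int,
                     sendable_per_sourced: float,
                     contacts_per_credit: int = 25,
                     price_per_credit: float = 0.01) -> dict:
    """Contacts, credits, and Prospeo $ needed per month to feed a send target."""
    new_leads_per_month = sends_per_week / sequence_steps * WEEKS_PER_MONTH
    contacts = new_leads_per_month / sendable_per_sourced
    credits = contacts / contacts_per_credit
    return {
        "new_leads": new_leads_per_month,
        "contacts_to_source": contacts,
        "prospeo_credits": credits,
        "prospeo_dollars": credits * price_per_credit,
    }

riipen = monthly_sourcing(30_000, sequence_steps=2, sendable_per_sourced=0.073)
terraboost = monthly_sourcing(30_000, sequence_steps=2, sendable_per_sourced=0.10)
for name, plan in (("Riipen", riipen), ("Terraboost", terraboost)):
    print(name, {k: round(v) for k, v in plan.items()})
```

Changing `sequence_steps` is the quickest way to see how sensitive the projection is to the 2-step assumption.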

What this means

A 30K-sends/week client lands at ~$1,100–2,100/month parked, ~$600–1,100 with Microsoft unlocked. The flat QuickEnrich fee amortizes across every client on the flow — the 2nd and 3rd clients are cheaper per-lead than the 1st. At this volume the Prospeo PRO plan (16,667 cr/mo) must be upgraded; the $3,500/mo · 350K-credit add-on covers ~10 clients at once ($0.01/cr). For Terraboost specifically, the GMaps business-first path (email free with the scrape, $0.005/result) can undercut the Prospeo+waterfall path inside kiosk cities.

07 What’s measured vs assumed

Input	Status	Source / caveat
Funnel rates (qualify, find, valid, MX split)	measured	Riipen Alberta full drain (45,166), confidence 1.0 + Terraboost 5-city
Throughput (per-stage funnel)	measured	Riipen Alberta full drain
Speed bound (Firecrawl 50-conc)	measured	website_crawl conc 48; limiter is per-process, start-rate only — parallel batches oversubscribe
QuickEnrich flat $400/mo	measured	~9.9M credit headroom on key → effectively flat at our volume
Reoon ~$0.0004/email	measured	$960 per 2.5M-credit pack — small but depletes (NOT free)
Prospeo $0.01/credit	measured	Scale price: $3,500/mo ÷ 350K-credit add-on. Per-seat tiers run $49–249/mo (2K–15K cr)
AI Ark $0.005/email returned	measured	Targeted bulk lookup by LinkedIn URL; bills only emails returned, 0 on miss. Run as async backfill (off send-path) due to ~300s poll barrier
30K sends → ~15K new leads/wk	assumption	2-step sequence; adjust if sequence depth differs
Microsoft-sendable %	measured	Live DNS re-resolve 2026-06-21 (n=14,946); re-resolve again before each send cycle