
Prospeo Flow — Effectiveness

cost · throughput · speed · sync path Prospeo → QE → crawl (AI Ark async backfill) · Riipen + Terraboost · updated 2026-06-22

01 The three axes

We judge the Prospeo flow on three things: what a usable lead costs, how many records survive each stage (throughput — the funnel), and how long a batch takes plus what bounds it (speed). Numbers below are measured on the Riipen Alberta full drain (45,166 sourced, confidence-1.0) and the Terraboost 5-city drain.

Cost / usable lead: $0.015–0.02
per sendable-today lead (Microsoft parked), varies by client. ~$0.005–0.008 per send-ready lead before the MX gate.

Throughput (per-stage): 41% → 7–10%
41% of sourced get an email (full waterfall); 7–10% survive to sendable. The funnel is where volume is won: most loss is email-find plus the MX gate.

Speed (bound): Firecrawl 50
Crawl is the bottleneck, capped at Firecrawl's 50-concurrency. It does NOT scale with more workers, because they contend for the same pool.

The #1 lever is the Microsoft send-gate — not the source or the tools.

A live DNS re-resolve of the Riipen valid set (14,946 records, 2026-06-21) found 76% on Microsoft (17% Google, 5% other custom, 2% gateway) — infrastructure we can’t send to today. Only ~21.5% is sendable. Every cost figure has two versions: send-ready (what the flow produces) and sendable-today (what we can actually mail). The gap is ~4–5×. Unlocking Microsoft sending is worth more than any sourcing optimization.
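The ~4–5× gap follows directly from the MX mix. A minimal sketch of that arithmetic, using the rounded percentages quoted above; treating only Google and other-custom hosts as sendable today is the assumption here, and the names are illustrative:

```python
# MX mix of the Riipen valid set, as quoted above (percent of valid emails).
MX_MIX = {"microsoft": 76.0, "google": 17.0, "other_custom": 5.0, "gateway": 2.0}

# Assumption: only these host groups can be mailed today.
SENDABLE_HOSTS = ("google", "other_custom")


def sendable_share(mix: dict) -> float:
    """Fraction of the valid set that is sendable today."""
    total = sum(mix.values())
    return sum(mix[host] for host in SENDABLE_HOSTS) / total


def cost_multiplier(mix: dict) -> float:
    """How much more a sendable-today lead costs than a send-ready lead."""
    return 1.0 / sendable_share(mix)


share = sendable_share(MX_MIX)   # ~0.22 with the rounded inputs
gap = cost_multiplier(MX_MIX)    # lands inside the 4-5x range
```

With the unrounded 21.5% share the multiplier is slightly higher, still inside the same range.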

02 The flow

Prospeo is contact-first: it finds named people by ICP cheaply, but returns the email masked. The email is recovered by a cost-ordered waterfall — cheapest tool first, paid fallback last.

source gate / filter drop point enrich / transform
Prospeo search | prospeo_search | $0.0004/contact · email masked
AI qualify | prospeo_qualify (gpt-4o-mini) | gate on company desc · drop off-ICP
Email find (sync) | QuickEnrich → crawl | QE flat · crawl Firecrawl-bound
Verify | Reoon (bulk) | ~$0.0004/email · drop invalid
MX send-gate | mx_capture | drop non-sendable (Microsoft parked)
Recency | contact_recency_filter 90d | bucket: never / aged / recent
Category → copy | string_transform | free
Route & push | recency × finder × email-type | Bison (work) / Instantly (personal)
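The recency step sorts contacts into three buckets around a 90-day window. A minimal sketch of one way that bucketing could work; the function name, the `last_contacted` field, and the exact boundary rule are assumptions, not the actual `contact_recency_filter` logic:

```python
from datetime import date, timedelta
from typing import Optional


def recency_bucket(last_contacted: Optional[date], today: date,
                   window_days: int = 90) -> str:
    """Return 'never', 'recent', or 'aged' for one contact.

    Assumption: 'recent' means contacted within the window (inclusive),
    'aged' means contacted before it, 'never' means no prior contact.
    """
    if last_contacted is None:
        return "never"
    if today - last_contacted <= timedelta(days=window_days):
        return "recent"
    return "aged"
```

For example, `recency_bucket(date(2026, 6, 1), date(2026, 6, 22))` falls inside the window, while a contact last touched in January 2026 would land in the aged bucket.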

Why a waterfall, not one finder

No single tool finds every email. QuickEnrich (flat-fee, LinkedIn→email) catches ~25% of sourced; website crawl mops up the QE-misses and is where most of the incremental yield comes from. AI Ark is no longer in the synchronous waterfall — its bulk-export poll adds a ~300s barrier per sub-batch (in the live BC drain it returned 6 emails vs crawl’s ~960 in the same window), so it’s too slow to sit on the send-path. It moves to an async backfill instead (see §05). Bigger lever still: a prospeo_unmask step (reveal the verified masked email Prospeo already holds) would catch ~34% of QE-misses before crawl even runs.
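The waterfall idea can be sketched as an ordered list of finders tried cheapest-first, stopping at the first hit. This is an illustration only: the stub finders and field names below are hypothetical stand-ins, not the real QuickEnrich or crawl integrations.

```python
from typing import Callable, Optional, Sequence, Tuple

Finder = Callable[[dict], Optional[str]]


def find_email(contact: dict,
               finders: Sequence[Tuple[str, Finder]]) -> Tuple[Optional[str], Optional[str]]:
    """Try each finder in cost order; return (email, finder_name) on the first hit."""
    for name, finder in finders:
        email = finder(contact)
        if email:
            return email, name
    return None, None  # a miss: candidate for the async backfill


# Hypothetical stubs that read pre-filled fields instead of calling real APIs.
def quickenrich_stub(contact: dict) -> Optional[str]:
    return contact.get("qe_email")


def crawl_stub(contact: dict) -> Optional[str]:
    return contact.get("site_email")


# Cheapest first: flat-fee finder, then the metered crawl.
WATERFALL = [("quickenrich", quickenrich_stub), ("crawl", crawl_stub)]
```

Recording which finder produced the email matters downstream, since routing keys on recency × finder × email-type.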

03 Cost (measured)

What a usable lead costs, by tool and by client. The full per-stage funnel that drives these numbers lives in §04 Throughput — here we focus on dollars: the unit cost of each tool, and what a send-ready vs a sendable-today lead works out to for each client.

Unit economics

Tool | Role | Unit cost | Marginal
Prospeo | Source contacts (email masked) | $0.01 / credit · 25 contacts/cr | $0.0004 / contact
QuickEnrich | Primary email finder | $400/mo flat · ~9.9M cr headroom | ~$0
Website crawl | Primary mop-up finder | Firecrawl ~$0.004/page | ~$0 static / cheap render
AI Ark (bulk re-source) | Async backfill: targeted LinkedIn-URL lookup | $0.005/email returned | ~$0.005–0.015/net recovery
Reoon | Verification | $960 / 2.5M credits (depletes) | ~$0.0004 / email

AI Ark is one mode, not two. You hand it a list of specific contacts (by LinkedIn URL) and it bulk-resolves them: the same lookup as if it ran in-waterfall, just batched. It bills for every email it returns ($0.005 each) and nothing on a miss, so cost per net-recovered email tracks its hit-rate on the contacts you submit, from ~$0.005 at 100% up to ~$0.015 at ~33%. We’ve dropped it from the synchronous send-path (too slow, because of the ~300s poll barrier) and run it as a background backfill on QE+crawl misses (see §05).
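The $0.005–0.015 range is consistent with dividing the per-email price by a recovery rate. A small sketch of that relationship; reading the rate as the fraction of billed emails that end up as net recoveries is an assumption, and the function name is illustrative:

```python
def cost_per_net_recovery(price_per_returned: float, net_rate: float) -> float:
    """Effective cost of one net-recovered email.

    Assumption: net_rate is the share of billed (returned) emails that
    count as net recoveries, in the range (0, 1].
    """
    if not 0.0 < net_rate <= 1.0:
        raise ValueError("net_rate must be in (0, 1]")
    return price_per_returned / net_rate


best_case = cost_per_net_recovery(0.005, 1.0)       # every billed email is a recovery
one_in_three = cost_per_net_recovery(0.005, 1 / 3)  # roughly three times the unit price
```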

Cost per lead, by client

Metric | Riipen | Terraboost
Cost / send-ready lead | ~$0.005 | ~$0.008
Cost / sendable-today lead | ~$0.020 | ~$0.015
Primary cost driver | 76% Microsoft parked | strict qualify (48%)

Why the numbers differ

The flat QuickEnrich fee means Prospeo sourcing + crawl renders are the main variable costs (Reoon adds ~$0.0004/email; AI Ark only fires on the async backfill). Riipen wastes spend on the back end (sources cheaply, qualifies almost everything, then the Microsoft gate discards ~78% — so cost-per-sendable is high). Terraboost wastes it on the front end (about half drop at qualify before any paid finder runs) but keeps far more after the friendlier MX gate — so it’s cheaper per sendable despite the lower qualify rate. Same flow, opposite loss points.

Heads-up — Terraboost’s qualify gate is being loosened. The 48% qualify rate reflects a strict kiosk-ICP gate that’s dropping too many viable records. As we relax it, Terraboost’s qualify rate rises toward Riipen’s, more contacts reach the (cheap, flat) finders, and its front-end loss shrinks — pushing cost-per-sourced up slightly but cost-per-sendable down (more survivors over the same fixed QE fee). These figures will be refreshed once the looser gate has a measured drain.

04 Throughput (measured)

The funnel — standard buckets every client passes through, with how many survive each one and how much drops at each step. Same flow for both clients; qualify-rate and MX-mix move the proportions. Counts are per 10,000 contacts sourced; drop is the loss vs the previous stage.

Stage (bucket) | Riipen /10K | drop | Terraboost /10K | drop
Sourced (Prospeo search) | 10,000 | | 10,000 |
Qualified (ICP gate) | ~9,600 | −4% | ~4,800 | −52%
Reach-ready (denylist + dedup) | ~8,400 | −13% | ~4,500 | −6%
Email found (QE → crawl) | ~4,100 | −51% | ~2,800 | −38%
Verified send-ready (valid+risky) | ~3,400 | −17% | ~2,000 | −29%
Sendable (passes MX gate) | ~730 | −79% | ~1,000 | −50%

The two biggest drops are different per client. Riipen: email-find (−51%) and the MX gate (−79%) — back-loaded loss. Terraboost: the qualify gate (−52%, being loosened) and email-find (−38%) — front-loaded. Terraboost ends with more sendable per 10K sourced (~1,000 vs ~730) because its surviving emails clear the friendlier MX gate.
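Each drop figure is the loss against the previous stage, not against sourced. A short sketch that recomputes the drops from the per-10K counts in the funnel table, useful for refreshing the numbers after a new drain (the helper name is illustrative):

```python
# Per-10K stage counts taken from the funnel table.
RIIPEN = [("Sourced", 10_000), ("Qualified", 9_600), ("Reach-ready", 8_400),
          ("Email found", 4_100), ("Verified send-ready", 3_400), ("Sendable", 730)]
TERRABOOST = [("Sourced", 10_000), ("Qualified", 4_800), ("Reach-ready", 4_500),
              ("Email found", 2_800), ("Verified send-ready", 2_000), ("Sendable", 1_000)]


def stage_drops(funnel):
    """Return {stage: percent change vs the previous stage} (negative = loss)."""
    drops = {}
    for (_, prev), (name, cur) in zip(funnel, funnel[1:]):
        drops[name] = 100.0 * (cur - prev) / prev
    return drops


riipen_drops = stage_drops(RIIPEN)
terraboost_drops = stage_drops(TERRABOOST)

for stage, pct in riipen_drops.items():
    print(f"{stage:<22} {pct:6.1f}%")
```

The end-to-end survival rate is simply the last count over the first: 730 of 10,000 for Riipen and 1,000 of 10,000 for Terraboost.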

Inside the “Email found” bucket (Riipen, measured)

Email-find is one bucket made of several tools. Here’s who contributes what, and whether the email is a named contact (person’s mailbox) or a generic role address (info@, contact@) — which drives the greeting and routing. Per 10,000 sourced, of the ~4,100 found:

Contributor | Emails / 10K | % of found | Email type
QuickEnrich (LinkedIn → email) | ~2,500 | 61% | named contact
Website crawl (incremental mop-up) | ~1,600 | 39% | ~60% generic / ~40% named
AI Ark (async backfill) | trickle | | named contact
of which named contact (greet by name) | ~3,140 | ~77% | QE + crawl named
of which generic / role (greet “there”) | ~960 | ~23% | crawl-recovered

QuickEnrich does the majority of the work and always returns a named mailbox; crawl supplies the incremental ~39% but is where the generic/role addresses come from (~60% of crawl finds), which is why crawl-recovered leads greet “there” rather than by first name. Terraboost’s Prospeo path follows the same shape but leans less on crawl — its GMaps path supplies business emails directly inside kiosk cities.
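The named/generic totals follow from the contributor counts and the crawl split. A quick sketch of that arithmetic plus the greeting rule it drives; the function names are illustrative, and the AI Ark trickle is left out because it has no measured count:

```python
def email_type_split(qe_found: int, crawl_found: int, crawl_generic_share: float) -> dict:
    """Split found emails into named vs generic, assuming QE finds are always named."""
    generic = round(crawl_found * crawl_generic_share)
    named = qe_found + (crawl_found - generic)
    total = qe_found + crawl_found
    return {"named": named, "generic": generic,
            "named_pct": 100.0 * named / total,
            "generic_pct": 100.0 * generic / total}


def greeting(first_name: str, email_type: str) -> str:
    """Greet by first name for named mailboxes, 'there' for role addresses."""
    return f"Hi {first_name}," if email_type == "named" else "Hi there,"


split = email_type_split(qe_found=2_500, crawl_found=1_600, crawl_generic_share=0.60)
```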

The MX send-gate split (measured)

Of the verified valid set… | Riipen | Terraboost
Microsoft (parked, can’t send today) | 76% | 51%
Google + other custom (sendable) | 21.5% | 39%
Gateway / none (dropped) | ~2.5% | ~10%

Riipen valid set: live DNS re-resolve, n=14,946, 2026-06-21. Terraboost: 5-city drain. This single split is why Riipen’s sendable funnel collapses (−79% at the gate) while Terraboost’s only halves.
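For illustration, here is how resolved MX hostnames could be bucketed into the four groups in the split above. This is a sketch under assumptions, not the actual `mx_capture` rules: it takes hostnames that have already been resolved, and the gateway suffix list is a small illustrative sample.

```python
from typing import Iterable

# Illustrative sample of mail-gateway MX suffixes (assumption, not exhaustive).
GATEWAY_SUFFIXES = (".pphosted.com", ".mimecast.com", ".barracudanetworks.com")


def classify_mx(mx_hosts: Iterable[str]) -> str:
    """Bucket a domain by its MX hostnames: microsoft / google / gateway / other_custom / none."""
    hosts = [h.strip().lower().rstrip(".") for h in mx_hosts if h.strip()]
    if not hosts:
        return "none"
    if any(h.endswith(".mail.protection.outlook.com") for h in hosts):
        return "microsoft"
    if any(h.endswith((".google.com", ".googlemail.com")) for h in hosts):
        return "google"
    if any(h.endswith(GATEWAY_SUFFIXES) for h in hosts):
        return "gateway"
    return "other_custom"


def is_sendable_today(bucket: str) -> bool:
    """Per the split above: Google and other custom pass; Microsoft is parked."""
    return bucket in ("google", "other_custom")
```

Because the mix shifts as domains migrate providers, classification is only as fresh as the last DNS resolve.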

Email-find is the stage with the most headroom

QuickEnrich finds ~25% of sourced; crawl lifts the full path to ~41%. Of the records that still miss, ~34% carry a Prospeo-revealable verified email (recoverable with a prospeo_unmask step we haven’t added yet) — the other ~66% have no findable email anywhere (verified: re-running them yields 0%). So found-rate can realistically climb from ~41% toward ~55%, but there’s a hard ceiling well under 100%. After email-find, the MX gate is the next big drop.
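The ~55% ceiling can be reproduced from the funnel counts. A sketch of the arithmetic, assuming the miss pool is counted among reach-ready records (~8,400 per 10K sourced, of which ~4,100 are found today); that choice of base is an assumption that happens to match the quoted figure:

```python
def find_rate_ceiling(sourced: int, reach_ready: int, found: int,
                      unmask_share_of_misses: float) -> float:
    """Projected found-rate (fraction of sourced) if unmask recovers a share of misses."""
    misses = reach_ready - found
    recovered = misses * unmask_share_of_misses
    return (found + recovered) / sourced


ceiling = find_rate_ceiling(sourced=10_000, reach_ready=8_400, found=4_100,
                            unmask_share_of_misses=0.34)
print(f"projected found-rate: {ceiling:.1%}")
```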

05 Speed (measured)

Wall-clock and what bounds it. Sourcing is fast; the email-find waterfall governs the clock, and it’s concurrency-bound, not worker-bound.

Stage | Concurrency / limit | What bounds it
Prospeo sourcing | ~1.6 req/s (100/min) | API rate limit · 500K contacts ≈ 3.3 hrs at 25 contacts/request
QuickEnrich | conc 8 · 900/min | rarely the bottleneck
Website crawl | conc 48 · Firecrawl 50 cap | the bottleneck: one batch saturates Firecrawl
AI Ark backfill | 300s poll / sub-batch | ~300s async barrier, which is why it’s off the send-path
Reoon verify | bulk | not a bottleneck
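The 3.3-hour sourcing figure works out if the 100/min limit counts requests and each request returns 25 contacts (the same 25 contacts per credit used in the cost table). That reading is an assumption; a sketch of the arithmetic:

```python
def sourcing_hours(contacts: int, requests_per_min: float, contacts_per_request: int) -> float:
    """Wall-clock hours to source a contact volume under a request rate limit."""
    contacts_per_min = requests_per_min * contacts_per_request
    return contacts / contacts_per_min / 60.0


hours = sourcing_hours(contacts=500_000, requests_per_min=100, contacts_per_request=25)
print(f"{hours:.1f} hrs")
```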

More workers don’t make crawl faster

Website crawl already runs at concurrency 48 within a single batch — one batch nearly maxes Firecrawl’s 50-concurrency ceiling. Running many parallel Phase-B workers oversubscribes the same 50-slot pool (the limiter throttles start-rate, not in-flight concurrency), causing 429 backoff — which is exactly why some chunks in the live drain ran multi-hour. To go faster: (1) raise the Firecrawl concurrency plan (50→200 ≈ 4×), or (2) cut crawl volume by lifting upstream find-rate (e.g. prospeo_unmask). Not more workers. The lever for cost is the Microsoft gate; the lever for speed is the Firecrawl ceiling.
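A toy simulation makes the point concrete. Here an `asyncio.Semaphore` stands in for the provider's shared 50-slot pool (it is not a model of Firecrawl itself, and all names are illustrative); each worker has its own concurrency of 48, and we record the peak number of requests in flight:

```python
import asyncio


async def peak_in_flight(workers: int, per_worker_conc: int,
                         pool_size: int, pages_per_worker: int) -> int:
    """Return the peak number of simultaneous 'crawls' across all workers."""
    pool = asyncio.Semaphore(pool_size)  # stand-in for the shared provider pool
    in_flight = 0
    peak = 0

    async def crawl_page(worker_gate: asyncio.Semaphore) -> None:
        nonlocal in_flight, peak
        async with worker_gate:          # the worker's own concurrency limit
            async with pool:             # the shared, fixed-size pool
                in_flight += 1
                peak = max(peak, in_flight)
                await asyncio.sleep(0)   # yield, as a real request would
                in_flight -= 1

    tasks = []
    for _ in range(workers):
        gate = asyncio.Semaphore(per_worker_conc)
        tasks += [crawl_page(gate) for _ in range(pages_per_worker)]
    await asyncio.gather(*tasks)
    return peak


one_worker = asyncio.run(peak_in_flight(1, 48, 50, 200))
four_workers = asyncio.run(peak_in_flight(4, 48, 50, 200))
print(one_worker, four_workers)
```

One worker already reaches 48 in flight; four workers ask for 192 slots but the pool never admits more than 50. In the simulation the extra requests just queue; against a real API they surface as 429s and backoff.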

Recommended: split the send-path into fast-sync + async backfill

The two slowest tools (crawl, Firecrawl-bound; AI Ark, ~300s poll) are exactly the two that don’t need to block a drain. Proposed shape:

Synchronous send-path (fast): Prospeo → QuickEnrich → (prospeo_unmask) → verify → MX gate → route. QE is flat-fee and ~900/min; unmask is instant. A drain finishes in near-sourcing time and the fast-found leads (the majority — QE alone is ~61% of all finds, all named contacts) ship the same day.

Async backfill (supplemental): the QE/unmask misses queue to website crawl and AI Ark running in the background at their own pace. As emails come back, they re-enter at verify → MX gate → route and top up the campaigns over the following hours/days.

My take: do it — it directly removes the Firecrawl ceiling and the AI Ark poll from the critical path, so drains stop stalling and we stop discovering deadlocks the hard way. Two prerequisites: (1) the backfill must run on a restart-survivable worker, not the current in-process fire-and-forget drain (a deploy kills that mid-run); and (2) it must dedup against already-shipped contacts so backfilled emails supplement rather than re-add. Net effect: same total yield, far better wall-clock, and crawl/AI Ark cost becomes a trickle decoupled from send velocity.
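Prerequisite (2) can be sketched as a filter applied when backfilled emails re-enter the flow. Keying on a normalized email address is an assumption here (a contact ID or LinkedIn URL may be the better key), and the function names are hypothetical:

```python
from typing import Iterable, List


def normalize(email: str) -> str:
    """Case-fold and trim so the same mailbox always compares equal."""
    return email.strip().lower()


def backfill_supplement(backfilled: Iterable[str], shipped: Iterable[str]) -> List[str]:
    """Keep only backfilled emails that were not already shipped (or seen in this batch)."""
    seen = {normalize(e) for e in shipped}
    fresh = []
    for email in backfilled:
        key = normalize(email)
        if key not in seen:
            seen.add(key)
            fresh.append(email)
    return fresh


already_shipped = ["ana@example.com", "li@example.org"]
from_backfill = ["Ana@Example.com ", "info@example.net", "info@example.net"]
new_leads = backfill_supplement(from_backfill, already_shipped)
```

In a restart-survivable worker the shipped set would need to live in durable storage rather than in memory, so a deploy mid-run cannot cause re-adds.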

06 Projection: client buildout (monthly $ @ 30K sends/wk)

A typical client runs ~30K sends/week. On a 2-step sequence that’s ~15K new mailable leads/week entering at steady state (~65K mailable leads/month). Because the funnels differ, the sourcing volume and cost to feed that target differ by client. Every dollar figure in this table is a monthly cost.

Per client (monthly @ 30K sends/wk) | Riipen | Terraboost
Sendable / sourced (parked) | ~7.3% | ~10%
Contacts to source / month | ~890K | ~650K
Prospeo credits / month | ~35,600 | ~26,000
Prospeo $ / month (@ $0.01/cr) | ~$356 | ~$260
Total $ / month, Microsoft parked | ~$1,400–2,100 | ~$1,100–1,700
Total $ / month, Microsoft unlocked | ~$600–1,000 | ~$700–1,100

Variable lines (both clients): Prospeo sourcing + crawl renders are the main metered costs; QuickEnrich ($400/mo flat) doesn’t scale with volume; Reoon adds ~$0.0004/email (and depletes its 2.5M-credit pack). Microsoft-unlocked roughly halves Riipen (frees its 76% parked pile) but helps Terraboost less, since only 51% of its valid set is Microsoft-parked.
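The Prospeo rows of the projection can be recomputed from the send target and the sendable-per-sourced rate. A sketch of that chain, using 52/12 weeks per month (an assumption that matches the ~65K figure); it covers only the sourcing line, not the crawl and verify costs inside the totals:

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: ~4.33 weeks per month


def monthly_sourcing(sends_per_week: float, sequence_steps: int,
                     sendable_per_sourced: float,
                     contacts_per_credit: int = 25,
                     price_per_credit: float = 0.01) -> dict:
    """Work back from weekly sends to monthly contacts, credits, and Prospeo dollars."""
    leads_per_month = sends_per_week / sequence_steps * WEEKS_PER_MONTH
    contacts = leads_per_month / sendable_per_sourced
    credits = contacts / contacts_per_credit
    return {"leads": leads_per_month, "contacts": contacts,
            "credits": credits, "prospeo_usd": credits * price_per_credit}


riipen = monthly_sourcing(30_000, 2, 0.073)
terraboost = monthly_sourcing(30_000, 2, 0.10)
for name, row in (("Riipen", riipen), ("Terraboost", terraboost)):
    print(f"{name}: {row['contacts']:,.0f} contacts, "
          f"{row['credits']:,.0f} credits, ${row['prospeo_usd']:,.0f}")
```

Changing `sequence_steps` shows how sensitive the whole table is to the one assumed input: a 3-step sequence cuts the required sourcing volume by a third.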

What this means

A 30K-sends/week client lands at ~$1,100–2,100/month parked, ~$600–1,100 with Microsoft unlocked. The flat QuickEnrich fee amortizes across every client on the flow — the 2nd and 3rd clients are cheaper per-lead than the 1st. At this volume the Prospeo PRO plan (16,667 cr/mo) must be upgraded; the $3,500/mo · 350K-credit add-on covers ~10 clients at once ($0.01/cr). For Terraboost specifically, the GMaps business-first path (email free with the scrape, $0.005/result) can undercut the Prospeo+waterfall path inside kiosk cities.

07 What’s measured vs assumed

Input | Status | Source / caveat
Funnel rates (qualify, find, valid, MX split) | measured | Riipen Alberta full drain (45,166), confidence 1.0 + Terraboost 5-city
Throughput (per-stage funnel) | measured | Riipen Alberta full drain
Speed bound (Firecrawl 50-conc) | measured | website_crawl conc 48; limiter is per-process, start-rate only, so parallel batches oversubscribe
QuickEnrich flat $400/mo | measured | ~9.9M credit headroom on key → effectively flat at our volume
Reoon ~$0.0004/email | measured | $960 per 2.5M-credit pack; small but depletes (NOT free)
Prospeo $0.01/credit | measured | Scale price: $3,500/mo ÷ 350K-credit add-on. Per-seat tiers run $49–249/mo (2K–15K cr)
AI Ark $0.005/email returned | measured | Targeted bulk lookup by LinkedIn URL; bills only emails returned, 0 on miss. Run as async backfill (off send-path) due to ~300s poll barrier
30K sends → ~15K new leads/wk | assumption | 2-step sequence; adjust if sequence depth differs
Microsoft-sendable % | measured | Live DNS re-resolve 2026-06-21 (n=14,946); re-resolve again before each send cycle