A theKrew customer cancelled his trial recently after misreading the cold email reply rate benchmarks on his own dashboard. His note said, in essence: "the open rate is low, no responses, this isn't working."
His open rate was 73.3%.
That's roughly 3x the industry average for B2B cold email. His bounce rate was zero. He'd sent 60 emails across 2 campaigns. Across those 60 sends, he had 0 replies. He read the 0, called the open rate weak, and decided he'd seen enough.
He had not. He'd seen too little. He'd also misread the number that was actually working.
What follows is the honest version of cold email reply rate benchmarks: what the open-rate and reply-rate ranges look like by source, what they actually mean once you discount vendor-inflated marketing, and the sample size you need before any cold email campaign can be called broken. The customer is anonymous here, but the pattern is common. Founders run a small trial, fixate on one number that seems wrong, and cancel before the data is large enough to be a verdict.
If you've ever stared at a cold email dashboard wondering whether the numbers were good or bad, this is the post for you.
What's a Good Cold Email Open Rate, Really?
The most-cited number on email open rates comes from Mailchimp's public Email Marketing Benchmarks report, which puts the cross-industry average around 21% for marketing email. Broken out by vertical, it ranges from about 15-18% on the low end (large-list consumer brands) to 25-30% on the high end (professional services and B2B niches with engaged lists).
That 21% number is a useful anchor, but it has two caveats most marketing content skips over:
Caveat 1: Apple Mail Privacy Protection inflates opens. Since 2021, Apple's MPP has prefetched email content for users who turn it on in the Apple Mail app on iPhone, iPad, and Mac, whichever mail provider sits behind the account. The prefetch fires the tracking pixel whether or not the user ever sees the email. The result: a meaningful portion of the "opens" reported by every email platform are machine prefetches, not human engagement. Estimates vary, but most outbound operators discount their raw open numbers by 15-25% to get to a defensible engaged-open figure.
Caveat 2: Mailchimp's benchmark is for marketing email to subscribed lists, not cold email to prospects who never asked. Cold outreach to a properly built B2B list typically lands in a 15-25% real-open range, at or below the Mailchimp average, because the recipient has no prior relationship with the sender. Better deliverability, better subject lines, and a warmed sender domain can push that to 30-45% on real opens.
So when the customer above clocked a 73.3% open rate, the realistic interpretations are:
- Heavy MPP prefetch contribution (most likely)
- A list with unusually engaged recipients (possible)
- A brand-new sender domain whose first sends landed mostly in inboxes equipped with Outlook ATP-style bot scanners, which pre-open everything (possible)
Even if you discount his 73% by half to account for inflation, his real open rate is in the 35-40% range. That is healthy, above-average performance for cold email by any source you'd care to cite.
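The discount arithmetic is simple enough to sketch. The function below is an illustration, not a standard formula: `engaged_open_rate` and `inflation_discount` are names invented here, and the discount is treated as a flat share of reported opens that were fired by machines rather than people.

```python
def engaged_open_rate(raw_open_rate: float, inflation_discount: float) -> float:
    """Estimate the human open rate after removing a share of machine opens.

    raw_open_rate: reported opens / sends, as a fraction (0.733 for 73.3%).
    inflation_discount: assumed share of reported opens that came from
        prefetches or scanners (0.20 means one in five was not a person).
    """
    if not 0.0 <= raw_open_rate <= 1.0:
        raise ValueError("raw_open_rate must be between 0 and 1")
    if not 0.0 <= inflation_discount <= 1.0:
        raise ValueError("inflation_discount must be between 0 and 1")
    return raw_open_rate * (1.0 - inflation_discount)


if __name__ == "__main__":
    raw = 0.733  # the customer's reported open rate
    # 15% and 25% are the typical operator discounts; 50% is the harsh case.
    for discount in (0.15, 0.25, 0.50):
        adjusted = engaged_open_rate(raw, discount)
        print(f"discount {discount:.0%}: engaged open rate ~{adjusted:.1%}")
```

Even the harshest of the three discounts leaves the adjusted figure inside the 35-40% band.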
He looked at that number and called it low.
What Cold Email Reply Rate Benchmarks Actually Tell You
Reply rate is the metric that matters. Opens tell you the subject line worked and the email reached the inbox. Replies tell you the body copy converted interest into action.
HubSpot's published sales statistics consistently put healthy B2B cold outreach reply rates at 1-5%, with anything above 5% being strong. Below 1% on a properly sized cold campaign means the email itself isn't doing its job, regardless of what the open rate says.
Three real ranges to memorize:
- Templated cold email to a cold list: 1-2% positive reply rate. This is the volume game most outbound operators play. Send a lot, accept a small response rate, fill the pipeline with raw quantity.
- Reasonably personalized cold email (light AI personalization, basic company research, role-aware messaging): 3-5% positive reply rate. The improvement is real because the recipient feels less like a target and more like a known prospect.
- Deeply personalized cold email (specific reference to recent company news, role-relevant pain, prior interaction signal): 6-10% on warm-adjacent prospects, lower on pure cold. This is where AI changes the math, because a human sales rep cannot sustain this depth of personalization at meaningful outbound volume.
That breakdown isn't unique to theKrew. Cognism's research on AI sales agents, Lavender's published case studies, Smartlead's reports, and Lemlist's benchmarks all converge on the same three bands. The cold email industry as a whole agrees on what healthy looks like.
So when the customer above hit 0 replies on 60 sends and cancelled, what was he actually looking at? Let me show you the math.
Why 60 Sends Is Not a Verdict (The Sample Size Problem)
At a baseline 1-3% positive reply rate for cold B2B, the expected number of replies on 60 sends is:
- Low end (1%): 0.6 expected replies
- Mid (2%): 1.2 expected replies
- High end (3%): 1.8 expected replies
- Strong campaign (5%): 3 expected replies
- Deeply personalized (8%): 4.8 expected replies
Across that whole range, the expected reply count on 60 sends runs from about 0.6 to about 5 replies, and on a typical 1-3% campaign it is under 2. Zero replies is within normal statistical variance at that sample size: at a true 1% reply rate, a 60-send batch comes back empty more often than not. You cannot reliably distinguish "broken campaign" from "average campaign that hasn't sent enough yet" until you've crossed roughly 200 sends, and ideally 500.
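To make the variance concrete, here is a minimal sketch of the math, assuming each send is an independent trial with the same chance of a reply (a simplification, since real lists are not uniform). The function names are invented for this example.

```python
def expected_replies(sends: int, reply_rate: float) -> float:
    """Mean number of replies if each send replies independently at reply_rate."""
    return sends * reply_rate


def prob_zero_replies(sends: int, reply_rate: float) -> float:
    """Chance that every one of `sends` emails goes unanswered."""
    return (1.0 - reply_rate) ** sends


if __name__ == "__main__":
    sends = 60
    for rate in (0.01, 0.02, 0.03, 0.05, 0.08):
        print(
            f"true rate {rate:.0%}: "
            f"expected {expected_replies(sends, rate):.1f} replies, "
            f"P(zero replies) = {prob_zero_replies(sends, rate):.1%}"
        )
```

Under that independence assumption, a campaign with a true 1% reply rate returns zero replies on 60 sends about 55% of the time. A 2% campaign does so about 30% of the time, and a 3% campaign about 16% of the time. Only at 5% and above does an empty 60-send batch become a less than 1-in-20 event.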
This is statistics, not magic. It applies to every cold email campaign — yours, ours, and every vendor's. If a tool promises confident reply-rate readings on 60 sends, the tool is selling you something the math doesn't support.
The customer in our story stopped at 60 sends. At that volume:
- His expected replies were 0.6 to 1.8 on a healthy campaign
- His actual replies were 0
- The difference between 0 and 1 is well within the random variance of a 60-send sample
If he'd let the campaign run to 300 sends and the reply count stayed at 0, that would be signal worth acting on. If six consecutive 50-send batches had all returned zero replies, that would also be signal. Sixty sends in 2 weeks with zero replies, by itself, is not yet signal. It's noise.
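The 300-send figure can be checked with a short calculation. This sketch asks how many sends it takes before a run of zero replies would happen less than 5% of the time on a campaign with a given true reply rate. The 5% cutoff and the function name are choices made for this illustration, and the independent-sends assumption applies again.

```python
from math import ceil, log


def sends_until_zero_is_signal(reply_rate: float, max_chance: float = 0.05) -> int:
    """Smallest n where (1 - reply_rate) ** n <= max_chance.

    In words: the send count at which zero replies would be an unlikely
    outcome for a campaign whose true reply rate is `reply_rate`.
    """
    if not 0.0 < reply_rate < 1.0:
        raise ValueError("reply_rate must be strictly between 0 and 1")
    if not 0.0 < max_chance < 1.0:
        raise ValueError("max_chance must be strictly between 0 and 1")
    return ceil(log(max_chance) / log(1.0 - reply_rate))


if __name__ == "__main__":
    for rate in (0.01, 0.02, 0.03):
        n = sends_until_zero_is_signal(rate)
        print(f"true rate {rate:.0%}: zero replies becomes unlikely after {n} sends")
```

At a true 1% reply rate the answer is 299 sends, which is why zero replies at 300 sends is worth acting on and zero replies at 60 is not.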
The Decision Framework: When Is Your Cold Email Campaign Actually Broken?
Here's the diagnostic every outbound operator should know before they open a dashboard.
Check open rate first. If you're below 10%, you have a deliverability problem. Your emails aren't reaching the inbox. Audit sender reputation, SPF / DKIM / DMARC alignment, and domain warmup before sending another email. Between 10% and 30% is the normal cold range; the inbox is working. Above 30%, you're either getting MPP inflation or you have an unusually engaged list. Either way, the inbox isn't the problem.
Then check sample size. Below 200 sends, your reply rate is mostly noise. Between 200-500, you start to see signal but it's still noisy. Above 500 sends, your reply rate is real data you can act on.
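One way to see why small samples are noisy is to put a confidence interval around the observed reply rate. The sketch below uses the Wilson score interval, a common choice for proportions near zero. The function name and the 95% level (z = 1.96) are choices made for this illustration.

```python
from __future__ import annotations

from math import sqrt


def wilson_interval(replies: int, sends: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for the true reply rate, as (low, high) fractions."""
    if sends <= 0 or not 0 <= replies <= sends:
        raise ValueError("need sends > 0 and 0 <= replies <= sends")
    p_hat = replies / sends
    denom = 1.0 + z * z / sends
    centre = (p_hat + z * z / (2 * sends)) / denom
    half_width = (
        z * sqrt(p_hat * (1.0 - p_hat) / sends + z * z / (4 * sends * sends)) / denom
    )
    # Clamp to [0, 1] so rounding error never produces an impossible rate.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    for replies, sends in ((0, 60), (4, 200), (10, 500)):
        low, high = wilson_interval(replies, sends)
        print(f"{replies}/{sends} replies: true rate plausibly {low:.1%} to {high:.1%}")
```

For 0 replies on 60 sends the interval runs from 0% to about 6%, so the data cannot separate a failing campaign from a strong one. For 10 replies on 500 sends it narrows to roughly 1.1% to 3.6%, which is tight enough to act on.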
Then check reply rate. Once you have enough volume:
- 0-1%: the email body is failing or the audience is wrong
- 1-3%: average templated cold; working but not optimized
- 3-5%: reasonably tuned, AI-personalization-aided
- 5%+: strong, well-targeted, properly personalized
- 8%+: exceptional, usually requires deep personalization at scale
If your numbers don't match the campaign type you think you're running, the gap tells you what to fix. Low opens with low replies points to deliverability. High opens with low replies points to body copy. Low opens with decent replies points to a working email with a failing subject line. For the deeper breakdown of how to actually read an A/B test on these metrics, we covered the open-rate-vs-reply-rate trap here. The same principles apply to single-campaign evaluation.
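The three checks can be written down as one small function. This is a sketch of the framework as described here, not code from any dashboard: `diagnose` is a made-up name, and where the reply-rate bands share an edge (for example exactly 1%) the value is assigned to the higher band, which is an assumption.

```python
def diagnose(sends: int, opens: int, replies: int) -> str:
    """Walk the three checks in order: open rate, sample size, reply rate."""
    if sends <= 0:
        raise ValueError("sends must be positive")
    if not (0 <= opens <= sends and 0 <= replies <= sends):
        raise ValueError("opens and replies must be between 0 and sends")

    open_rate = opens / sends
    reply_rate = replies / sends

    # Check 1: below 10% opens means the inbox is the problem.
    if open_rate < 0.10:
        return "deliverability: audit sender reputation, SPF/DKIM/DMARC, warmup"

    # Check 2: below 200 sends the reply rate is mostly noise.
    if sends < 200:
        return "keep sending: too few sends to read the reply rate"

    # Check 3: read the reply rate against the bands.
    if reply_rate < 0.01:
        verdict = "failing: body copy or audience is wrong"
    elif reply_rate < 0.03:
        verdict = "average: templated cold, working but not optimized"
    elif reply_rate < 0.05:
        verdict = "tuned: reasonably personalized"
    elif reply_rate < 0.08:
        verdict = "strong: well targeted and personalized"
    else:
        verdict = "exceptional: deep personalization at scale"

    # Between 200 and 500 sends the reading is usable but still noisy.
    if sends < 500:
        verdict += " (still noisy below 500 sends)"
    return verdict


if __name__ == "__main__":
    # 44 opens is inferred from the reported 73.3% open rate on 60 sends.
    print(diagnose(sends=60, opens=44, replies=0))
```

Run on the cancelled trial's numbers, the function stops at the second check and returns the "keep sending" result without ever reading the reply rate.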
How AI Personalization Moves the Reply Rate Math
The reason AI matters for cold email isn't volume. It's depth.
Templated cold email has been a saturated channel for a decade. Buyers ignore it because they recognize it instantly: generic opening line, predictable structure, a value prop that could apply to anyone. The 1-2% reply rate represents the small minority of recipients who happened to be in-market the day your generic message landed.
The historical alternative was a sales rep doing deep research on each prospect, pulling company news, recent funding announcements, hiring patterns, conference talks, and LinkedIn activity, then writing a 3-line opening that referenced something specific. That works. Reply rates of 6-10% on properly researched cold are normal. It just doesn't scale. A sales rep can do maybe 15-30 of these per day, which is 75-150 in a five-day week. To run 600 a week, you needed a team.
AI removes the volume constraint on deep personalization. The same research-and-write loop that used to cost a sales rep 20 minutes per email now runs in parallel across hundreds of prospects at a fraction of a cent each. The economics flip from "deep personalization is too expensive to scale" to "deep personalization is the cheapest path to high reply rates."
We documented the broader math in How Many Leads Can AI Generate Per Month. The short version: personalized AI outbound at SMB scale typically produces 1.5x to 3x the reply rate of templated outbound, on the same list, with the same total sends. The compounding effect is meaningful over a quarter.
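As a rough worked example of that multiplier, the sketch below compares reply counts over a quarter on identical send volume. The weekly volume, the 13-week quarter, and the 1.5% templated baseline are hypothetical inputs picked for illustration, not measured results.

```python
def quarterly_replies(
    weekly_sends: int,
    base_reply_rate: float,
    lift: float = 1.0,
    weeks: int = 13,
) -> float:
    """Expected replies over a quarter at a baseline rate scaled by `lift`."""
    if weekly_sends < 0 or weeks < 0:
        raise ValueError("weekly_sends and weeks must be non-negative")
    if not 0.0 <= base_reply_rate * lift <= 1.0:
        raise ValueError("lifted reply rate must stay between 0 and 1")
    return weekly_sends * weeks * base_reply_rate * lift


if __name__ == "__main__":
    weekly_sends = 100  # hypothetical SMB volume
    baseline = 0.015    # hypothetical templated reply rate (middle of 1-2%)
    for lift in (1.0, 1.5, 3.0):
        replies = quarterly_replies(weekly_sends, baseline, lift)
        print(f"lift {lift:.1f}x: ~{replies:.1f} replies per quarter")
```

On those inputs the templated baseline yields about 20 replies in a quarter, and the 1.5x to 3x range yields roughly 29 to 59 from the same 1,300 sends.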
For the customer who cancelled, here's what the data actually said. His deliverability was healthy. His sample size was too small to read reply rate. The zero replies he did see were statistically consistent with both a broken campaign and a healthy campaign that simply hadn't run long enough. He needed 200-500 more sends to know which.
He didn't have that volume of data, so he made a decision on noise. It happens. The lesson isn't that he made the wrong call. It's that the dashboard didn't make the data legible enough for him to make the right call.
B2B Cold Email Statistics 2026 — The Honest Summary
If you take one thing away from this post, take this table.
| Metric | Healthy range | Broken | Exceptional |
|---|---|---|---|
| Cold open rate (real, MPP-adjusted) | 15-30% | Below 10% | Above 40% |
| Cold reply rate (templated) | 1-2% | Below 0.5% on 500+ sends | Above 3% |
| Cold reply rate (AI-personalized) | 3-5% | Below 2% on 500+ sends | Above 7% |
| Cold reply rate (deep personalization) | 6-10% | Below 4% on 500+ sends | Above 12% |
| Sample size for reply-rate signal | 200-500+ sends | n/a | n/a |
| Bounce rate (healthy sender) | Below 2% | Above 5% | Below 0.5% |
This table cites Mailchimp's open-rate baselines, HubSpot's reply-rate ranges, Cognism's research on AI sales agents, and the consensus from the published case data of Lavender, Smartlead, and Lemlist. If a vendor publishes numbers materially outside these bands, ask for the methodology.
What This Means for Your Next Cold Email Campaign
If you're evaluating cold email, whether it's your own setup, a new vendor, or a free trial, three things to track:
- Open rate by 100 sends. Below 10%, fix deliverability before sending another email.
- Reply rate by 300 sends. Below 300 total sends, the number is mostly noise. After 300, it starts to mean something.
- The benchmark you should beat. Templated cold equals 1-2%. Reasonable AI personalization equals 3-5%. If you're at 5%+ on 300+ sends, you have a working campaign. Below 1% on 500+ sends, you have a fixable problem in body copy or targeting.
theKrew Starter at $99 a month is built for the SMB version of cold email — tighter ICP, deeper personalization, lower per-send volume, higher reply rate. It runs alongside the other six channels you'd otherwise be coordinating across separate vendors. The reply-rate range we typically see on Starter is 3-7% by month two, contingent on the customer running enough volume to clear the statistical noise threshold described above. The 15-day free trial is the cheapest way to test that assumption, but only if the trial runs long enough to clear noise. Sixty sends is not long enough.
Most importantly: stop reading 0 replies on 60 sends as a verdict. It's a coin flip on incomplete data. The right question for any cold email campaign at low volume isn't "is this broken?" It's "should I keep sending?" The answer is almost always yes, until you've crossed the threshold where the reply rate starts to mean something.
If the answer is no, the data should tell you so unambiguously. If the data isn't unambiguous, you haven't sent enough yet.
That's the cold email reply rate benchmark conversation in one paragraph.