A/B Testing Your Emails for Better Results

A/B testing sends two variations of an email to small subsets of your list, measures which variation performs better, then sends the winner to the rest of your subscribers. Consistent A/B testing across subject lines, content, and design can improve email revenue by 15% to 30% over time, because each test reveals what resonates with your specific audience.

How Email A/B Testing Works

The process is straightforward. You create two versions of an email that are identical except for one element you want to test. Your email platform randomly sends version A to a subset of your list and version B to another equally sized subset. After a waiting period, typically 2 to 4 hours, the platform measures which version performed better on your chosen metric (open rate, click rate, or revenue) and sends the winning version to the remaining subscribers automatically.

Most platforms split the test audience as 15% to 20% per variation, with 60% to 70% held back for the winner. On a 10,000-subscriber list, that means 1,500 people receive version A, 1,500 receive version B, and the remaining 7,000 receive whichever version won. This approach maximizes the benefit because the majority of your list receives the optimized version.
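The split described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any particular platform implements it; the function name split_audience, the test_fraction parameter, and the example addresses are assumptions made for the example.

```python
import random


def split_audience(subscribers, test_fraction=0.15, seed=None):
    """Randomly split a list into group A, group B, and a holdout for the winner.

    test_fraction is the share of the list that each variation receives.
    """
    rng = random.Random(seed)
    shuffled = list(subscribers)
    rng.shuffle(shuffled)  # random order so neither group is biased

    n_test = round(len(shuffled) * test_fraction)
    group_a = shuffled[:n_test]
    group_b = shuffled[n_test:2 * n_test]
    holdout = shuffled[2 * n_test:]  # receives the winning version later
    return group_a, group_b, holdout


# Hypothetical 10,000-subscriber list with 15% per variation.
subscribers = [f"user{i}@example.com" for i in range(10_000)]
group_a, group_b, holdout = split_audience(subscribers, test_fraction=0.15, seed=42)
print(len(group_a), len(group_b), len(holdout))  # 1500 1500 7000
```

Shuffling before slicing is what makes the assignment random; slicing an unshuffled list would put your oldest or newest subscribers in the same group.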

Klaviyo, Mailchimp, and Omnisend all include built-in A/B testing for campaigns. Klaviyo also supports A/B testing within automation flows, which is particularly valuable for optimizing your abandoned cart and welcome series emails that run continuously.

What to Test (In Priority Order)

1. Subject Lines

Subject lines have the highest impact because they determine whether the email gets opened at all. Test these variations: short vs. long (under 30 characters vs. 40-50 characters), direct benefit vs. curiosity gap, with vs. without personalization (first name or product name), with vs. without urgency language, and question vs. statement format. Subject line tests produce clear results quickly because open rate data accumulates within the first 2 hours of sending.

2. Send Time

Test different days of the week and times of day. Common wisdom says Tuesday through Thursday mornings perform best, but your audience may differ. A store selling to working parents might find Sunday evening performs well because that is when they have quiet browsing time. Start with widely separated send times: 8 AM vs. 12 PM, 10 AM vs. 6 PM. Once you find a winning time range, refine further in 2-hour blocks within that window. If your platform supports it, use send time optimization that picks individual optimal times per subscriber.

3. Call to Action

Test CTA button text ("Shop Now" vs. "See the Collection" vs. "Get Yours"), button color (your brand accent color vs. a contrasting color), button placement (above the fold vs. after the content), and number of CTAs (one primary vs. repeated at top and bottom). CTA tests directly impact click-through rate and revenue, making them high-leverage tests to run after subject line optimization.

4. Email Content and Layout

Test long-form emails (detailed product descriptions, multiple sections) vs. short-form (single image, brief text, one CTA). Test image-heavy vs. text-heavy layouts. Test product grid format (2 products vs. 4 vs. 6). Test whether including prices in the email body increases or decreases clicks. These tests take longer to produce significant results because they affect click and conversion behavior rather than opens.

5. Offers and Incentives

Test discount types: percentage off vs. dollar amount off (15% off vs. $10 off), free shipping vs. percentage discount, and discount amount thresholds (10% vs. 15% vs. 20%). These tests measure revenue and profit impact rather than just engagement, so let them run longer and measure total revenue plus average order value, not just conversion count.
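When comparing a percentage discount against a dollar amount, it helps to know the order value at which the two are worth the same. The sketch below works through that arithmetic for the 15% off vs. $10 off example; the function names and the sample order values are assumptions for illustration.

```python
def break_even_order_value(percent_off, amount_off):
    """Order value at which a percentage discount equals a fixed-amount discount."""
    return amount_off / (percent_off / 100)


def percent_discount_value(order_value, percent_off):
    """Dollar value of a percentage discount on a given order."""
    return order_value * percent_off / 100


break_even = break_even_order_value(percent_off=15, amount_off=10)
print(f"Break-even order value: ${break_even:.2f}")

# Hypothetical order values on either side of the break-even point.
for order_value in (40, 100):
    pct_value = percent_discount_value(order_value, 15)
    larger = "15% off" if pct_value > 10 else "$10 off"
    print(f"${order_value} order: 15% off is worth ${pct_value:.2f}; larger discount is {larger}")
```

Below the break-even order value the fixed $10 is the larger discount, and above it the 15% is larger, which is one reason to compare average order value between variations and not only conversion count.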

Running a Proper Test

Change only one variable per test. If you change both the subject line and the email layout between versions, you cannot know which change caused the difference in results. Isolate one element, test it, apply the winner, then move to the next element. This sequential approach is slower but produces reliable insights you can trust.

Ensure adequate sample size. A test with 50 recipients per variation is not statistically reliable because random chance can easily explain any difference in results. Aim for at least 200 recipients per variation for subject line tests (where open rate is the metric) and at least 500 per variation for click and conversion tests (where the measured events are less frequent). If your list is smaller than 1,000 subscribers, track directional trends across 3 to 5 similar tests rather than relying on any single test.
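To see why small samples are unreliable, you can check a result with a two-proportion z-test, a standard way to compare two rates. The following is a minimal sketch using only the Python standard library; the function name and the open counts are made up for illustration, and your email platform may use a different method to pick winners.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_error == 0:
        return 0.0, 1.0  # identical all-or-nothing results: no evidence of a difference
    z = (rate_b - rate_a) / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal distribution
    return z, p_value


# Same open rates (20% vs. 30%) at two hypothetical sample sizes.
z_small, p_small = two_proportion_z_test(10, 50, 15, 50)
z_large, p_large = two_proportion_z_test(300, 1500, 450, 1500)

print(f"50 per variation:   p = {p_small:.3f}")
print(f"1500 per variation: p = {p_large:.2e}")
```

A p-value below 0.05 is a common threshold for calling a difference significant. With 50 recipients per variation, a 20% vs. 30% open rate does not clear it, while the same rates with 1,500 recipients per variation clear it easily.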

Define your success metric before sending. For subject line tests, the metric is open rate. For content and layout tests, the metric is click-through rate. For offer tests, the metric should be revenue per email sent, not just conversion rate, because a bigger discount might convert more people while bringing in less revenue and profit per send. Deciding the metric after seeing results introduces bias.
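The difference between conversion rate and revenue per email sent is easiest to see with numbers. The sketch below compares two offer variations; the VariantResult class, the pick_winner function, and all of the figures are hypothetical and chosen only to illustrate the point.

```python
from dataclasses import dataclass


@dataclass
class VariantResult:
    name: str
    emails_sent: int
    orders: int
    revenue: float  # total order revenue after the discount is applied

    @property
    def conversion_rate(self):
        return self.orders / self.emails_sent

    @property
    def revenue_per_email(self):
        return self.revenue / self.emails_sent

    @property
    def average_order_value(self):
        return self.revenue / self.orders if self.orders else 0.0


def pick_winner(a, b):
    """Choose the variation with the higher revenue per email sent."""
    return a if a.revenue_per_email >= b.revenue_per_email else b


# Made-up results: the deeper discount converts more but earns less per send.
ten_percent = VariantResult("10% off", emails_sent=1500, orders=30, revenue=2400.0)
twenty_percent = VariantResult("20% off", emails_sent=1500, orders=36, revenue=2340.0)

for v in (ten_percent, twenty_percent):
    print(f"{v.name}: conversion {v.conversion_rate:.1%}, "
          f"revenue/email ${v.revenue_per_email:.2f}, AOV ${v.average_order_value:.2f}")

print("Winner by revenue per email:", pick_winner(ten_percent, twenty_percent).name)
```

In this made-up case the 20% offer wins on conversion rate but loses on revenue per email sent, so the choice of metric changes the winner. A profit-based comparison would also need your product margins, which are not modeled here.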

Let the test run long enough. For subject line tests, 2 to 4 hours is usually sufficient because most opens happen quickly. For click and revenue tests, wait 12 to 24 hours because some subscribers need time to click through and complete a purchase. Do not check results every 30 minutes and declare a winner prematurely. Set the winning criteria in your platform and let it decide automatically.

Testing in Automation Flows

A/B testing in automated flows like welcome series and abandoned cart sequences is often more valuable than testing campaigns because flow emails run continuously. A campaign A/B test produces a single result from one send. A flow A/B test keeps producing data, with each new subscriber or cart abandoner contributing to the test results.

In Klaviyo, you add an A/B test branch within a flow that randomly splits entering subscribers between two email variations. The test runs indefinitely, automatically sending the better-performing version to a larger percentage of recipients as data accumulates. This means your highest-volume flows are always self-optimizing.
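If you ever need to reproduce this kind of split outside your email platform, for example in your own analysis or a custom integration, one common approach is to hash each subscriber ID into a bucket. This is a generic sketch and not a description of how Klaviyo assigns variations internally; the function name, the test name, and the IDs are assumptions.

```python
import hashlib


def assign_variant(subscriber_id, test_name, weight_a=0.5):
    """Assign a subscriber to variation "A" or "B" in a stable, pseudo-random way.

    The same subscriber always gets the same variation for a given test,
    and weight_a controls the share of entrants that receive variation A.
    """
    key = f"{test_name}:{subscriber_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "A" if bucket < weight_a else "B"


# Hypothetical entrants to an abandoned cart flow.
assignments = [assign_variant(f"subscriber-{i}", "abandoned-cart-1-subject") for i in range(10_000)]
share_a = assignments.count("A") / len(assignments)
print(f"Share receiving variation A: {share_a:.1%}")
```

Raising weight_a over time is one way to send the better-performing version to a larger share of entrants as evidence accumulates.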

Focus flow testing on the elements with the highest revenue impact: abandoned cart email 1 subject line (the highest-volume flow email), welcome email 1 subject line (sets the tone for the series), and the discount amount in your win-back email 2 (directly affects recovered revenue). One optimized subject line on your abandoned cart email 1 can increase recovered revenue by 10% to 20% permanently because that email sends to every cart abandoner, every day.

Tracking and Applying Results

Keep a simple log of every test: date, what was tested, version A description, version B description, sample size per variation, winning version, margin of victory, and key takeaway. After 20 to 30 tests, patterns emerge that form your brand's email playbook. You might discover that your audience prefers short subject lines, opens more on Wednesday mornings, clicks more on lifestyle images than product shots, and converts better with free shipping than percentage discounts.
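A spreadsheet is enough for this log, but if you prefer to keep it in a file you control, the fields above map directly onto a CSV. The sketch below is one possible layout; the ABTestRecord class, the column names, and the sample entry are hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ABTestRecord:
    date: str
    element_tested: str
    version_a: str
    version_b: str
    sample_size_per_variation: int
    winner: str
    margin_of_victory: str
    takeaway: str


def records_to_csv(records):
    """Render test records as CSV text with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ABTestRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up example entry.
log = [
    ABTestRecord(
        date="2025-01-15",
        element_tested="subject line",
        version_a="Short, direct benefit",
        version_b="Longer, curiosity gap",
        sample_size_per_variation=1500,
        winner="B",
        margin_of_victory="+3.2 points open rate",
        takeaway="Curiosity subjects outperform direct benefit",
    )
]
print(records_to_csv(log))
```

Keeping the columns identical for every test is what makes patterns visible once the log reaches 20 to 30 entries.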

Apply winning insights broadly. If curiosity subject lines consistently outperform direct benefit lines in your tests, use curiosity-style subjects as your default for all campaigns and test within that style (one curiosity approach vs. another) rather than retesting the already-decided category.

Revisit past conclusions every 6 months. Audiences change, seasonal patterns shift, and what worked in winter may not work in summer. A test you ran in January is not necessarily valid in July. Cycle through your testing priorities periodically to confirm that your playbook still reflects your current audience's preferences.