Get the best of A/B Testing in email marketing

Microsoft Corporation

Nov, 2014


This whitepaper briefly explains the concept of A/B Testing, describes the flexibility and configuration options offered by Microsoft Dynamics Marketing, and highlights best practices to adopt so that you get good results from A/B Testing your email marketing.

Applies To

Microsoft Dynamics Marketing 2015 Update and higher


A/B Testing is a simple way of carrying out hypothesis testing on marketing communications. With Microsoft Dynamics Marketing 2015 Update, we have introduced A/B Testing for email marketing messages.

Using this feature, a marketer creates two slightly different versions of the same promotional email and sets the ‘winning criteria’ (for example, number of opens or number of unique clicks). Those emails are sent to a test group. The test group can be a part of your original target population (say, 10% of it). Half of the test group receives ‘version A’ and the other half receives ‘version B’.

After enough time has elapsed (say, 3 days), the ‘winner version’ is decided by comparing the responses received by the two versions of the promotional email against the initially established winning criteria.

The ‘winner version’ is sent as the final email to the remaining target population. You can interrupt or override this automated process at any time. For instance, you can override the automated selection of the ‘winner version’, or cancel an ongoing A/B Test. If the ‘winner version’ has not yet been sent out, you can also change its send time.

Term definitions

Term Definition


Versions

The two email versions being tested as part of the A/B Testing activity are referred to as ‘versions’ or alternatively, ‘version A’ and ‘version B’.

Test group

The subset of email recipients that receives the email versions as part of the testing phase is called the ‘test group’.

Winner declaration time

The end of the testing phase, when the responses received by the two email versions are compared to decide a winner version, is called the ‘winner declaration time’.

Winning criteria

The criterion (or criteria) that are used to decide the winner version.

Winner version

The email version that turns out to be better based on the response of the test group, at the end of the testing phase.


Microsoft Dynamics Marketing does not check whether your A/B test results are statistically significant; it simply declares a winner based on the conditions you define for the test. (Statistical significance is a measure of how likely it is that an experimental result reflects a real effect rather than just a chance outcome. In general, a consistent, wide difference observed in a large sample is more likely to be statistically significant than a narrow difference found in a small, widely varying sample.)

A flexible platform for A/B Testing over emails

Microsoft Dynamics Marketing 2015 Update allows you to be very flexible about A/B Testing emails. Here are variations that can be tested across the email versions:

  • Which subject lines draw more responses

  • What content or content layout is more appreciated

  • Which ‘from name’ in the email is considered more credible

  • What the optimal email send time is for the target population

One or more of the variations listed above can be applied to the two email versions. Microsoft Dynamics Marketing 2015 Update lets you pick one or more of the following email responses (combined as a weighted average) as the ‘winning criteria’:

  • Number of opens

  • Number of total clicks (subject to the links present in the email body)

  • Number of unique clicks (subject to the links present in the email body)

  • Number of hard bounces

  • Number of soft bounces

  • Number of forwards

  • Number of unsubscribes

Positive actions (such as an open or a click) increase the associated version’s score, while negative actions (such as an unsubscribe) decrease it.
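The scoring idea above can be sketched in a few lines of Python. This is only an illustration: the weights, the response names and the functions `score` and `pick_winner` are hypothetical, and the exact formula used by the product is not described in this whitepaper. The sketch uses a weighted sum rather than a weighted average; dividing both scores by the same positive constant would not change which version scores higher.

```python
# Hypothetical weights (assumption): positive actions add to the score,
# negative actions subtract from it.
WEIGHTS = {
    "opens": 1.0,
    "unique_clicks": 2.0,
    "unsubscribes": -3.0,
    "hard_bounces": -1.0,
}


def score(responses, weights=WEIGHTS):
    """Weighted sum of response counts for one email version."""
    return sum(weight * responses.get(name, 0) for name, weight in weights.items())


def pick_winner(responses_a, responses_b, weights=WEIGHTS):
    """Compare the two scores arithmetically; a tie goes to 'A' (arbitrary choice)."""
    return "A" if score(responses_a, weights) >= score(responses_b, weights) else "B"


# Made-up response counts, for illustration only.
version_a = {"opens": 120, "unique_clicks": 30, "unsubscribes": 4, "hard_bounces": 2}
version_b = {"opens": 110, "unique_clicks": 45, "unsubscribes": 1, "hard_bounces": 2}

print(score(version_a), score(version_b), pick_winner(version_a, version_b))
```

With these made-up numbers, version A has more opens, but version B wins because of its extra unique clicks and fewer unsubscribes.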

Microsoft Dynamics Marketing 2015 Update also lets you specify the ‘test group’ in a couple of ways:

  • Specify a percentage split of the actual targeted email recipients

  • Specify a separate marketing list altogether, as the ‘test group’
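To make the percentage split concrete, here is a minimal Python sketch of how a target population could be divided into a test group (half for each version) and a remainder that later receives the winner version. The function `split_test_group`, the random shuffle and the example addresses are assumptions made for illustration; the whitepaper does not state how the product selects test group members.

```python
import random


def split_test_group(recipients, test_fraction=0.10, seed=None):
    """Return (group_a, group_b, remainder) from a list of recipients.

    The random shuffle is an assumption of this sketch; it keeps either
    half of the test group from being biased by the list order.
    """
    rng = random.Random(seed)
    shuffled = list(recipients)
    rng.shuffle(shuffled)

    test_size = round(len(shuffled) * test_fraction)
    half = test_size // 2
    group_a = shuffled[:half]           # receives version A
    group_b = shuffled[half:test_size]  # receives version B
    remainder = shuffled[test_size:]    # receives the winner version later
    return group_a, group_b, remainder


# Hypothetical target population of 1,000 contacts.
recipients = [f"contact{i}@example.com" for i in range(1000)]
group_a, group_b, remainder = split_test_group(recipients, test_fraction=0.10, seed=42)
print(len(group_a), len(group_b), len(remainder))
```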

Word of caution while playing with the flexible A/B Testing feature

While the true spirit of A/B Testing mandates that the amount of variation between the two versions should be minimal, this feature lets you configure several differences between the two versions. However, if you do so, when the results show up, they don’t impart any learning as to which variations led to better results.

Best practices for A/B Testing over marketing emails

One variation at a time

If you vary several parameters in the same A/B pair, you may still find out which of the two email versions performs better. But you will not be able to tell which of those parameters was actually responsible for the difference in results, or to what extent.

For instance, the marketer creates different layouts for version A and version B, and specifies different subject lines for the two versions. After the A/B Test is over, version A turns out to be the better version. However, the marketer cannot confidently conclude whether the difference in response was caused by the layout or the subject line.

A/B Testing as a practice should be treated as a learning exercise, where the marketer gathers different tactics of increasing email effectiveness across different segments over time.

Bigger test groups

A/B Testing is a statistical concept. In Microsoft Dynamics Marketing 2015 Update, the results obtained by the two versions are simply compared arithmetically. For a more statistically significant result (with a higher confidence level), statistical principles call for a bigger sample size, that is, a bigger test group.
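As a rough guide to how big "bigger" needs to be, the following Python sketch applies a common textbook approximation for comparing two proportions. It is not part of Microsoft Dynamics Marketing. The function name `sample_size_per_version` and the defaults (a two-sided 95% confidence level and about 80% power) are assumptions chosen for illustration.

```python
import math


def sample_size_per_version(rate_a, rate_b, z_alpha=1.96, z_beta=0.84):
    """Approximate number of recipients needed per version to tell
    rate_a from rate_b.

    z_alpha=1.96 corresponds to a two-sided 95% confidence level and
    z_beta=0.84 to roughly 80% power; both are conventional defaults.
    """
    if rate_a == rate_b:
        raise ValueError("the two rates must differ")
    variance = rate_a * (1 - rate_a) + rate_b * (1 - rate_b)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (rate_a - rate_b) ** 2)


# Expected open rates are guesses made before the test.
wide_gap = sample_size_per_version(0.20, 0.25)
narrow_gap = sample_size_per_version(0.20, 0.21)
print(wide_gap, narrow_gap)
```

Under these assumptions, telling a 20% open rate from a 25% open rate needs on the order of a thousand recipients per version, while telling 20% from 21% needs tens of thousands.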

Bigger gap between the results

In Microsoft Dynamics Marketing 2015 Update, the results obtained by the two versions are simply compared arithmetically. For a given test group size, a larger difference between the result numbers gives a more statistically significant result (with a higher confidence level).

For instance, in one case the observed open rates are 20% (version A) and 21% (version B). In another case, the open rates are 2% (version A) and 21% (version B). In both cases, version B turns out to be better. But for test groups of the same size, the result in the second case is more statistically significant (more reliable).
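You can check a result like this yourself, outside the product, with a standard two-proportion z-test. The Python sketch below is an illustration only: the function `two_proportion_z_test` is hypothetical, and the test group size of 500 recipients per version is an assumption, because the example above gives only rates.

```python
import math


def two_proportion_z_test(opens_a, sent_a, opens_b, sent_b):
    """Return (z, two-sided p-value) for the difference in open rates."""
    rate_a = opens_a / sent_a
    rate_b = opens_b / sent_b
    pooled = (opens_a + opens_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Assumed: 500 recipients per version.
# Case 1: open rates of 20% versus 21%.
z1, p1 = two_proportion_z_test(100, 500, 105, 500)
# Case 2: open rates of 2% versus 21%.
z2, p2 = two_proportion_z_test(10, 500, 105, 500)
print(f"case 1: z={z1:.2f}, p={p1:.3f}")
print(f"case 2: z={z2:.2f}, p={p2:.3g}")
```

A small p-value (conventionally below 0.05) suggests the difference is unlikely to be due to chance. With 500 recipients per version, only the second case meets that threshold.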

Sufficient testing time

Typically, email recipients should be allowed 48-72 hours to respond to the emails. This figure is indicative and can vary across industries and segments.

When you are testing send time across the two versions of the email, allow sufficient time after the later version has been sent. The version sent earlier has a head start in collecting responses, but that effect can be reduced by allowing sufficient testing time. For example, a marketer who wants to test send time specifies Monday morning (for version A) and Friday evening (for version B) as send times. The test phase must continue for 4-5 days after the later version (Friday evening) is sent out. This gives the recipients of that version a fair chance to react to the email, generating more credible A/B Test results.

© 2015 Microsoft. All rights reserved.