One Weird Trick for Improved A/B Testing

A/B testing has become ubiquitous in the digital world. It’s often touted as a scientific and unbiased way to design and verify that a digital touchpoint is as good as it could be.

But if you’ve ever been part of any sort of A/B testing, you know that it doesn’t always work out that way. Testing often starts with great enthusiasm, only to quickly burn out when tests don’t open the conversion floodgates. Or you run out of colors to test on your buttons. Or, even worse and far more difficult to detect, testing only ends up reinforcing existing biases.

The main reason for these failures is a lack of understanding of where A/B testing sits in the cycle of continuous optimization. The simplicity of A/B testing software is a double-edged sword: it’s easy to run a test, and even easier to run a test incorrectly.

So what should you do?

Thankfully, there already exists an effective iterative optimization framework from the world of Six Sigma called DMAIC.

It stands for Define, Measure, Analyze, Improve, Control, and is a cyclical process by which opportunities for improvement can be identified and acted upon.

Rhythm has taken this as a model and tailored it to website optimization.

Define

The heart and soul of the Define phase is setting your optimization goals.

Without a goal there can be no strategic testing, only tactical one-off tests. Without a testing strategy, your optimization efforts can quickly run out of gas. You’ll likely be left wondering what else you can test, scrounging for whatever changes perception bias leads you to (or at least whatever scores you points with the HiPPO in the room).

A well-defined goal lends itself to several strategies that can be used to achieve it. Each strategy leads to multiple hypotheses, which in turn lead to multiple testing opportunities.

While the SMART acronym (Specific, Measurable, Achievable, Relevant, Timely) provides a solid guideline for setting pretty much any goal, it was originally intended for management goals. So, we’ve modified it a bit to be more specific to the purposes of web optimization.

It’s likely that in this process you’ll come up with multiple goals. That’s great! Just remember to focus on one goal at a time.

A good optimization goal is…

  1. Specific – It precisely defines what you want to change, add or remove for testing (copy, layout, navigation, etc.) and which segment you want to test (paid visitors, new visitors, all visitors, etc.)
  2. Measurable – This is your KPI, most often some sort of conversion rate. If your site has multiple conversion points, make sure that each goal only specifies one. 
  3. Achievable – Is it feasible to modify the target site elements via testing?  
  4. Relevant – Is there a correlation between what you’re changing and the KPI? For example, is there a difference in the performance of the KPI when visitors view the pages you’re testing compared to when they don’t? (See the first sketch after this list.) 
  5. Timely – Can your tests reach conclusive results in a reasonable amount of time? This is based on the current conversion rate and how much improvement you want to minimally detect (see the second sketch below). 
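A quick sanity check for the Relevant criterion is to split your analytics data on whether visitors saw the pages you plan to test, then compare the KPI for each group. Here’s a minimal sketch in Python using pandas; the sessions.csv export and its viewed_services and converted columns are hypothetical stand-ins for whatever your analytics tool actually provides.

```python
# Sanity check for "Relevant": does the KPI differ between visitors who
# saw the pages you plan to test and visitors who didn't?
# Assumes a per-session export with hypothetical boolean columns
# "viewed_services" and "converted".
import pandas as pd

sessions = pd.read_csv("sessions.csv")  # hypothetical analytics export

rates = sessions.groupby("viewed_services")["converted"].mean()
print(rates)
# A clear gap between the two rates suggests the content you want to
# change really is correlated with the KPI; near-identical rates suggest
# you may be testing elements visitors don't respond to.
```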

Once you know approximately how many visitors you’ll need to conclude your tests, you can then estimate how long the test will run based on your site’s average visit volume.
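To make the Timely criterion concrete, here’s a minimal sketch of that calculation using the standard two-proportion sample-size formula and nothing but the Python standard library. The baseline rate, minimum detectable effect, and daily traffic figures are illustrative assumptions, not recommendations.

```python
# Estimate visitors needed per variation, and test duration, for a
# two-sided test at the usual alpha = 0.05 and 80% power.
from statistics import NormalDist

def visitors_per_variation(baseline, mde, alpha=0.05, power=0.80):
    """baseline: current conversion rate (e.g. 0.04 = 4%)
    mde: minimum relative lift you want to detect (e.g. 0.20 = +20%)"""
    p1 = baseline
    p2 = baseline * (1 + mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2

n = visitors_per_variation(baseline=0.04, mde=0.20)  # ~10,300 per variation
daily_visitors = 1_500  # hypothetical average traffic to the tested page
days = 2 * n / daily_visitors  # control + one challenger
print(f"{n:,.0f} visitors per variation, roughly {days:.0f} days to run")
```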

Too long? Try targeting a sizable segment with a better conversion rate.

When first starting down the path of an optimization process, you may find it difficult to create a goal that fits all of these criteria. Fear not! Remember that this is an iterative process, so going through the Measure and Analyze phases and then returning to the Define phase is a good way of creating the SMARTest goal you desire.

For example:

Dumb Goal – Improve the website’s homepage.

SMARTer Goal – Improve the website’s homepage so that the conversion rate goes up.

SMARTest Goal – Update the homepage’s featured content so that new visitors who land there submit the contact form at a higher rate than the current baseline for that segment.

Measure

Audit what data you currently have, and if it’s been a while, make sure the data is being collected as intended. You may also find that you need data that isn’t currently being collected. This is also the phase in which baselines are calculated.

If you haven’t done so already, verify that your goal is Relevant and Timely. 

Analyze

This is the research phase.

The outcome of this phase is the hypothesis that will be tested in the next phase. Remember that a hypothesis comes from data and research, while an idea just comes from someone’s head.

That’s not to say that ideas are bad. In fact, every hypothesis begins its life as an idea. But an idea just isn’t enough on its own. So, take your ideas and validate them with the data you have to create a hypothesis. You may even find you are missing some data and need to return to the Measure phase.

Since your goal defines what you will be modifying, the hypothesis then describes how you will modify it to achieve your goal. The best hypothesis will give rise to multiple tests and have application beyond what you’re directly testing.

A good hypothesis usually follows this framework: “Since [some empirical observation] we believe [some sort of implication], therefore if we [make a change reflecting this implication] then we expect [some impact will happen].”

For example:

Not So Good Hypothesis – We think that users will click the red button more often.

Good Hypothesis – Since the users that view the services pages are more likely to convert, we believe that this is what motivates them to act, therefore if we make services content more prominent on the homepage then we expect the conversion rate to increase for new users landing there.

Improve

Here is where you finally run your test.

If you followed the guidelines for SMART goals, then you should know how many visitors you need per variation. If not, figure it out now. This is perhaps the most common detrimental mistake that people make when performing A/B testing. You can read a ton about it here.

Remember to use only one KPI per test. Each KPI you add increases the likelihood of a false positive result.
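The arithmetic behind that warning is simple: if each KPI is judged at a 5% significance level, the chance that at least one of them produces a spurious “win” grows quickly with every KPI you add. A quick illustration, assuming independent KPIs for simplicity:

```python
# Chance of at least one false positive across k KPIs, each tested at
# alpha = 0.05 (assumes independent KPIs for simplicity).
alpha = 0.05
for k in (1, 2, 3, 5):
    familywise = 1 - (1 - alpha) ** k
    print(f"{k} KPI(s): {familywise:.1%} chance of a spurious win")
# 1 -> 5.0%, 2 -> 9.8%, 3 -> 14.3%, 5 -> 22.6%
```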

Control

Regardless of the result of your hypothesis test, you’ll have gained some sort of knowledge.

An inconclusive result means that your hypothesis may be focusing on elements that don’t affect the KPI, so start looking elsewhere.

A disproved hypothesis shows that you are likely modifying the correct elements but haven’t changed them to positively impact conversions.

And of course, there’s the holy grail of testing: a result that supports your hypothesis means that you should make those changes live! But don’t stop there. Continue to monitor performance once the changes are live. Does the expected lift remain? It is possible that the test results were a false positive.
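One way to monitor a launched winner is a simple one-sample proportion test of the live conversion rate against the pre-test baseline. The sketch below uses only the Python standard library, and all of the counts are illustrative assumptions:

```python
# Post-launch check: is the live conversion rate still above the
# pre-test baseline, or has the lift evaporated?
from statistics import NormalDist

baseline_rate = 0.040                 # conversion rate before the change
conversions, visitors = 530, 11_000   # observed since launch (illustrative)

live_rate = conversions / visitors
std_err = (baseline_rate * (1 - baseline_rate) / visitors) ** 0.5
z = (live_rate - baseline_rate) / std_err
p_value = 1 - NormalDist().cdf(z)     # one-sided: is the lift still real?
print(f"live rate {live_rate:.2%}, z = {z:.2f}, p = {p_value:.4f}")
```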

From here, return to the Define phase and begin again. You may even have good enough goals and hypotheses that you can jump straight back to the Improve phase and initiate another test. Or maybe the results of your test shed new light on the data you’ve been looking at, and a new series of tests can begin.