I use A/B split tests on my products a lot. Sometimes I’ll test which headline works best, maybe design changes, or which ad copy has the best response.
At any given moment I probably have a couple dozen split tests running at once, even with low traffic.
I use a variety of tools to run the split tests. Some, like Optimizely, do all of the calculations for you, so all you have to do is wait for the results.
But other tools like Adwords or my internal tools don’t handle the calculations for me. This is common with basic traffic splitters, where 50% of the visitors go to one page and 50% go to another.
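A basic traffic splitter like that can be sketched in a few lines. This is a hypothetical example (not the internal tool mentioned above): it hashes the visitor ID so the same visitor always lands on the same page, while the overall split stays close to 50/50.

```python
import hashlib

def assign_variant(visitor_id: str, variants=("A", "B")) -> str:
    # Hash the visitor ID (md5 is stable across runs, unlike Python's
    # built-in hash()) so a returning visitor always sees the same page.
    digest = hashlib.md5(visitor_id.encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Using a deterministic hash instead of a random coin flip matters: if a visitor gets a different page on each visit, their conversions can't be attributed to either variation.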
Fast feedback is critical
Knowing how a split test is doing is important. Declaring a winner too early can lead you down a false path and over time it could end up reducing your conversion rate. On the other hand, running a split test too long means you lose out on the opportunity to run a second split test (and gain the exponential effect from constantly testing).
To strike a balance with split testing, a confidence calculator comes in handy. Basically it takes the results you’ve seen so far and calculates whether the results are statistically significant or just a fluke.
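One common way such a calculator works (an illustrative sketch, not necessarily the equations any particular tool uses) is a two-proportion z-test: compare the two conversion rates against the variation you'd expect from chance alone.

```python
import math

def significance(visitors_a, conversions_a, visitors_b, conversions_b):
    """Two-proportion z-test: returns the z-score and whether the
    difference is significant at the 95% confidence level (|z| > 1.96)."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(p * (1 - p) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se
    return z, abs(z) > 1.96
```

For example, 100 conversions from 1,000 visitors versus 130 from 1,000 gives a z-score around 2.1, which clears the 95% bar; 105 versus 100 does not.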
Existing split test calculators
When I got started I used several of the existing split test calculators but they were:
- too simple and just said Yes/No
- too complex with multiple data fields for each variation
- too slow
The latter problem was the one that affected me the most. Every week I check all my split tests twice (Monday and Thursday), and with dozens of split tests running, much of my time was spent waiting on the calculators.
Since I’m working on more marketing projects for clients, I decided it would be fun and useful to see if I could build a better A/B split test calculator. One that I’d use and other people might find useful.
And at the same time I’d get to dig into the math behind split tests and better understand how statistical significance worked in regards to A/B tests.
A/B Split Test Calculator, Version 1
Now I’ve completed the first version of the calculator. I’ve been using it since March 2014 and comparing the results against the other calculators online (though it seems that everyone uses different equations).
The best part about it: it’s fast. Really fast.
I’d like to add at least one more feature to it when I get time. If it could forecast that a split test needs X more impressions before it’s significant, it would make the decision about when to cancel a test much easier. (Canceling weak tests that are only showing a small improvement helps when you have low traffic.)
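One way to build that forecast (a rough sketch under standard assumptions, not the calculator's actual method) is the textbook sample-size formula for comparing two proportions: given the baseline conversion rate and the smallest relative lift worth detecting, it estimates how many visitors each variation needs.

```python
import math

def visitors_needed(baseline_rate, min_relative_lift,
                    alpha_z=1.96, power_z=0.84):
    """Approximate visitors required *per variation* to detect a given
    relative lift at 95% confidence and 80% statistical power."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_relative_lift)
    # Combined variance of the two conversion-rate estimates.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((alpha_z + power_z) ** 2 * variance / (p2 - p1) ** 2)
```

With a 5% baseline rate and a 20% relative lift as the target, this works out to roughly 8,000 visitors per variation; subtracting the impressions already collected gives the "X more impressions" estimate. Small lifts blow the number up fast, which is exactly why canceling weak tests helps on low-traffic sites.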