At 383, we help clients create, test and validate new customer propositions and ideas. As part of the testing and validation process, we’ll often create landing pages or surveys to test one or more of these propositions. One of the biggest challenges is quickly testing a hypothesis that one proposition is ‘better’ than another with only a small number of sample users.
Typically, there is a trade-off between the following:
- How long a test needs to run in order to gather the appropriate sample size
- What the projected uplift is (in terms of revenue or another KPI such as lead acquisition)
- How much traffic the client can send to a website or app
The above variables are then compounded by the fact that you might be testing a brand new proposition, or a whole slew of propositions. Understandably, a business can be nervous about going all in on an idea that hasn’t been verified quantitatively, even if you have done some initial qualitative research. That nervousness can mean you only receive a small amount of traffic, which in turn means a longer grind to reach the ideal sample size.
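To make the trade-off between sample size, uplift and traffic concrete, here is a minimal Python sketch of the standard sample-size approximation for a two-sided, two-proportion z-test. The function names, the 5% baseline conversion rate, the 20% relative uplift and the 500 visitors per day are all illustrative assumptions, not figures from a real client test.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, uplift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-sided two-proportion z-test.

    baseline: control conversion rate (e.g. 0.05)
    uplift:   relative uplift we hope to detect (e.g. 0.20 for +20%)
    """
    p1 = baseline
    p2 = baseline * (1 + uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(n_per_variant, daily_visitors, variants=2):
    """Days needed when daily traffic is split evenly across all variants."""
    return ceil(n_per_variant * variants / daily_visitors)


# Hypothetical inputs: 5% baseline, +20% relative uplift, 500 visitors a day.
n = sample_size_per_variant(baseline=0.05, uplift=0.20)
print(n, "visitors per variant,", days_to_run(n, daily_visitors=500), "days")
```

Under these assumptions, halving the expected uplift or the daily traffic pushes the run time up sharply, which is the grind described above.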
An alternative approach
As a partner to our clients, we want to be able to rapidly validate customer propositions with A/B testing, and as such, we needed an alternative to the typical frequentist approach. Handily, there is one: it’s called Bayesian statistics, and it employs Bayes’ Theorem.
In mathematical notation, Bayes’ Theorem is as follows:
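For two events A and B, with P(B) > 0, the standard textbook form of the theorem reads as below. The symbols A and B are the conventional choice and are assumed here.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

Here P(A) is the prior belief in A, P(B | A) is the likelihood of observing B if A holds, and P(A | B) is the updated (posterior) belief in A once B has been observed.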
If mathematical notation isn’t your thing, Bayes’ Theorem can also be defined as:
“a theorem describing how the conditional probability of each of a set of possible causes for a given observed outcome can be computed from knowledge of the probability of each cause and the conditional probability of the outcome of each cause.”
This means we can present our findings to our clients more intuitively, as probabilities, and clients can base their business decisions on those probabilities.
For Bayesians, probability is treated as a “measure of belief” about future events. Frequentists, on the other hand, use probability to refer to past events. This is very much Bayesians vs Frequentists in a nutshell; for a deeper explanation, I’d encourage you to read this excellent blog post on Bayesian Machine Learning.
With Bayesian A/B testing, instead of working with p-values, we can take our test data as is and find the probability that Test > Control, or vice versa. Given how little processing power Bayesian simulation needs nowadays, we can quickly reach a probability with open-source tools such as R.
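As a rough illustration of what such a simulation involves, here is a minimal sketch written in Python rather than R, using only the standard library. It assumes a Beta-Binomial model with a uniform Beta(1, 1) prior on each conversion rate, which is one common choice and not necessarily the model we run for clients. The function name and the visitor and conversion counts are hypothetical.

```python
import random


def prob_test_beats_control(control_conv, control_n, test_conv, test_n,
                            draws=100_000, seed=42):
    """Monte Carlo estimate of P(Test conversion rate > Control conversion rate)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # With a uniform prior, the posterior for a conversion rate is
        # Beta(1 + conversions, 1 + non-conversions).
        control_rate = rng.betavariate(1 + control_conv, 1 + control_n - control_conv)
        test_rate = rng.betavariate(1 + test_conv, 1 + test_n - test_conv)
        if test_rate > control_rate:
            wins += 1
    return wins / draws


# Hypothetical data: 20 of 400 visitors convert on Control, 32 of 400 on Test.
p = prob_test_beats_control(control_conv=20, control_n=400, test_conv=32, test_n=400)
print(f"P(Test > Control) is roughly {p:.2f}")
```

The output is a single probability that can be reported directly, for example "Test is very likely better than Control", without any reference to p-values.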
We aren’t the first company in the world to utilise this method; far from it. In 2014, Lyst announced that they were switching to Bayesian A/B testing, noting:
“We prefer Bayesian methods for two main reasons. Firstly, our end result is a probability distribution, rather than a point estimate. Instead of having to think in terms of p-values, we can think directly in terms of the distribution of possible effects of our treatment. For example, if only 2% of the values of the posterior distribution lie below 0.05, we have 98% confidence that the conversion rate is above 0.05.”
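The “share of the posterior above a threshold” reading in that quote can be sketched in a few lines of Python. This is an illustration under our own assumptions (a uniform Beta(1, 1) prior, a hypothetical function name and made-up counts), not Lyst’s implementation.

```python
import random


def prob_rate_above(conversions, visitors, threshold, draws=100_000, seed=7):
    """Estimate the share of posterior draws that lie above a conversion-rate threshold."""
    rng = random.Random(seed)
    above = sum(
        # Posterior under a uniform prior: Beta(1 + conversions, 1 + non-conversions).
        rng.betavariate(1 + conversions, 1 + visitors - conversions) > threshold
        for _ in range(draws)
    )
    return above / draws


# Hypothetical data: 70 conversions from 1,000 visitors, threshold of 0.05.
print(prob_rate_above(conversions=70, visitors=1000, threshold=0.05))
```

If the function returned 0.98, that would correspond to the quote’s “98% confidence that the conversion rate is above 0.05”.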
For us, the switch to Bayesian A/B testing means we can tell clients relatively quickly how probable it is that one proposition is ‘better’ than another. We still require samples, of course, but we’re less worried about having to hit 10,000+ visits in order to reach statistical significance (let alone practical significance).
If you’re interested in how the world’s most innovative companies test and iterate at scale, you should attend Canvas, our annual product conference held each October. In the meantime, take a look at two videos from Canvas 2016, from Spotify and Intercom.
Feel free to get in contact with us to see how we could help your organisation rapidly test, validate and learn from customer propositions.
Image: Maths in neon at Autonomy in Cambridge (Author MattBuck).