It was a simple question from a client at a recent research presentation. And I’m sure they expected a simple Yes/No answer. But (as you may guess), the right answer is a bit more involved.
While the theory behind significance testing is simple and elegant, textbook definitions tend to cloud the concept with opaque jargon: e.g. “the calculation of the acceptance/rejection region surrounding the null/alternative hypotheses.” The potential for misunderstanding is reinforced by a set of conventions that are often codified as default settings in statistical software packages. Most of us have been trained to look for “magic values” of α=.05 (the 95% confidence level) or α=.01 (the 99% confidence level) … and we deem anything else not significant.
But what does statistical significance really mean?
First, the theoretical definition: α=.05 means that the probability of erroneously rejecting the Null Hypothesis due to random error is 5%. This type of error, also known as a false positive, is a Type I error. Since there is a Type I error, you may correctly infer that there is also a Type II error (β error), which is the probability of accepting the Null Hypothesis when it should have been rejected (also known as a false negative). Both types of errors can be problematic, and at a given sample size, reducing one type of error generally increases the other. By the way, the only way to decrease both types of errors at once is to increase the sample size, which is often not feasible or can be cost-prohibitive.
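To see the trade-off concretely, here is a minimal Monte Carlo sketch. The 0.4-SD effect size, the sample sizes, and the trial count are all illustrative assumptions, not figures from any study. The pattern it demonstrates: at a fixed n, tightening α inflates β, while a larger n shrinks both.

```python
# A simulation sketch of the alpha/beta trade-off for a two-sample t-test.
# EFFECT, the sample sizes, and TRIALS are hypothetical choices for illustration.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
TRIALS = 5_000
EFFECT = 0.4  # assumed true difference in means, in standard-deviation units

def error_rates(n, alpha):
    type1 = type2 = 0
    for _ in range(TRIALS):
        # Under H0: both groups share the same mean; rejecting is a false positive.
        a, b = rng.normal(0, 1, n), rng.normal(0, 1, n)
        if ttest_ind(a, b).pvalue < alpha:
            type1 += 1
        # Under H1: group b is truly shifted; failing to reject is a false negative.
        a, b = rng.normal(0, 1, n), rng.normal(EFFECT, 1, n)
        if ttest_ind(a, b).pvalue >= alpha:
            type2 += 1
    return type1 / TRIALS, type2 / TRIALS

for n in (50, 200):
    for alpha in (0.05, 0.01):
        t1, t2 = error_rates(n, alpha)
        print(f"n={n:>3}  alpha={alpha:.2f}  Type I ≈ {t1:.3f}  Type II ≈ {t2:.3f}")
```

At n=50, dropping α from .05 to .01 visibly inflates the Type II rate; only the jump to n=200 pulls both rates down together.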
Now let’s look at what Type I and Type II errors mean from a decision maker’s point of view. Think of a Type I error as “an error of commission”: the decision maker concludes there is an effect and acts on that information when, in fact, he shouldn’t have. And a Type II error is “an error of omission”: the decision maker concludes that the evidence isn’t compelling enough to act … and does nothing, even though acting would have been right.
Which error is worse?
It depends on the situation. If you are in pharma and deciding whether or not to introduce a new compound, you’d better control for Type I errors: you want to be 99.9999+% sure that you aren’t introducing something that will cause harm. And you’re willing to leave a potentially helpful (and profitable) drug in the lab until you’re sure. An error of omission (NOT introducing the new drug) will generally be preferable to making an error of commission (introducing a harmful one).
But if you are an Advertising Director deciding whether or not to launch a new campaign, you might want to balance the two types of errors differently. If you make a Type I error, an error of commission, you might invest $5MM in a new campaign and not end up with much to show for it. The downside? You lost $5MM. But if you make a Type II error, an error of omission, you do NOT launch the campaign when, in fact, you should have. And the downside there could be far larger. What if that $5MM campaign would have worked and generated $100MM in incremental sales?
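To put rough numbers on that asymmetry, here is a back-of-the-envelope sketch using the hypothetical $5MM and $100MM figures above, plus an assumed 30% chance that the campaign actually works (a number invented purely for illustration):

```python
# Expected cost of each error for the advertising example.
# The dollar figures come from the text; P_WORKS is an invented prior.
P_WORKS = 0.30       # assumed probability the campaign truly works
COST_TYPE_I = 5      # $MM lost if we launch and the campaign flops
COST_TYPE_II = 100   # $MM in incremental sales forgone if we shelve a winner

# Weight each error's cost by the probability of the state that produces it.
exp_cost_commission = (1 - P_WORKS) * COST_TYPE_I   # launch a dud
exp_cost_omission = P_WORKS * COST_TYPE_II          # shelve a winner

print(f"Expected cost of acting (Type I risk):      ${exp_cost_commission:.1f}MM")
print(f"Expected cost of not acting (Type II risk): ${exp_cost_omission:.1f}MM")
```

Even at a modest 30% success probability, the expected cost of omission ($30MM) swamps the expected cost of commission ($3.5MM), which argues for tolerating a looser α in this decision.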
So how do you balance the errors? How do you set the appropriate α-level? Don’t accept standard conventions: think deeply about the risks associated with each type of error in the context of the decision you are making. How confident do you need to be to take action? What’s really at risk? And what are the risks of NOT taking action, the opportunity costs associated with doing nothing? We suggest you explicitly incorporate this conversation into every project planning session. The following steps should help:
- First, assess the risk, defined here as the cost of erroneous action or the opportunity cost of inaction.
- Then, design the project so that the error types are balanced according to where the risk lies (a sketch of this trade-off appears after these steps). Once the risk is known, or at least better understood, decisions like whether or not to spend incremental dollars on a larger sample can be evaluated in context. (You are essentially buying peace of mind, and its price should scale with the degree of risk.)
- Finally, evaluate the project’s results more or less at face value, since it has already been decided that the designed significance level is sufficient given the contextual risk.
Following these steps will tune your decision parameters to fit the risk profile of the decision being made.
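For teams that want to make the second step concrete, here is a minimal sketch of one way to do it, assuming a one-sided z-test approximation and purely illustrative costs, prior, effect size, and sample size: sweep α and pick the level that minimizes total expected loss instead of defaulting to a convention.

```python
# Choose alpha by minimizing total expected loss rather than by convention.
# All inputs (costs, prior, effect size, sample size) are hypothetical.
import numpy as np
from scipy.stats import norm

COST_TYPE_I = 5.0    # $MM: cost of acting when we shouldn't (assumption)
COST_TYPE_II = 100.0 # $MM: cost of not acting when we should (assumption)
P_EFFECT = 0.30      # assumed prior probability the effect is real
EFFECT = 0.2         # assumed standardized effect size
N = 400              # planned sample size

# One-sided z-test approximation: power = Phi(d * sqrt(N) - z_(1-alpha)).
alphas = np.linspace(0.005, 0.5, 200)
power = norm.cdf(EFFECT * np.sqrt(N) - norm.ppf(1 - alphas))
beta = 1 - power

# Expected loss weights each error's cost by the chance of its underlying state.
expected_loss = ((1 - P_EFFECT) * alphas * COST_TYPE_I
                 + P_EFFECT * beta * COST_TYPE_II)

best = alphas[np.argmin(expected_loss)]
print(f"risk-balancing alpha ≈ {best:.3f}  (expected loss ${expected_loss.min():.2f}MM)")
```

Under these made-up inputs, the loss-minimizing α lands above the conventional .05, which is exactly the point: the right level falls out of the risk profile, not the textbook.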