Or: The Difference Between Significance and Relevance
Normally, the average citizen and the average scientist are satisfied when they hear that a research result was “statistically significant”. We commonly take this to mean that the hypothesis with which the researchers approached the study has been proven, and that the fact under investigation has been established. Conversely, if no significant result is found, we believe that the phenomenon in question has not been found, i.e. that it does not exist. This is why, for example, the average doctor, journalist and citizen believes that bioresonance has been proven ineffective and that homeopathy is placebo, and why half of America takes lipid-lowering drugs for the primary prevention of heart attacks: they believe this is a scientifically proven fact.
In this chapter I want to take a closer look at a few of these opinions, show why they have arisen, and ask how justified they are. It will turn out that this has to do with what I call the magic of statistics: the question of how powerful a statistical test is. That question is related to how big the effect we are studying is, and to how big a sample we need to make the effect statistically visible, that is, to get a significant result. In other words, if there is a systematic effect, no matter how small it is, then it can be proven with a study, provided we have enough resources.
The question that every reader of a scientific study should ask is not “Is the study significant?” but rather “Is the effect shown, whether significant or not, clinically and systematically important?” If it is significant, then we can assume scientific confirmation. If it is not significant, we have to ask ourselves: was the study large enough to find the effect? Or, put the other way round: how large would a study have to be to statistically confirm an effect of the magnitude found with a reasonably satisfactory degree of certainty? That is the essence of power analysis, to which we now turn.
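To make the question "how large would a study have to be?" concrete, here is a minimal sketch in Python. It rests on assumptions that are not in the text: a comparison of two equally sized groups, a two-sided test, an effect expressed as a standardised mean difference (Cohen's d), and the usual normal approximation. The function name `required_n_per_group` and the default values of 5% for alpha and 80% for power are illustrative choices only.

```python
import math
from statistics import NormalDist


def required_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate number of subjects per group needed to detect a
    standardised mean difference `effect_size` (assumed: Cohen's d)
    with a two-sided test, using the normal approximation."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = standard_normal.inv_cdf(power)          # quantile for 1 - beta
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # The smaller the effect, the larger the study has to be.
    for d in (0.8, 0.5, 0.2, 0.1):
        print(f"effect size {d}: about {required_n_per_group(d)} per group")
```

Under these assumptions, an effect of d = 0.5 needs roughly 63 subjects per group, while d = 0.2 already needs roughly 393. An exact calculation based on the t-distribution would give slightly larger numbers.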
So in any scientific investigation we are dealing with the interplay of four variables that depend on each other like the parts of a delicate mobile. If we change one, all the others change too. These are:
1. the error of the first kind, or the alpha error.
2. the error of the second kind, or the beta error.
3. the size of the effect, or the effect size.
4. the size of the study, or the number of subjects studied (in the case of clinical or diagnostic studies), or the number of observations.
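The interplay of these four quantities can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: two equally sized groups, a two-sided test, a standardised effect size (Cohen's d), and a normal approximation that ignores the negligible opposite tail. The function name `approximate_power` is hypothetical. It computes the power (1 minus the beta error) from the other three variables.

```python
import math
from statistics import NormalDist


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-group comparison
    for a standardised effect size (assumed: Cohen's d)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic if the effect really exists.
    expected_z = effect_size * math.sqrt(n_per_group / 2)
    return standard_normal.cdf(expected_z - z_alpha)


if __name__ == "__main__":
    # Hold alpha and the effect size fixed, vary the study size.
    for n in (20, 50, 100, 400):
        beta = 1 - approximate_power(0.3, n)
        print(f"n = {n:3d} per group: beta error is about {beta:.2f}")

    # Hold the effect size and the study size fixed, tighten alpha.
    for alpha in (0.10, 0.05, 0.01):
        beta = 1 - approximate_power(0.3, 100, alpha)
        print(f"alpha = {alpha}: beta error is about {beta:.2f}")
```

Moving any one of the inputs shifts the result: a larger study or a larger effect lowers the beta error, while a stricter alpha raises it unless the study grows to compensate.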
Because of the length of this chapter, the rest of it is provided as a PDF. Please continue reading here: