Short steps on how to read a paper: Part 12: Probability
In this short step, I hope to explain probability and p values. I suspect that readers of papers do not always understand what p values mean. This is not a complex part of an article, but I wonder if authors and statisticians unintentionally make it complicated. So here is my attempt at explaining p values.
Let’s look at probability.
We all know that if we toss a balanced coin 100 times, we expect to get about 50 heads and 50 tails. This means that the probability of heads or tails on any single toss is 50:50.
P values in a scientific paper are probabilities of this kind. They tell us how likely a result at least as extreme as the one observed would be if chance alone were at work. In the coin-tossing exercise, the probability of heads on any one toss is 0.5.
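To make the coin-tossing picture concrete, here is a minimal sketch using only the Python standard library. It assumes a perfectly balanced coin, and the function name `prob_exact_heads` and the "40 to 60 heads" range are illustrative choices, not something taken from a paper.

```python
from math import comb

def prob_exact_heads(k, n=100, p=0.5):
    """Probability of exactly k heads in n tosses (binomial formula)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Chance of exactly 50 heads in 100 tosses of a balanced coin.
exactly_50 = prob_exact_heads(50)

# Chance of a result close to 50:50, here taken to mean 40 to 60 heads.
between_40_and_60 = sum(prob_exact_heads(k) for k in range(40, 61))

print(f"P(exactly 50 heads) = {exactly_50:.3f}")
print(f"P(40 to 60 heads)   = {between_40_and_60:.3f}")
```

Under these assumptions, an exact 50:50 split turns up only about 8% of the time, while a count somewhere between 40 and 60 heads turns up about 96% of the time. Chance alone produces some spread around the expected value.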
When we look at a scientific paper, there is always a possibility that a difference between interventions has occurred entirely by chance. We therefore have to decide on the level of probability we can accept.
By convention, statisticians have set the level of risk that we accept at 5% or p=0.05. A result below this threshold is called statistically significant. Therefore, if we are reading a paper and the authors report that the difference is statistically significant at p<0.05, it means that a difference at least this large would turn up less than 5% of the time if chance alone were at work.
When p<0.01, that figure is less than 1%.
It is as simple as that.
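As an illustration of how a p value can be worked out, here is a rough sketch that returns to the coin. It asks: if the coin really is balanced, how often would 100 tosses give a result at least as lopsided as the one we saw? The function name `two_sided_p_value` is hypothetical, and the calculation assumes a balanced coin as the "chance alone" explanation.

```python
from math import comb

def two_sided_p_value(heads, tosses=100):
    """Exact two-sided p value, assuming the coin is balanced.

    Adds up the probabilities of every outcome that is at least as far
    from an even split as the observed number of heads.
    """
    expected = tosses / 2
    observed_gap = abs(heads - expected)
    total = 0.0
    for k in range(tosses + 1):
        if abs(k - expected) >= observed_gap:
            total += comb(tosses, k) / 2**tosses
    return total

p_60 = two_sided_p_value(60)
p_61 = two_sided_p_value(61)
print(f"60 heads: p = {p_60:.3f}")
print(f"61 heads: p = {p_61:.3f}")
```

With these assumptions, 60 heads gives p of about 0.057 and 61 heads gives p of about 0.035. A single extra head moves the result from one side of the 0.05 line to the other.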
Now to complicate things!
All this is very well. But what happens if p=0.07? It is not statistically significant, but the risk is still low at 7%. In this case, as readers, it is up to us to decide if this is a low enough risk for us to accept that there is a difference. Therein lies a big problem with p values: the relatively arbitrary nature of the cut-off. This is why statisticians like to see p values presented with associated effect sizes and confidence intervals. These help us evaluate the size and potential relevance of any difference and the precision of that estimate of effect.
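For readers who like to see the arithmetic, the sketch below shows one way an effect size and a confidence interval might be calculated for two groups. The measurements are made up for illustration, the function name `summarise_difference` is hypothetical, and the interval uses a normal approximation. For samples this small, an interval based on the t distribution would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Hypothetical measurements (arbitrary units) for two treatment groups.
group_a = [4.1, 5.0, 4.6, 5.3, 4.8, 5.1, 4.4, 4.9]
group_b = [5.2, 5.9, 5.5, 6.1, 5.0, 5.8, 5.6, 6.0]

def summarise_difference(a, b, confidence=0.95):
    """Return the mean difference, Cohen's d and an approximate interval."""
    diff = mean(b) - mean(a)

    # Pooled standard deviation, used for the standardised effect size.
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (
        len(a) + len(b) - 2
    )
    cohens_d = diff / sqrt(pooled_var)

    # Standard error of the difference in means (normal approximation).
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, cohens_d, (diff - z * se, diff + z * se)

diff, cohens_d, (ci_low, ci_high) = summarise_difference(group_a, group_b)
print(f"Mean difference: {diff:.2f}")
print(f"Cohen's d:       {cohens_d:.2f}")
print(f"Approx. 95% CI:  {ci_low:.2f} to {ci_high:.2f}")
```

The mean difference tells us how big the effect is in the original units, and the width of the interval tells us how precisely it has been estimated. A p value on its own gives neither.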
It is also crucial to realise that if a result is p<0.01, it does not mean that it is any more important than one at p<0.05. Authors tend to get excited about low p values. But they only mean that such a result is less likely to arise by chance alone.
I hope that this explanation is clear. If not, ask in the comments section.
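A small sketch may help show why a lower p value does not signal a more important result. It keeps the difference between two groups fixed at a modest 0.1 standard deviations and changes only the number of participants. The function name `p_value_for_fixed_effect`, the effect of 0.1 and the sample sizes are all assumptions chosen for illustration, and the calculation is a simple z test rather than the method of any particular paper.

```python
from math import sqrt
from statistics import NormalDist

def p_value_for_fixed_effect(n_per_group, diff=0.1, sd=1.0):
    """Two-sided p value from a z test comparing two group means."""
    se = sd * sqrt(2 / n_per_group)  # standard error of the difference
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_100 = p_value_for_fixed_effect(100)
p_1000 = p_value_for_fixed_effect(1000)
p_2000 = p_value_for_fixed_effect(2000)

for n, p in ((100, p_100), (1000, p_1000), (2000, p_2000)):
    print(f"{n} per group: p = {p:.4f}")
```

Under these assumptions, the same small difference gives p of roughly 0.48 with 100 per group, 0.025 with 1000 per group and 0.002 with 2000 per group. The effect has not changed at all. Only the sample size has.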
Next week I am going to look at effect size.
Emeritus Professor of Orthodontics, University of Manchester, UK.
Very well explained. You made the dreaded p values look so simple.
Thank you Dr O’Brien
Thanks, Kevin, for another installment. These are all valuable.
P values carry a lot of baggage for a lot of people. People and statisticians have overemphasised and misinterpreted P values for decades, so it is understandable that some journals now refuse to publish articles with them. A P value from a single small experiment cannot prove a method, yet our journals are filled with these ‘facts’.
Well-researched and respected articles such as https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124 have shed light on the topic of P values, and even the American Statistical Association statement on P values admits fundamental flaws in relying on them. There is an interesting follow-up editorial by the same journal: https://www.tandfonline.com/doi/full/10.1080/00031305.2019.1583913. This editorial suggests that, while difficult, we must move beyond 0.05.
Effect size, replication and prepublication all carry valuable information that is ignored when the word significance is misused.
Unfortunately, when I see the word significance used in an abstract as proof that something important was found, I read the article with distrust, which, sadly, is often confirmed.
I suggest reading Geoff Cumming’s book The New Statistics and using the YouTube videos that go with it. Geoff also has a YouTube video on P values that is informative and well worth watching: https://www.youtube.com/watch?v=5OL1RqHrZQ8
NB: You express a P value as a probability. My reading of statistics suggests this is not correct.
I also found a reasonable explanation here: https://blog.minitab.com/en/adventures-in-statistics-2/how-to-correctly-interpret-p-values (section on P Values Are NOT the Probability of Making a Mistake).