P-Values and Statistical Significance Explained Correctly

A p-value is the probability of observing data at least as extreme as those seen, assuming the null hypothesis is true. This guide explains p-values precisely, summarises the ASA 2016 statement, and corrects common misinterpretations.

By CASRAI Editorial Board
Published 19 Jun 2026 · 4 minute read

A p-value is the probability of obtaining a result at least as extreme as the one observed, assuming that the null hypothesis is true. It is a measure of how compatible the data are with a specified statistical model in which there is no effect or no difference. A small p-value indicates that the observed data would be unusual if the null hypothesis held; it does not, by itself, prove that the null hypothesis is false or that an effect is real or important.
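To make the definition concrete, here is a minimal sketch that computes an exact one-sided p-value for a coin-flip experiment. The scenario (20 flips, 15 heads, a fair-coin null) and the function name `binomial_tail_p_value` are illustrative assumptions, not taken from any study.

```python
from math import comb


def binomial_tail_p_value(n_flips: int, n_heads: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= n_heads) if the null probability p_null is true.

    "At least as extreme" is read here as "at least this many heads".
    """
    return sum(
        comb(n_flips, k) * p_null**k * (1 - p_null) ** (n_flips - k)
        for k in range(n_heads, n_flips + 1)
    )


# Hypothetical experiment: 15 heads in 20 flips, null hypothesis "the coin is fair".
p_value = binomial_tail_p_value(n_flips=20, n_heads=15)
print(f"one-sided p-value: {p_value:.4f}")  # about 0.0207
```

The result says that a fair coin would produce 15 or more heads in roughly 2% of such experiments. It does not say there is a 2% chance that the coin is fair.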

What the null hypothesis represents

Hypothesis testing begins with a null hypothesis, typically a statement of no effect, no difference or no association. The test asks how surprising the observed data would be if that null hypothesis were true. The p-value quantifies that surprise: the smaller it is, the less compatible the data are with the null model. Critically, the p-value is calculated under the assumption that the null is true, which is why it cannot be read as the probability that the null is true.
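One way to see that the calculation happens inside the null model is a permutation test, sketched below under the assumption that the null is "group labels are irrelevant". The data values and the function name `permutation_p_value` are hypothetical; this is an illustration rather than a recommended analysis.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for a difference in means.

    Labels are shuffled to simulate a world in which the null is true,
    then we count how often the shuffled difference is at least as large
    as the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled_a, shuffled_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(shuffled_a) / n_a - sum(shuffled_b) / len(shuffled_b))
        if diff >= observed:
            hits += 1
    # Count the observed labelling itself so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical measurements for two groups.
treated = [5.1, 6.3, 5.8, 6.9, 6.1]
control = [4.8, 5.2, 5.0, 5.9, 4.7]
print(f"permutation p-value: {permutation_p_value(treated, control):.3f}")
```

Every shuffled data set is generated as if the null were true, so the resulting proportion describes the data given the null and carries no information about the probability of the null itself.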

The American Statistical Association’s 2016 statement

In 2016 the American Statistical Association (ASA) published a formal statement on p-values, the first time it had issued such guidance, in response to widespread misuse. The statement set out six principles. In summary, it affirmed that p-values can indicate how incompatible data are with a specified model, but warned that a p-value does not measure the probability that the hypothesis under study is true, nor the probability that the data arose by chance alone. It cautioned that scientific conclusions should not be based only on whether a p-value passes a threshold, that proper reporting requires full transparency, that a p-value does not measure the size or importance of an effect, and that by itself a p-value is a poor measure of evidence regarding a model or hypothesis.

Common misinterpretations

Several persistent errors surround p-values. Avoiding them is essential for sound, reproducible reporting.

| Misinterpretation | Why it is wrong |
| --- | --- |
| The p-value is the probability the null hypothesis is true | It is calculated assuming the null is true; it cannot also be that probability |
| p = 0.05 means a 5% chance the result is a fluke | It is computed assuming chance alone is operating, so it cannot be the probability that chance produced the finding |
| A non-significant result proves no effect exists | Absence of significance is not evidence of absence; the study may simply lack power |
| A small p-value means a large or important effect | The p-value depends on sample size and variability as well as effect magnitude; a tiny effect can give a small p-value in a large sample |
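The last row can be illustrated with a short sketch. It assumes a two-sided z-test with a known standard deviation, a fixed and deliberately tiny mean difference of 0.02 standard deviations, and hypothetical sample sizes; the function name `two_sided_z_p_value` is introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_p_value(mean_diff: float, sigma: float, n: int) -> float:
    """Two-sided p-value for a one-sample z-test with known sigma."""
    z = mean_diff / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same tiny effect (0.02 standard deviations), increasingly large samples.
for n in (100, 10_000, 100_000):
    p = two_sided_z_p_value(mean_diff=0.02, sigma=1.0, n=n)
    print(f"n = {n:>7}: p = {p:.3g}")
```

The effect never changes, yet the p-value shrinks as the sample grows. A small p-value therefore cannot be read as a large effect, and a large p-value from a small sample cannot be read as no effect.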

The limits of the 0.05 convention

The threshold of 0.05 for declaring statistical significance is a convention, not a law of nature. Treating 0.05 as a bright line encourages dichotomous thinking in which a result at p = 0.049 is celebrated and one at p = 0.051 dismissed, despite negligible difference between them. This convention has fed practices such as selective reporting and p-hacking, where analyses are adjusted until a result crosses the threshold, both serious threats to reproducibility. The ASA statement explicitly warned against basing conclusions solely on whether a p-value clears a cut-off.
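The cost of searching for a result that crosses the threshold can be sketched numerically. The example below assumes independent tests of true null hypotheses, in which case each p-value is uniformly distributed between 0 and 1; the function names and the choice of 20 analyses are hypothetical.

```python
import random


def familywise_false_positive_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null tests has p < alpha."""
    return 1 - (1 - alpha) ** n_tests


def simulate_any_significant(n_tests, alpha=0.05, n_studies=20_000, seed=0):
    """Monte Carlo check: draw uniform p-values (the null is true everywhere)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        smallest_p = min(rng.random() for _ in range(n_tests))
        if smallest_p < alpha:
            hits += 1
    return hits / n_studies


# Trying 20 analyses when no effect exists at all.
print(f"analytic:  {familywise_false_positive_rate(20):.3f}")  # about 0.642
print(f"simulated: {simulate_any_significant(20):.3f}")
```

Under these assumptions, a researcher who tries 20 independent analyses and reports only the best one will find something "significant" in roughly 64% of studies, even though no effect exists. Real analyses are rarely independent, so the exact figure will differ, but the direction of the problem is the same.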

Effect sizes and intervals

Because a p-value says nothing about magnitude, it should be accompanied by an effect size, which describes how large the observed effect is, and ideally a confidence interval, which expresses the precision of the estimate. Reporting these alongside, or instead of, a bare p-value gives readers far more information for judging whether a finding matters. Transparent reporting of the effect size, the interval and the p-value together supports the goals tracked in our reproducibility category. For terminology and reporting conventions, consult the CASRAI dictionary.
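As a rough sketch of what such reporting involves, the code below computes Cohen's d (one common standardised effect size, chosen here as an example) and an approximate confidence interval for a difference in means. It assumes two independent groups and uses a normal approximation for the interval; for small samples a t-based interval would be wider. The data and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def cohens_d(group_a, group_b) -> float:
    """Standardised mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def mean_difference_ci(group_a, group_b, level: float = 0.95):
    """Approximate CI for the difference in means (normal approximation)."""
    diff = mean(group_a) - mean(group_b)
    std_err = sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * std_err, diff + z * std_err


# Hypothetical measurements for two groups.
treated = [5.1, 6.3, 5.8, 6.9, 6.1, 5.5, 6.4, 6.0]
control = [4.8, 5.2, 5.0, 5.9, 4.7, 5.3, 5.6, 5.1]
low, high = mean_difference_ci(treated, control)
print(f"Cohen's d: {cohens_d(treated, control):.2f}")
print(f"95% CI for the mean difference: ({low:.2f}, {high:.2f})")
```

The effect size says how far apart the groups are in standard-deviation units, and the interval shows the range of differences reasonably compatible with the data. Both convey information that a bare p-value cannot.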

Frequently asked questions

Does a p-value below 0.05 prove an effect is real?

No. It indicates the data would be unusual if the null hypothesis were true, but it does not prove the null is false, nor that the effect is large or important. Replication, effect sizes and intervals are needed to judge that.

What did the ASA 2016 statement conclude?

The statement set out six principles emphasising that p-values measure compatibility with a model, are not the probability the hypothesis is true, do not measure effect size, and should never be the sole basis for scientific conclusions. It urged full transparency in reporting.

Should we abandon p-values altogether?

Not necessarily. P-values can be informative when interpreted correctly and reported alongside effect sizes and confidence intervals. The problem lies in misuse and over-reliance on a single threshold, not in the statistic itself. See the CASRAI author guidance for reporting practices.
