Sample Size and Statistical Power Explained

Statistical power is the probability that a study will detect a real effect. This explainer covers Type I and Type II errors, the relationship power = 1 − β, the 0.8 convention, a priori power analysis, and why underpowered studies harm reproducibility.

By CASRAI Editorial Board
Published 19 Jun 2026 · 3 minute read

Statistical power is the probability that a study will correctly detect an effect when one truly exists. It is formally defined as one minus the Type II error rate, written as power = 1 − β. A study with high power is likely to find a real effect; an underpowered study may miss it, producing a false negative. Power is closely tied to sample size, which is why power analysis is a core part of study planning.

Type I and Type II errors

Hypothesis testing can go wrong in two ways. A Type I error occurs when the test reports an effect that is not really there, a false positive; its probability, given that no effect exists, is α. A Type II error occurs when the test fails to detect an effect that is genuinely present, a false negative; its probability, given that the effect exists, is β.

| Test outcome | Effect truly exists | No effect exists |
|---|---|---|
| Test is significant | Correct (power = 1 − β) | Type I error (α) |
| Test is not significant | Type II error (β) | Correct |

The significance threshold α is usually set at 0.05: a result is declared statistically significant when its p-value falls below α.

The 0.8 convention

By widespread convention, researchers aim for a power of at least 0.8, meaning the study has an 80% chance of detecting the effect of interest if it exists. This corresponds to a Type II error rate of 0.2. The figure is a pragmatic standard rather than a law: some fields demand higher power, such as 0.9, particularly when missing an effect would be costly. The key point is to choose and justify a target before data collection.

What determines power?

Four quantities are linked: the sample size, the effect size, the significance level α and the power. Fixing any three determines the fourth. Power increases with a larger sample size, a larger true effect, a less stringent (larger) α and lower data variance; variance enters through the standardized effect size, which is the raw effect divided by the standard deviation. Because researchers usually cannot change the effect size or the desired α, the practical lever is the sample size.
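The link between these quantities can be sketched numerically. The example below is a minimal illustration, not a prescribed method: it assumes a two-sided, two-sample comparison of means with equal group sizes, a standardized effect size (Cohen's d) and a normal approximation in place of the t distribution. The function name `approx_power` and its parameters are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with equal groups.

    Assumptions (not from the article): normal approximation, equal group
    sizes, and effect_size given as a standardized mean difference.
    """
    if n_per_group <= 0:
        raise ValueError("n_per_group must be positive")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    # Critical value for a two-sided test at level alpha.
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected test statistic: effect divided by its standard error sqrt(2/n).
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic lands in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (20, 64, 200):
        print(n, round(approx_power(n, effect_size=0.5), 3))
```

Under these assumptions, `approx_power(64, 0.5)` comes out at roughly 0.8, and raising the sample size, the effect size or α each raises the result. With an effect size of zero the function returns α, which is the Type I error rate.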

A priori power analysis

An a priori power analysis is performed before data collection to determine the sample size needed to achieve the desired power for a plausible effect size. Researchers specify the target power (often 0.8), the significance level (often 0.05) and the smallest effect size they consider meaningful, then calculate the required number of participants. This prevents the common mistake of recruiting too few subjects, and is increasingly expected by funders, ethics committees and journals. The same logic applies whether the planned analysis is a t-test, a regression or another test.
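As a rough sketch of the calculation described above, the required sample size per group for a two-group comparison can be approximated as n = 2 × ((z for 1 − α/2 + z for the target power) / d)². This assumes a two-sided test, equal group sizes, a standardized effect size d and a normal approximation; the function name `required_n_per_group` is hypothetical, and dedicated power software based on the t distribution will typically give a slightly larger answer.

```python
from math import ceil
from statistics import NormalDist


def required_n_per_group(effect_size, power=0.8, alpha=0.05):
    """Approximate participants needed per group for a two-sample comparison.

    Assumptions (not from the article): two-sided test, equal group sizes,
    standardized effect size, normal approximation to the t-test.
    """
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero")
    if not 0 < power < 1 or not 0 < alpha < 1:
        raise ValueError("power and alpha must be between 0 and 1")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    # Round up: a fractional participant cannot be recruited.
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(d, required_n_per_group(d))
```

With the defaults, a medium effect of d = 0.5 needs 63 participants per group under this approximation. Halving the effect size roughly quadruples the requirement, which is why the choice of the smallest meaningful effect dominates the calculation.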

Why underpowered studies harm reproducibility

Underpowered studies are a major threat to reproducibility. They frequently miss real effects, and when they do reach significance the estimated effect is often exaggerated, a phenomenon known as the winner’s curse. Such inflated estimates tend to shrink or disappear when larger studies attempt to replicate them. Conducting and reporting a power analysis, and pre-specifying the sample size, makes research more credible. The CASRAI dictionary and our author guidance encourage transparent reporting of these design choices, ideally alongside a confidence interval that conveys the precision of the estimate.
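The winner’s curse can be illustrated with a small simulation. The sketch below is illustrative only and uses made-up settings: a true standardized effect of 0.3, 20 participants per group, known unit variance in both groups and a two-sided z-test. The function name `simulate_winners_curse` and all parameter values are assumptions chosen for the example.

```python
import random
from statistics import NormalDist, fmean


def simulate_winners_curse(true_effect=0.3, n_per_group=20,
                           n_studies=5000, alpha=0.05, seed=1):
    """Simulate many small two-group studies and summarize the significant ones.

    Returns (share of studies reaching significance,
             mean estimated effect among those significant studies).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the mean difference with unit variance in each group.
    std_error = (2 / n_per_group) ** 0.5
    significant = []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        estimate = fmean(treated) - fmean(control)
        # Keep only the estimates that pass the significance filter.
        if abs(estimate / std_error) > z_crit:
            significant.append(estimate)
    share_significant = len(significant) / n_studies
    mean_significant = fmean(significant) if significant else float("nan")
    return share_significant, mean_significant


if __name__ == "__main__":
    share, mean_sig = simulate_winners_curse()
    print(f"share significant: {share:.3f}")
    print(f"mean significant estimate: {mean_sig:.3f} (true effect 0.3)")
```

In this setup only a minority of the simulated studies reach significance, and the average estimate among them sits well above the true effect, because a small study can only cross the threshold when sampling error happens to push its estimate upward.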

Frequently asked questions

What is a good level of statistical power?

A power of 0.8 is the common minimum, giving an 80% chance of detecting a true effect. Higher targets such as 0.9 are preferable when feasible, especially for confirmatory studies.

Can I calculate power after the study is finished?

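Post-hoc power calculated from the observed effect is generally uninformative because it is a direct function of the p-value: a result with p above α always yields low observed power, so the calculation adds no new information. Power analysis is most useful when done in advance to plan sample size.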
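The dependence on the p-value can be made concrete. The sketch below assumes a two-sided z-test and treats the observed effect as if it were the true effect, which is what a post-hoc calculation does; the function name `observed_power_from_p` is hypothetical.

```python
from statistics import NormalDist


def observed_power_from_p(p_value, alpha=0.05):
    """Post-hoc ("observed") power of a two-sided z-test, from the p-value alone.

    Assumption (not from the article): the test statistic is standard normal
    under the null hypothesis.
    """
    if not 0 < p_value < 1:
        raise ValueError("p_value must be strictly between 0 and 1")
    z = NormalDist()
    z_obs = z.inv_cdf(1 - p_value / 2)   # size of the observed test statistic
    z_crit = z.inv_cdf(1 - alpha / 2)    # significance threshold
    # Power if the true effect equalled the observed one.
    return z.cdf(z_obs - z_crit) + z.cdf(-z_obs - z_crit)


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.20, 0.50):
        print(p, round(observed_power_from_p(p), 3))
```

No input other than the p-value and α is needed, which is the sense in which observed power restates the p-value. A result sitting exactly at p = 0.05 gives an observed power of about 0.5, and larger p-values give lower values.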

What is the relationship between sample size and power?

Larger samples increase power because they reduce the standard error, making real effects easier to detect. This is the main reason a priori power analysis focuses on choosing an adequate sample size.
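For the simplest case of a single sample mean, and assuming independent observations with standard deviation σ (an assumption made here for illustration), the standard error is:

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}}
```

Because n sits under a square root, quadrupling the sample size only halves the standard error, so gains in power become more expensive as the sample grows.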
