Sample Size and Statistical Power Explained

Reproducibility and computational research

Statistical power is the probability that a study will detect a real effect. This explainer covers Type I and Type II errors, the relationship power = 1 − β, the 0.8 convention, a priori power analysis, and why underpowered studies harm reproducibility.

By CASRAI Editorial Board

Published 19 Jun 2026

Statistical power is the probability that a study will correctly detect an effect when one truly exists. It is formally defined as one minus the Type II error rate, written as power = 1 − β. A study with high power is likely to find a real effect; an underpowered study may miss it, producing a false negative. Power is closely tied to sample size, which is why power analysis is a core part of study planning.

Type I and Type II errors

Hypothesis testing can go wrong in two ways. A Type I error, with probability α, occurs when the test detects an effect that is not really there, a false positive. A Type II error, with probability β, occurs when the test fails to detect an effect that is genuinely present, a false negative.

	Effect truly exists	No effect exists
Test is significant	Correct (power = 1 − β)	Type I error (α)
Test is not significant	Type II error (β)	Correct

The significance threshold α is usually set at 0.05, and a result is declared significant when its p-value falls below α. Choosing α = 0.05 therefore means accepting a 5% chance of a false positive when no effect exists.
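
The two error rates in the table can be estimated by simulation. The sketch below is illustrative only: it assumes a one-sample, two-sided z-test with a known standard deviation of 1, and the function name `rejection_rate`, the effect of 0.4 and the sample size of 50 are hypothetical choices, not values from this explainer.

```python
import random
from statistics import NormalDist

def rejection_rate(true_effect, n=50, alpha=0.05, trials=4000, seed=1):
    """Share of simulated two-sided z-tests (known sd = 1) that reach significance."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n)]
        # Test statistic: sample mean divided by its standard error, 1 / sqrt(n)
        z = (sum(sample) / n) * n ** 0.5
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# No effect exists: every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_effect=0.0)
# An effect exists: every rejection is a correct detection, so this estimates power.
power = rejection_rate(true_effect=0.4)
type_ii_rate = 1 - power

print(f"Type I rate ~ {type_i_rate:.3f}")
print(f"Power ~ {power:.3f}, Type II rate ~ {type_ii_rate:.3f}")
```

Under these assumptions the simulated Type I rate should sit close to α, and the Type II rate is simply one minus the simulated power.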

The 0.8 convention

By widespread convention, researchers aim for a power of at least 0.8, meaning the study has an 80% chance of detecting the effect of interest if it exists. This corresponds to a Type II error rate of 0.2. The figure is a pragmatic standard rather than a law: some fields demand higher power, such as 0.9, particularly when missing an effect would be costly. The key point is to choose and justify a target before data collection.

What determines power?

Four quantities are linked: the sample size, the effect size, the significance level α and the power. Fixing any three determines the fourth. Power increases with a larger sample size, a larger true effect, a less stringent (larger) α and lower data variance, since less variance means a larger standardized effect. Because researchers usually cannot change the effect size and α is typically fixed by convention, the practical lever is the sample size.
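
The link between these quantities can be made concrete with a small calculation. This is a minimal sketch, assuming a two-sided comparison of two equal-sized groups, a normal approximation to the test statistic, and an effect size expressed as a standardized mean difference (Cohen's d); the function name `approx_power` is hypothetical.

```python
from statistics import NormalDist

def approx_power(n_per_group, effect_size, alpha=0.05):
    """Approximate two-sided power for comparing two group means.

    effect_size is the standardized mean difference (difference / sd),
    so lower variance shows up here as a larger effect_size.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect is real
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing beyond either critical value
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

for n in (20, 64, 200):
    print(n, round(approx_power(n, effect_size=0.5), 3))
```

Varying one argument while holding the others fixed shows each direction described above: power rises with `n_per_group`, with `effect_size` and with a larger `alpha`.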

A priori power analysis

An a priori power analysis is performed before data collection to determine the sample size needed to achieve the desired power for a plausible effect size. Researchers specify the target power (often 0.8), the significance level (often 0.05) and the smallest effect size they consider meaningful, then calculate the required number of participants. This prevents the common mistake of recruiting too few subjects, and is increasingly expected by funders, ethics committees and journals. The same logic applies whether the planned analysis is a t-test, a regression or another test.
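
As a rough illustration of the calculation, the sketch below solves for the per-group sample size in a two-group comparison. It assumes a two-sided test, equal group sizes, a standardized effect size (Cohen's d) and a normal approximation; the function name `required_n_per_group` is hypothetical, and dedicated power software that uses the t distribution will typically return a slightly larger number.

```python
import math
from statistics import NormalDist

def required_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate participants needed per group (normal approximation)."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = norm.inv_cdf(power)          # quantile matching the target power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)                    # round up to whole participants

# Smallest effect considered meaningful: d = 0.5 (an assumed, illustrative value)
print(required_n_per_group(0.5))             # target power 0.8
print(required_n_per_group(0.5, power=0.9))  # a stricter target needs more participants
print(required_n_per_group(0.2))             # smaller effects need far larger samples
```

Because the required sample size scales with the inverse square of the effect size, halving the smallest effect of interest roughly quadruples the number of participants.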

Why underpowered studies harm reproducibility

Underpowered studies are a major threat to reproducibility. They frequently miss real effects, and when they do reach significance the estimated effect is often exaggerated, a phenomenon known as the winner’s curse: with a small sample, only estimates that happen to be unusually large clear the significance threshold. Such inflated estimates often fail to replicate in larger studies. Conducting and reporting a power analysis, and pre-specifying the sample size, makes research more credible. The CASRAI dictionary and our author guidance encourage transparent reporting of these design choices, ideally alongside a confidence interval that conveys the precision of the estimate.
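
The winner's curse can be seen in a small simulation. This sketch assumes two groups of 20 with a known standard deviation of 1, a true difference of 0.2 and a two-sided z-test; these numbers and the function name `significant_estimates` are hypothetical and chosen only to produce a clearly underpowered design.

```python
import random
from statistics import NormalDist

def significant_estimates(true_effect=0.2, n_per_group=20, alpha=0.05,
                          trials=5000, seed=7):
    """Simulate many small studies and keep only the significant estimates."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = (2 / n_per_group) ** 0.5  # standard error of the mean difference (sd = 1)
    kept = []
    for _ in range(trials):
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        if abs(diff / se) > z_crit:
            kept.append(diff)
    return kept, len(kept) / trials

estimates, hit_rate = significant_estimates()
mean_magnitude = sum(abs(e) for e in estimates) / len(estimates)

print(f"Share of studies reaching significance: {hit_rate:.3f}")
print(f"True effect: 0.2, mean size of significant estimates: {mean_magnitude:.2f}")
```

In this setup a study can only reach significance if its estimate is roughly three times the true effect, so the published subset overstates the effect by construction.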

Frequently asked questions

What is a good level of statistical power?

A power of 0.8 is the common minimum, giving an 80% chance of detecting a true effect. Higher targets such as 0.9 are preferable when feasible, especially for confirmatory studies.

Can I calculate power after the study is finished?

Post-hoc power calculated from the observed effect is generally uninformative, because it is a direct function of the p-value and adds no new information: a non-significant result always implies low observed power. Power analysis is most useful when done in advance to plan sample size.
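
To illustrate why, the sketch below computes observed power from nothing but the p-value. It assumes a two-sided z-test and treats the observed effect as if it were the true effect; the function name `observed_power` is hypothetical.

```python
from statistics import NormalDist

def observed_power(p_value, alpha=0.05):
    """Post-hoc power implied by a two-sided p-value (z-test approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    z_obs = norm.inv_cdf(1 - p_value / 2)  # size of the test statistic implied by p
    # Power if the true effect were exactly equal to the observed one
    return norm.cdf(z_obs - z_crit) + norm.cdf(-z_obs - z_crit)

for p in (0.01, 0.05, 0.20, 0.50):
    print(p, round(observed_power(p), 3))
```

No sample size or effect size enters the function, which is the point: once the p-value is known, observed power is fixed, and a result sitting exactly at p = α has observed power of about one half.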

What is the relationship between sample size and power?

Larger samples increase power because they reduce the standard error, making real effects easier to detect. This is the main reason a priori power analysis focuses on choosing an adequate sample size.
