Definition · Plain-language
Central limit theorem
The central limit theorem states that, as sample size increases, the sampling distribution of the sample mean approaches a normal distribution, whatever the shape of the population the samples are drawn from, provided the observations are independent and the population has a finite variance.
The sampling distribution of the mean
Imagine drawing many random samples of the same size from a population and calculating the mean of each. Those sample means form their own distribution, called the sampling distribution of the mean. The central limit theorem describes its shape: as the sample size n increases, this distribution of sample means becomes approximately normal, even if the population itself is skewed, bimodal or otherwise non-normal. The sample means centre on the true population mean μ, and their spread — the standard error — equals σ / √n, where σ is the population standard deviation. Larger samples therefore give a tighter, more normal distribution of means.
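The thought experiment above can be tried as a small simulation. This is a minimal sketch, not a definitive implementation: the choice of an exponential population with μ = 1 and σ = 1, the helper name sample_means, the number of samples and the seed are all assumptions made for illustration.

```python
import random
import statistics

MU, SIGMA = 1.0, 1.0  # assumed population: Exponential(1), which has mean 1 and sd 1


def sample_means(n, num_samples=4000, seed=0):
    """Draw num_samples samples of size n and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


for n in (5, 30, 200):
    means = sample_means(n)
    centre = statistics.fmean(means)   # should sit close to MU
    spread = statistics.stdev(means)   # should sit close to SIGMA / sqrt(n)
    print(
        f"n={n:>3}  mean of means={centre:.3f}  "
        f"sd of means={spread:.3f}  sigma/sqrt(n)={SIGMA / n ** 0.5:.3f}"
    )
```

For each n, the mean of the sample means should land near μ, and their standard deviation should track σ / √n, shrinking as n grows.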
Why population shape stops mattering
The striking part of the theorem is its independence from the population’s shape. Whether the original data are uniform, exponential, heavily skewed or lumpy, the distribution of the sample mean still tends towards normality as n grows. A common rule of thumb is that a sample size of around 30 or more makes the approximation good for many distributions, though more strongly skewed populations need larger samples. This is what allows researchers to use the normal distribution to reason about sample means even when they know little about the underlying population, provided observations are independent and identically distributed and the population variance is finite.
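One way to watch the population shape fade is to track the skewness of the sample means as n grows. The sketch below assumes a strongly right-skewed exponential population; the helper names skewness and mean_skewness, the sample counts and the seed are illustrative choices, not part of the theorem.

```python
import random
import statistics


def skewness(values):
    """Moment-based skewness: about 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def mean_skewness(n, num_samples=4000, seed=1):
    """Skewness of the simulated sampling distribution of the mean for sample size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return skewness(means)


for n in (2, 30, 100):
    print(f"n={n:>3}  skewness of sample means={mean_skewness(n):.2f}")
```

The exponential population itself has a skewness of 2. The skewness of the sample means should fall towards 0 as n increases, which is the sense in which a more skewed population needs a larger n before the normal approximation is comfortable.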
Why it matters for inference
The central limit theorem is the foundation of much of inferential statistics. Because the sampling distribution of the mean is approximately normal, researchers can build confidence intervals and conduct hypothesis tests about a population mean using the normal distribution (or the closely related t-distribution when σ is estimated from the sample). It explains why these procedures remain approximately valid for reasonably large samples drawn from non-normal populations. Without the theorem, inference about a mean would depend far more heavily on knowing the exact shape of the population. With it, the well-understood normal curve provides a reliable approximation, making the theorem one of the most important results in statistics.
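The claim that normal-based intervals still work for non-normal populations can be checked by simulation. The following is a rough sketch under stated assumptions: an exponential population with a true mean of 1, a 95% interval built from the sample standard deviation and the normal quantile, and a hypothetical helper named coverage.

```python
import random
import statistics

# Two-sided 95% normal quantile, roughly 1.96
Z = statistics.NormalDist().inv_cdf(0.975)


def coverage(n, trials=2000, true_mean=1.0, seed=2):
    """Fraction of normal-based 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # expovariate takes the rate, so the population mean is true_mean
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(n)]
        centre = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5  # estimated standard error
        if centre - Z * se <= true_mean <= centre + Z * se:
            hits += 1
    return hits / trials


for n in (5, 50, 500):
    print(f"n={n:>3}  share of intervals covering the true mean={coverage(n):.3f}")
```

Coverage should move towards the nominal 95% as n grows. For small samples from a skewed population it typically falls short, which is why the rule of thumb about sample size matters in practice.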
Key facts
At a glance
- Definition: the sampling distribution of the mean approaches a normal distribution as n grows
- Applies to: the distribution of sample means, not the raw data themselves
- Centre: the sample means centre on the population mean μ
- Spread: the standard error of the mean equals σ / √n
- Rule of thumb: n ≥ 30 is adequate for many distributions; strongly skewed populations need larger samples
- Why it matters: underpins confidence intervals and hypothesis tests for the mean
Common misconceptions
What people often get wrong
Often heard: The central limit theorem says any large dataset becomes normally distributed.
Actually: It does not. The raw data keep their original shape. It is the distribution of the sample mean across many samples that approaches normality as the sample size increases.
Often heard: The central limit theorem only works if the population is already normal.
Actually: The opposite is true, and that is its power: it holds whatever the population shape (skewed, uniform or bimodal), provided the sample is large enough, the observations are independent and the population variance is finite.
Often heard: A bigger sample makes the data themselves less variable.
Actually: Increasing n reduces the variability of the sample mean (the standard error shrinks as σ / √n), not the spread of the underlying data, which reflects the population.
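The last distinction, between the spread of the data and the spread of the mean, can be seen side by side in a short sketch. It assumes an exponential population with σ = 1; the helper name describe and the seed are illustrative.

```python
import random
import statistics

rng = random.Random(3)


def describe(n):
    """Return (sample standard deviation, estimated standard error) for one sample of size n."""
    sample = [rng.expovariate(1.0) for _ in range(n)]  # assumed population with sd 1
    sd = statistics.stdev(sample)
    return sd, sd / n ** 0.5


for n in (100, 10_000):
    sd, se = describe(n)
    print(f"n={n:>6}  sd of the data={sd:.3f}  standard error of the mean={se:.4f}")
```

The standard deviation of the data should stay near the population value of 1 at both sample sizes, while the standard error of the mean shrinks by roughly a factor of ten when n grows a hundredfold.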







