The normal distribution, also called the Gaussian distribution, is a continuous probability distribution that is symmetric about its mean and forms a bell-shaped curve. It is fully described by two parameters: the mean, which locates the centre of the curve, and the standard deviation, which controls its width. Most values lie near the mean, and values become increasingly rare as they move further away in either direction.
Shape, symmetry and parameters
A normal curve is perfectly symmetric, so its mean, median and mode coincide at the centre. The two tails extend infinitely in both directions, approaching but never touching the horizontal axis. Changing the mean shifts the curve left or right; changing the standard deviation stretches or compresses it. A larger standard deviation produces a flatter, wider bell; a smaller one produces a taller, narrower peak.
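The effect of the two parameters can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `normal_pdf` and the example values are illustrative assumptions, not part of any particular package. It evaluates the normal density, compares the peak height for a small and a large standard deviation, and confirms that points equally far above and below the mean have the same density.

```python
import math


def normal_pdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Density of a normal distribution with the given mean and standard deviation."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Peak height at the mean for a narrow curve and a wide curve (illustrative values).
peak_narrow = normal_pdf(0.0, mean=0.0, sd=0.5)
peak_wide = normal_pdf(0.0, mean=0.0, sd=2.0)

# Symmetry: equal distances either side of the mean give equal densities.
left = normal_pdf(10.0 - 1.5, mean=10.0, sd=2.0)
right = normal_pdf(10.0 + 1.5, mean=10.0, sd=2.0)

print(f"peak with sd=0.5: {peak_narrow:.4f}")
print(f"peak with sd=2.0: {peak_wide:.4f}")
print(f"density 1.5 below and above the mean: {left:.4f}, {right:.4f}")
```

The peak height is inversely proportional to the standard deviation, which is why the narrower curve is also the taller one: the total area under every normal curve is 1.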
The 68-95-99.7 rule
For any normal distribution, a fixed proportion of values falls within a given number of standard deviations of the mean. This is known as the empirical rule, or the 68-95-99.7 rule.
| Within | Approximate proportion |
|---|---|
| ±1 standard deviation | 68% |
| ±2 standard deviations | 95% |
| ±3 standard deviations | 99.7% |
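The figures in the table are rounded: the exact proportions are about 68.27%, 95.45% and 99.73%, and the interval that contains exactly 95% of values is ±1.96 standard deviations, which is the multiplier used in the familiar 95% confidence interval. The rule also informs the identification of outliers, since only about 0.3% of values lie more than three standard deviations from the mean under normality.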
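The proportions behind the rule can be reproduced from the error function, which is available in the Python standard library. This is a short sketch rather than a reference implementation; the helper name `proportion_within` is an assumption made for illustration.

```python
import math


def proportion_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    # For a standard normal Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within ±{k} standard deviations: {proportion_within(k):.2%}")
```

Because the result depends only on the number of standard deviations, the same proportions apply to every normal distribution, whatever its mean and standard deviation.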
The central limit theorem
The normal distribution is central to statistics largely because of the central limit theorem. This theorem states that the sampling distribution of the mean of a sufficiently large number of independent, identically distributed observations is approximately normal, regardless of the shape of the underlying population, provided the population has a finite variance. In practice, sample means move closer to normality as sample size increases. A common rule of thumb is that n = 30 is often adequate for moderately skewed data, although strongly skewed or heavy-tailed data can need considerably more. This is why many tests that compare means, such as the t-test, can be applied even when the raw data are not perfectly normal.
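A small simulation can make the theorem concrete. The sketch below assumes an exponential population with mean 1, chosen only because it is clearly skewed; the sample sizes, the number of repeated samples and the seed are likewise arbitrary choices for illustration. It draws repeated samples, records each sample mean, and reports how the spread and skewness of those means change with n.

```python
import random
import statistics


def sample_means(n, draws, rng):
    """Means of `draws` independent samples, each of size n, from an exponential population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(draws)
    ]


def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
summary = {}
for n in (1, 5, 30, 200):
    means = sample_means(n, draws=2000, rng=rng)
    summary[n] = (statistics.fmean(means), statistics.stdev(means), skewness(means))
    print(f"n={n:>3}  mean={summary[n][0]:.3f}  "
          f"sd={summary[n][1]:.3f}  skewness={summary[n][2]:.2f}")
```

As n grows, the standard deviation of the sample means should shrink roughly in proportion to 1/√n and their skewness should move towards 0, which is the behaviour the theorem describes.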
Why it matters for inference
Because probabilities under a normal curve can be calculated exactly, the distribution provides the mathematical basis for many inferential procedures, including significance tests and the calculation of p-values. Standardising a value into a z-score, by subtracting the mean and dividing by the standard deviation, lets researchers compare observations on a common scale and look up the corresponding probabilities from the standard normal distribution.
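As an illustration of standardising, the sketch below converts a single observation into a z-score and looks up its probabilities with `statistics.NormalDist` from the Python standard library. The mean, standard deviation and observed value are hypothetical numbers invented for the example, not figures from any dataset.

```python
from statistics import NormalDist


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical values chosen for illustration only.
mean_cm = 170.0
sd_cm = 10.0
observed_cm = 185.0

z = z_score(observed_cm, mean_cm, sd_cm)
standard_normal = NormalDist(mu=0.0, sigma=1.0)

below = standard_normal.cdf(z)                          # P(Z <= z)
above = 1.0 - below                                     # upper-tail probability
two_sided = 2.0 * (1.0 - standard_normal.cdf(abs(z)))   # two-sided p-value for this z

print(f"z = {z:.2f}")
print(f"proportion below: {below:.4f}")
print(f"proportion above: {above:.4f}")
print(f"two-sided p-value: {two_sided:.4f}")
```

The two-sided figure is the probability of a value at least this far from the mean in either direction, which is how a z-score is usually turned into a p-value when the normal model is appropriate.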
What is and is not normally distributed
Many measurements approximate a normal distribution, including heights, blood pressure and measurement errors. However, normality should not be assumed without checking. Reaction times, incomes and counts of rare events are typically skewed, and some variables are bounded or bimodal. Always check the distribution using histograms or quantile-quantile plots before applying methods that assume normality. Defining variables and their distributions clearly supports the reproducibility standards set out in the CASRAI dictionary and our guidance for authors.
Frequently asked questions
What is the difference between the normal and standard normal distribution?
The standard normal distribution is a special case with a mean of 0 and a standard deviation of 1. Any normal distribution can be converted to the standard normal by calculating z-scores.
Does my data have to be normal to use statistics?
Not always. Thanks to the central limit theorem, tests based on means are robust to non-normality at larger sample sizes. For small samples or strongly skewed data, non-parametric alternatives or transformations may be more appropriate.
How can I check whether data are normally distributed?
Use graphical tools such as histograms and quantile-quantile plots, supplemented by formal tests like Shapiro-Wilk. Visual inspection is often the most informative, as formal tests can be over-sensitive in large samples.
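The idea behind a quantile-quantile plot can be sketched without a plotting library: sort the data, pair each value with the corresponding quantile of the standard normal distribution, and see how close the pairs are to a straight line. The example below uses only the Python standard library. The plotting positions (i − 0.5)/n are one common convention assumed here, the correlation of the points is used as a rough numerical stand-in for visual inspection, and both samples are simulated for illustration. It is not a substitute for a formal test such as Shapiro-Wilk, which statistical packages provide.

```python
import math
import random
from statistics import NormalDist, fmean


def qq_points(data):
    """Pairs of (theoretical standard-normal quantile, observed value)."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n: an assumed, commonly used convention.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def qq_correlation(data):
    """Pearson correlation of the Q-Q points; near 1 when they lie on a straight line."""
    points = qq_points(data)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(7)  # fixed seed so the sketch is repeatable
normal_sample = [rng.gauss(50.0, 5.0) for _ in range(500)]   # simulated, roughly normal
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]   # simulated, right-skewed

print(f"normal sample Q-Q correlation: {qq_correlation(normal_sample):.4f}")
print(f"skewed sample Q-Q correlation: {qq_correlation(skewed_sample):.4f}")
```

A correlation very close to 1 is consistent with normality, while a clearly lower value indicates curvature in the plot. Plotting the points returned by `qq_points` remains the more informative check, because it shows where the departure occurs, for example in one tail only.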







