
Outliers in Statistics: Definition, Detection and Principled Handling


An outlier is a data point that lies an unusual distance from the bulk of a distribution. This guide defines outliers, separates measurement error from genuine extremes, and sets out detection methods and principled handling: decisions that you document and report rather than points you delete silently.

By CASRAI Editorial Board

Published 20 Jun 2026 · 4 minute read

An outlier is an observation that lies a markedly unusual distance from the rest of a dataset — far enough that it may distort summary statistics, model fit or test results. Outliers are not automatically errors to be removed; they are signals to be investigated, justified and reported.

Why outliers matter for reproducibility

A single extreme value can inflate a mean, balloon a variance or drag a regression line toward itself, changing a study’s conclusions. Because the decision about whether to keep or exclude such a point is a researcher degree of freedom, undocumented outlier handling is a well-known threat to reproducibility. Transparent reporting of what you found, what you did and why is the antidote.
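The effect is easy to see on a toy example. The sketch below uses made-up measurements (the values and the helper names `summarise` and `ols_slope` are illustrative assumptions, not data from any study) and compares the mean, standard deviation, median and a least-squares slope before and after one extreme point is added.

```python
from statistics import mean, median, stdev

# Hypothetical measurements; the appended value is a deliberately extreme point.
clean = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0]
with_extreme = clean + [25.0]


def summarise(values):
    """Return the mean, sample standard deviation and median of `values`."""
    return {"mean": mean(values), "sd": stdev(values), "median": median(values)}


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


before = summarise(clean)
after = summarise(with_extreme)
for name in ("mean", "sd", "median"):
    print(f"{name:>6}: {before[name]:.2f} -> {after[name]:.2f}")

# The same effect on a fitted line: one distant point pulls the slope upward.
x = [1, 2, 3, 4, 5, 6, 7]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.0]
slope_clean = ols_slope(x, y)
slope_extreme = ols_slope(x + [8], y + [20.0])
print(f" slope: {slope_clean:.2f} -> {slope_extreme:.2f}")
```

In this toy case the mean and standard deviation move substantially while the median barely changes, which is the intuition behind the robust methods discussed later.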

Two causes: error versus genuine extreme

Outliers arise from two broad sources, and the cause dictates the response.

Error outliers come from data-entry mistakes, instrument faults, unit mix-ups or sampling problems. A recorded human age of 250 years is an error. These can legitimately be corrected or excluded once verified.
Genuine extremes are real but unusual observations — a true high earner in an income survey, a rare strong responder in a trial. These carry information and should generally be retained, possibly with a robust analysis.

The crucial point is that you cannot tell the two apart from the number alone. Investigation of the source — the raw record, the instrument log, the data-collection notes — is what separates them.

Detection methods

Several established methods flag candidate outliers. None is definitive; each makes assumptions and each has a different sensitivity. Visual inspection should always accompany any rule.

Method	How it works	Best suited to
Z-score	Flags points that lie more than a threshold number of standard deviations from the mean (commonly 3)	Roughly normal, larger samples
IQR / boxplot	Flags points beyond Q1 − 1.5×IQR or Q3 + 1.5×IQR	Skewed data; robust, distribution-light
Grubbs’ test	Formal hypothesis test for a single outlier in a normal sample	One suspected outlier, normality assumed
Modified z-score (MAD)	Uses the median and median absolute deviation, resisting masking	Small samples or multiple outliers

The z-score is intuitive but breaks down precisely when it matters most: a strong outlier inflates the standard deviation and can mask itself. The IQR rule, built on quartiles, is more robust and makes few distributional assumptions, which is why the boxplot remains the everyday workhorse. Grubbs’ test offers a formal, probabilistic answer when a single outlier is suspected in approximately normal data. Robust alternatives based on the median and MAD resist the masking and swamping that trip up mean-based rules.
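Three of these rules can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names and sample data are hypothetical, the quartiles use the `inclusive` method of `statistics.quantiles` (other quartile definitions give slightly different fences), and the modified z-score uses the widely used convention of a 0.6745 scaling constant with a cut-off of 3.5, which is an assumption not stated in the table above. Grubbs' test is omitted because it needs critical values from the t distribution.

```python
from statistics import mean, median, quantiles, stdev


def zscore_flags(values, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) / s > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag values below Q1 - k*IQR or above Q3 + k*IQR (the boxplot rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def modified_zscore_flags(values, threshold=3.5):
    """Flag values using the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the modified z-score is undefined")
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


# Hypothetical small sample with one obvious extreme value (the last one).
sample = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 25.0]

print("z-score :", zscore_flags(sample))
print("IQR     :", iqr_flags(sample))
print("MAD     :", modified_zscore_flags(sample))
```

On this sample the z-score rule flags nothing, because the extreme value inflates the standard deviation enough to keep its own z-score under 3, while the IQR and MAD rules both flag only the last point. With eight observations, no sample z-score can exceed roughly 2.47, so a threshold of 3 can never fire at that sample size.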

Principled handling: never delete silently

The cardinal rule is that you do not quietly drop inconvenient points. A defensible workflow looks like this:

1. Detect and flag candidates using a pre-specified rule, ideally chosen before seeing the results.
2. Investigate the source to classify each as error or genuine extreme.
3. Decide and document: correct verified errors, retain genuine extremes, and record every decision with its rationale.
4. Report sensitivity: run the analysis with and without the contested points and show whether conclusions change.
5. Prefer robust methods where extremes are genuine, such as medians, trimmed means or rank-based tests, instead of deletion.
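The sensitivity and robust-method steps above can be sketched as follows. The data, the helper names `trimmed_mean` and `sensitivity_report`, and the 10% trimming proportion are illustrative assumptions; a real analysis would use the estimator and trimming level fixed in its own pre-specified plan.

```python
from statistics import mean, median


def trimmed_mean(values, proportion=0.1):
    """Mean after dropping the lowest and highest `proportion` of the values."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values trimmed from each end
    return mean(ordered[k:len(ordered) - k])


def sensitivity_report(values, contested_indices, estimator=mean):
    """Compute an estimate with and without the contested points."""
    contested = set(contested_indices)
    kept = [v for i, v in enumerate(values) if i not in contested]
    with_all = estimator(values)
    without = estimator(kept)
    return {
        "with_contested": with_all,
        "without_contested": without,
        "difference": with_all - without,
        "n_contested": len(contested),
    }


# Hypothetical incomes in thousands; index 9 is a genuine but extreme earner.
incomes = [31, 35, 38, 40, 42, 45, 47, 52, 58, 410]

mean_report = sensitivity_report(incomes, contested_indices=[9], estimator=mean)
median_report = sensitivity_report(incomes, contested_indices=[9], estimator=median)

print("mean   :", mean_report)
print("median :", median_report)
print("10% trimmed mean, all points kept:", trimmed_mean(incomes))
```

Here the mean shifts by tens of thousands when the contested point is dropped, while the median moves by 1.5. Reporting both rows lets a reader see how much the conclusion depends on the decision, and the trimmed mean shows a robust summary that needs no deletion decision at all.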

Pre-registering the outlier rule removes the temptation to choose a definition that produces a desired result. For more on transparent analysis decisions see our reproducibility coverage and the CASRAI dictionary. Software choices also shape how outliers are detected and reported — see our review of statistical software in research.

Frequently asked questions

Should I always remove outliers?

No. Removing outliers by default is one of the most common analytic errors. Verified data-entry errors can be corrected or excluded, but genuine extreme values usually contain information and should be retained, often with a robust method. Always report what you did either way.

Which detection method is best?

There is no universal best. The IQR/boxplot rule is robust and assumption-light for skewed data; the z-score suits larger, roughly normal samples; Grubbs’ test is appropriate for a single suspected outlier under normality. Combine a numeric rule with a plot.

How do I report outlier handling?

State the detection rule, how many points were flagged, how each was classified, what action was taken and why, and the result of a sensitivity analysis with and without them. This level of detail is what makes the analysis reproducible. Our author guidance covers transparent methods reporting.
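One way to keep those details together is a small decision log that is written as points are classified and then summarised for the methods section. The sketch below is only one possible layout: the `OutlierDecision` fields, the record identifiers and the example entries are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class OutlierDecision:
    record_id: str
    value: float
    classification: str  # "error" or "genuine extreme"
    action: str          # "corrected", "excluded" or "retained"
    rationale: str


def summarise_decisions(rule, decisions):
    """Build a plain-text summary of the rule, counts and per-point decisions."""
    lines = [f"Detection rule: {rule}", f"Points flagged: {len(decisions)}"]
    counts = Counter((d.classification, d.action) for d in decisions)
    for (classification, action), n in sorted(counts.items()):
        lines.append(f"  {classification}, {action}: {n}")
    for d in decisions:
        lines.append(f"  {d.record_id}: value {d.value}, {d.action} ({d.rationale})")
    return "\n".join(lines)


# Hypothetical log entries for illustration only.
log = [
    OutlierDecision("R-017", 250.0, "error", "excluded",
                    "age of 250 years; raw record confirms a data-entry mistake"),
    OutlierDecision("R-102", 410.0, "genuine extreme", "retained",
                    "income verified against source; analysed with the median"),
]

summary = summarise_decisions("Q1 - 1.5*IQR / Q3 + 1.5*IQR, pre-specified", log)
print(summary)
```

The sensitivity result (the estimate with and without the contested points) would be reported alongside this summary.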

Do outliers affect meta-analyses too?

Yes. An aberrant study can dominate a pooled estimate just as a point dominates a sample. Sensitivity and influence analyses are standard, as discussed in our explainer on systematic reviews versus meta-analyses.
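A leave-one-out influence analysis makes this concrete. The sketch below assumes a fixed-effect, inverse-variance pooled estimate, which is only one of several pooling models, and uses invented effect sizes and standard errors; the function names are hypothetical.

```python
def pooled_fixed_effect(effects, std_errors):
    """Inverse-variance weighted mean of the study effects (fixed-effect model)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, std_errors):
    """Pooled estimate recomputed with each study omitted in turn."""
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_errors = std_errors[:i] + std_errors[i + 1:]
        results.append(pooled_fixed_effect(rest_effects, rest_errors))
    return results


# Hypothetical study-level effects and standard errors; the last study is aberrant.
effects = [0.20, 0.25, 0.15, 0.22, 1.10]
std_errors = [0.10, 0.12, 0.11, 0.10, 0.08]

overall = pooled_fixed_effect(effects, std_errors)
omitted = leave_one_out(effects, std_errors)

print(f"all studies: {overall:.3f}")
for i, estimate in enumerate(omitted):
    print(f"without study {i + 1}: {estimate:.3f} (shift {estimate - overall:+.3f})")
```

In this invented example, omitting the last study moves the pooled estimate far more than omitting any other, which marks it as the study to investigate and report on, not one to drop automatically.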
