Analysis of variance (ANOVA) is a statistical method that tests whether the means of three or more groups differ by more than would be expected from random variation alone. It does this by comparing the variance between group means against the variance within groups, summarised in a single F-statistic. ANOVA is one of the most widely used inferential tests in experimental research, and reporting it transparently is central to reproducible analysis.
Why ANOVA instead of multiple t-tests?
A t-test compares two group means. When you have three or more groups, it is tempting to run a separate t-test for every pair. The problem is the family-wise error rate: each test carries its own chance of a false positive, and those chances accumulate. With three groups there are three pairwise comparisons; at a 5% significance level the probability of at least one false positive rises to roughly 14% (1 − 0.95³ ≈ 0.143, treating the three tests as independent), and it climbs further as groups are added. ANOVA avoids this by performing a single omnibus test that asks one question: are any of the group means different?
This control of error is why ANOVA underpins so much of experimental design. For a refresher on what significance thresholds mean in practice, see our explainer on p-values and statistical significance.
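The 14% figure can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `familywise_error_rate` is our own, and it assumes the pairwise tests are independent, which is a simplification because pairwise t-tests that share a group are correlated.

```python
from math import comb


def familywise_error_rate(n_groups: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across all pairwise tests.

    Assumes the tests are independent (a simplification).
    """
    n_comparisons = comb(n_groups, 2)  # number of distinct pairs
    return 1 - (1 - alpha) ** n_comparisons


for k in (3, 4, 5, 6):
    rate = familywise_error_rate(k)
    print(f"{k} groups, {comb(k, 2)} comparisons: {rate:.1%}")
```

With three groups this gives about 14.3%, and the rate keeps rising as the number of pairs grows.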
The F-statistic and how it works
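ANOVA partitions the total variability in the data into two components. The between-groups variance reflects how far each group mean sits from the overall (grand) mean. The within-groups variance reflects the natural spread of observations inside each group. Each is a mean square: a sum of squared deviations divided by its degrees of freedom (k − 1 between groups and N − k within groups, for k groups and N observations in total). The F-statistic is the ratio of these two: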
F = between-groups variance / within-groups variance
If the groups truly share a common mean, both quantities estimate the same underlying variability and F sits near 1. When real differences exist, the between-groups term grows and F rises. A large F, evaluated against the F-distribution with the appropriate degrees of freedom, yields a small p-value and signals that at least one mean differs.
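To make the ratio concrete, here is a minimal sketch that computes F by hand for a one-way design using only the Python standard library. The function name `one_way_f` and the three small groups are made up for illustration; in practice a statistics package would also supply the p-value from the F-distribution.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)
    n_total = len(all_values)

    # Between-groups sum of squares: group means versus the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: observations versus their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Illustrative data only (not from a real study).
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
f_stat, df1, df2 = one_way_f(groups)
print(f"F({df1}, {df2}) = {f_stat:.2f}")  # F(2, 6) = 19.00
```

Here the between-groups mean square is 19 and the within-groups mean square is 1, so F is far above 1, as expected when the group means are well separated relative to the spread inside each group.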
One-way versus two-way ANOVA
The design depends on how many factors you are manipulating.
| Feature | One-way ANOVA | Two-way ANOVA |
|---|---|---|
| Number of factors | One independent variable | Two independent variables |
| Example question | Does diet type affect plant growth? | Do diet type and watering frequency affect plant growth? |
| Main effects | One | Two (one per factor) |
| Interaction | Not assessed | Tests whether factors combine non-additively |
| Output | Single F-statistic | One F-statistic per main effect, plus one for the interaction |
The key advantage of two-way ANOVA is the interaction effect: it reveals whether the influence of one factor depends on the level of another, something separate analyses would miss.
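One way to see what an interaction means is to compare the effect of one factor at each level of the other. The sketch below uses a hypothetical 2x2 version of the plant example with made-up measurements. It is purely descriptive: it shows the pattern that a two-way ANOVA would then test against within-cell variability.

```python
from statistics import mean

# Hypothetical growth measurements (cm); values are invented for illustration.
cells = {
    ("diet A", "low water"): [10, 11, 12],
    ("diet A", "high water"): [12, 13, 14],
    ("diet B", "low water"): [11, 12, 13],
    ("diet B", "high water"): [17, 18, 19],
}
cell_means = {key: mean(values) for key, values in cells.items()}

# Effect of switching from diet A to diet B at each watering level.
diet_effect_low = cell_means[("diet B", "low water")] - cell_means[("diet A", "low water")]
diet_effect_high = cell_means[("diet B", "high water")] - cell_means[("diet A", "high water")]

# If the factors were purely additive, the two effects would match
# and this difference of differences would be zero.
interaction_contrast = diet_effect_high - diet_effect_low

print("diet effect at low water:", diet_effect_low)
print("diet effect at high water:", diet_effect_high)
print("interaction contrast:", interaction_contrast)
```

In this toy data the diet effect is much larger under high watering than under low watering, which is exactly the kind of dependence that separate one-way analyses would not reveal.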
Assumptions you must check
ANOVA rests on three core assumptions. Observations should be independent. The residuals should be approximately normally distributed. And the groups should show roughly equal variances, a property called homogeneity of variance (homoscedasticity). When variances differ markedly, a Welch ANOVA is a robust alternative; when normality fails, a non-parametric Kruskal-Wallis test may be more appropriate. Stating which assumptions were tested, and how, is good practice and supports replication, as we discuss across our reproducibility coverage.
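As a rough first look at the equal-variance assumption, you can compare the sample variances of the groups. The sketch below is an informal screen, not a formal test: the helper name `variance_ratio` and the data are our own, and any cut-off you apply to the ratio is a judgment call rather than a standard.

```python
from statistics import variance


def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups.

    Assumes every group has at least two observations and non-zero variance.
    """
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Illustrative data only: the third group is visibly more spread out.
groups = [[4, 5, 6], [6, 7, 8], [7, 10, 13]]
print(f"largest / smallest variance: {variance_ratio(groups):.1f}")
```

A ratio close to 1 is consistent with homogeneity of variance. A large ratio, as here, suggests following up with a formal check such as Levene's test and considering the Welch ANOVA mentioned above.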
Post-hoc tests: locating the difference
A significant ANOVA tells you that some mean differs, but not which one. Post-hoc tests answer that follow-up while still controlling the family-wise error rate. Tukey’s HSD is the standard choice for all pairwise comparisons with equal sample sizes (the Tukey-Kramer variant handles unequal sizes); Bonferroni correction is conservative and simple; Scheffé’s test is flexible for complex contrasts. Crucially, you should not revert to uncorrected t-tests after a significant ANOVA, as that reintroduces the inflated error the test was designed to prevent.
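Of the methods above, Bonferroni is the simplest to show in code: each raw p-value is multiplied by the number of comparisons and capped at 1. The sketch below is a minimal illustration. The function name `bonferroni` and the raw p-values are hypothetical placeholders, not results from any real analysis.

```python
def bonferroni(p_values: dict, alpha: float = 0.05) -> dict:
    """Map each comparison label to (adjusted p-value, significant?)."""
    m = len(p_values)  # number of comparisons in the family
    results = {}
    for label, p in p_values.items():
        p_adjusted = min(1.0, p * m)
        results[label] = (p_adjusted, p_adjusted < alpha)
    return results


# Hypothetical raw p-values from three pairwise comparisons.
raw = {"A vs B": 0.010, "A vs C": 0.030, "B vs C": 0.200}
for label, (p_adj, significant) in bonferroni(raw).items():
    print(f"{label}: adjusted p = {p_adj:.3f}, significant = {significant}")
```

Note that "A vs C" would pass an uncorrected 5% threshold but does not survive the correction, which is the conservatism referred to above.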
Equally important, statistical significance does not measure how large a difference is. Always pair ANOVA results with an effect size such as eta-squared (the proportion of total variability attributable to the grouping factor), as covered in our companion piece on why effect size matters beyond significance. Authors planning a study should also budget adequate sample size and statistical power so a real effect can actually be detected.
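For a one-way design, eta-squared is the between-groups sum of squares divided by the total sum of squares. A minimal sketch follows, with a function name (`eta_squared`) and data of our own choosing.

```python
from statistics import mean


def eta_squared(groups):
    """Between-groups sum of squares as a share of the total sum of squares."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total


# Illustrative data only (not from a real study).
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
print(f"eta-squared = {eta_squared(groups):.2f}")  # eta-squared = 0.86
```

In this toy example about 86% of the total variability lies between groups, an unrealistically large effect chosen only to keep the arithmetic easy to follow.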
Frequently asked questions
What does a significant ANOVA result actually tell you?
It tells you that at least one group mean differs from the others by more than chance would explain. It does not identify which groups differ or how large the difference is; you need post-hoc tests and effect sizes to answer those questions.
Can ANOVA be used for only two groups?
Yes. With two groups a one-way ANOVA gives results mathematically equivalent to an independent-samples t-test that assumes equal variances: F equals t squared, and the two-sided p-values are identical. ANOVA’s real value appears with three or more groups, where it prevents the error inflation of multiple t-tests.
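The equivalence is easy to verify numerically. The sketch below computes a pooled-variance t-statistic and a one-way F-statistic for the same two made-up groups. Both helper functions are our own illustrative implementations, not a library API.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t-statistic assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def one_way_f(groups):
    """One-way ANOVA F-statistic: between-groups over within-groups mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Illustrative data only.
a, b = [4, 5, 6], [6, 7, 8]
t_stat = pooled_t(a, b)
f_stat = one_way_f([a, b])
print(f"t squared = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

Both values come out at 6 for this data. The match relies on the pooled-variance form of the t-test; a Welch t-test, which does not pool variances, will generally not satisfy the identity.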
What is the difference between a main effect and an interaction?
A main effect is the overall influence of one factor averaged across the others. An interaction means the effect of one factor changes depending on the level of another. Detecting interactions is the principal reason to use two-way rather than one-way designs.
How should ANOVA results be reported for reproducibility?
Report the F-statistic with both degrees of freedom, the p-value, an effect size, the post-hoc method used, and confirmation that assumptions were checked. The CASRAI dictionary and our guidance for authors set out the metadata that makes such results auditable.







