
What is a confusion matrix?

A confusion matrix is a table that summarises the performance of a classifier by tabulating its predictions against the true labels, showing the counts of true positives, true negatives, false positives, and false negatives.

The four outcomes

For a two-class problem, every prediction falls into one of four categories. A true positive is a positive case correctly identified; a true negative is a negative case correctly identified. A false positive is a negative case wrongly flagged as positive (a "false alarm"), and a false negative is a positive case missed. Arranging these four counts in a table — actual class against predicted class — is the confusion matrix. Its value is that it distinguishes the types of error, which a single accuracy figure hides.
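As a minimal sketch of the tabulation described above, the following counts the four outcomes from paired lists of actual and predicted labels. The function name `confusion_counts`, the label encoding (1 for positive, 0 for negative), and the sample data are all illustrative assumptions, not part of any particular library.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count the four outcomes for a two-class problem (illustrative helper)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # positive case correctly identified
        elif p == positive:
            fp += 1  # negative case wrongly flagged as positive
        elif a == positive:
            fn += 1  # positive case missed
        else:
            tn += 1  # negative case correctly identified
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn}


# Made-up labels for illustration: 1 = positive, 0 = negative.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(actual, predicted)

# Lay the counts out as actual class (rows) against predicted class (columns).
print("            predicted +   predicted -")
print(f"actual +    {counts['tp']:^11}   {counts['fn']:^11}")
print(f"actual -    {counts['fp']:^11}   {counts['tn']:^11}")
# counts is {"tp": 3, "tn": 3, "fp": 1, "fn": 1} for this sample
```

Both errors here count once each, yet they are different kinds of mistake: one false alarm and one missed case.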

Metrics derived from the matrix

Many evaluation metrics are simply ratios of the four cell counts. Writing TP, TN, FP and FN for the counts, accuracy is the proportion of all predictions that are correct: (TP + TN) / (TP + TN + FP + FN). Precision is the proportion of predicted positives that are truly positive, TP / (TP + FP), and recall is the proportion of actual positives that are correctly found, TP / (TP + FN).

The F1 score is the harmonic mean of precision and recall, 2 × precision × recall / (precision + recall), so it is high only when both are high. Because accuracy alone can mislead on imbalanced data, these matrix-derived metrics give a fuller picture of where a classifier succeeds and fails.
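The ratios in this section can be sketched in a few lines. The function name `metrics_from_counts` and the example counts are hypothetical, and returning 0.0 when a denominator is zero is one common convention assumed here, not the only possible choice.

```python
def metrics_from_counts(tp, tn, fp, fn):
    """Derive accuracy, precision, recall and F1 from the four cell counts."""
    total = tp + tn + fp + fn
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for illustration.
m = metrics_from_counts(tp=40, tn=45, fp=5, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 0.85, precision is about 0.889, recall is 0.8, and F1 is about 0.842.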

Why the type of error matters

A confusion matrix matters most when the costs of different errors differ. In a medical screening test, a false negative (missing a real case) may be far more serious than a false positive (a false alarm later ruled out); in a spam filter, the reverse may hold, since a legitimate message sent to the spam folder (a false positive) can cost more than a spam message that reaches the inbox. Reporting overall accuracy hides this. By laying out false positives and false negatives separately, the confusion matrix lets analysts judge a classifier against the real-world consequences of each kind of mistake. It also extends naturally to more than two classes.
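One way to make unequal error costs concrete is to weight each kind of error by an assumed cost. In the sketch below, the two classifiers, their error counts, and the cost values are all invented for illustration; real costs would have to come from the application.

```python
def total_cost(fp, fn, cost_fp, cost_fn):
    """Weight each kind of error by its assumed cost and sum the result."""
    return fp * cost_fp + fn * cost_fn


# Two hypothetical screening classifiers, each making 30 errors on the
# same test set, so their accuracy is identical.
model_a = {"fp": 25, "fn": 5}
model_b = {"fp": 5, "fn": 25}

# Assumed costs: a missed case is taken to be 20 times worse than a false alarm.
COST_FP = 1
COST_FN = 20

cost_a = total_cost(model_a["fp"], model_a["fn"], COST_FP, COST_FN)
cost_b = total_cost(model_b["fp"], model_b["fn"], COST_FP, COST_FN)

print(f"model A cost: {cost_a}")  # 25 * 1 + 5 * 20 = 125
print(f"model B cost: {cost_b}")  # 5 * 1 + 25 * 20 = 505
```

Under these assumed costs model A is clearly preferable, even though accuracy cannot tell the two models apart. Swapping the cost values, as might suit a spam filter, reverses the ranking.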

Confusion matrices in research

In research that uses classification, reporting a confusion matrix — or the metrics derived from it — is standard practice, because a single accuracy number can be misleading, especially when classes are imbalanced. The matrix should be computed on held-out test data, not the training set. Reporting the full matrix lets readers derive whichever metric matters for their context and assess the model's behaviour on each class rather than only in aggregate.
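To illustrate why a single accuracy number can mislead when classes are imbalanced, the sketch below scores a trivial classifier that always predicts the negative class. The data set (5 positives in 100 cases) is made up for the example.

```python
# Made-up imbalanced test set: 5 positive cases, 95 negative cases.
y_true = [1] * 5 + [0] * 95
# A trivial classifier that predicts "negative" for every case.
y_pred = [0] * 100

pairs = list(zip(y_true, y_pred))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)
tn = sum(1 for a, p in pairs if a == 0 and p == 0)
fp = sum(1 for a, p in pairs if a == 0 and p == 1)
fn = sum(1 for a, p in pairs if a == 1 and p == 0)

accuracy = (tp + tn) / len(pairs)
recall = tp / (tp + fn)  # actual positives exist here, so no zero division

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
# accuracy is 0.95 while recall is 0.00: every positive case is missed
```

The accuracy looks strong, but the matrix shows that all five positive cases fall in the false-negative cell.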

Key facts

At a glance

Definition: table of predicted vs actual class counts
Binary cells: true positive, true negative, false positive, false negative
False positive: negative wrongly flagged as positive
False negative: positive case missed
Derived metrics: accuracy, precision, recall, F1
Reveals the type of error, not just the count

Common questions

FAQ

What do true positive and false positive mean?

A true positive is a positive case the classifier correctly identifies. A false positive is a negative case the classifier wrongly labels as positive — a false alarm. The confusion matrix counts both, alongside true negatives and false negatives.

Why is a confusion matrix better than accuracy alone?

Accuracy gives a single figure that can hide important detail, especially with imbalanced classes. A confusion matrix separates the kinds of error — false positives and false negatives — so you can see exactly where a classifier goes wrong and weigh errors by their real-world cost.

Can a confusion matrix be used for more than two classes?

Yes. For multiple classes the matrix becomes a larger square table, with rows for actual classes and columns for predicted classes. The diagonal holds correct predictions and the off-diagonal cells show which classes are confused with which.
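The multi-class layout can be sketched as a nested list, with rows for actual classes and columns for predicted classes. The helper name `build_confusion_matrix`, the class labels, and the sample data are hypothetical.

```python
def build_confusion_matrix(actual, predicted, labels):
    """Return a square table: rows are actual classes, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Made-up three-class example.
labels = ["cat", "dog", "bird"]
actual = ["cat", "cat", "dog", "dog", "bird", "bird", "cat", "dog"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat", "cat", "bird"]

matrix = build_confusion_matrix(actual, predicted, labels)

# The diagonal holds the correct predictions.
correct = sum(matrix[i][i] for i in range(len(labels)))

# Per-class recall: the diagonal cell divided by its row total.
per_class_recall = {
    label: matrix[i][i] / sum(matrix[i]) for i, label in enumerate(labels)
}

for label, row in zip(labels, matrix):
    print(f"{label:>5}: {row}")
print(f"correct: {correct} of {len(actual)}")
```

Here the off-diagonal cell in the "bird" row and "cat" column records one bird that was predicted as a cat, which is the kind of class-to-class confusion the aggregate accuracy does not show.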
