v2026.1714 entries · CC-BY 4.0
CASRAI

Definition · Plain-language

Correlation coefficient

A correlation coefficient measures the strength and direction of the linear relationship between two variables; Pearson’s r ranges from −1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive).

The −1 to +1 scale

A correlation coefficient is a single number summarising how two variables move together. Pearson’s r, the most widely used, is bounded between −1 and +1. The sign indicates direction: a positive value means the variables rise together, while a negative value means one falls as the other rises. The magnitude indicates strength: values near ±1 indicate a tight linear relationship in which the points lie close to a straight line, while values near 0 indicate little or no linear relationship. An r of exactly +1 or −1 means the points fall perfectly on a line; an r of 0 means no straight-line trend at all.
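As a rough illustration of the scale described above, the sketch below computes Pearson’s r from its textbook definition: the sum of products of deviations divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the small data sets are made up for this example; they are not part of any CASRAI material.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around the two means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]  # illustrative data only
perfect_up = pearson_r(xs, [2 * x + 1 for x in xs])     # points on a rising line
perfect_down = pearson_r(xs, [10 - 3 * x for x in xs])  # points on a falling line
noisy_up = pearson_r(xs, [2, 1, 4, 3, 5])               # rising trend with scatter
```

With these inputs, perfect_up and perfect_down sit at the two ends of the scale (+1 and −1), while noisy_up lands at 0.8: still positive, but the points no longer lie exactly on a line.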

Judging strength and direction

Interpreting the magnitude depends on context and discipline, but a common informal guide treats values around ±0.1 as weak, ±0.3 as moderate and ±0.5 or above as strong, with the social sciences often tolerating smaller correlations than the physical sciences. The crucial point is that strength and direction are read separately: an r of −0.8 describes a strong relationship, just a negative one, and is stronger than an r of +0.4. Because r measures only the linear component, it should always be interpreted alongside a scatterplot, which reveals curvature, clusters or influential points that a single number conceals.
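One way to make the "read strength and direction separately" point concrete is a small helper that takes the sign for direction and the absolute value for strength. This is only a sketch: the function name describe_r is hypothetical, the cut-offs simply turn the informal guide above into hard boundaries, and the "negligible" label for magnitudes below 0.1 is an assumption that the guide itself does not name.

```python
def describe_r(r):
    """Return (direction, strength) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    # Direction comes from the sign alone
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    # Strength comes from the magnitude alone
    size = abs(r)
    if size >= 0.5:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    elif size >= 0.1:
        strength = "weak"
    else:
        strength = "negligible"  # assumed label, not part of the informal guide
    return direction, strength

strong_negative = describe_r(-0.8)
moderate_positive = describe_r(0.4)
```

Here strong_negative is ("negative", "strong") and moderate_positive is ("positive", "moderate"), which mirrors the comparison in the text. In practice the boundaries would be chosen to suit the discipline rather than fixed in code.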

The coefficient of determination, r²

Squaring the correlation coefficient gives r², the coefficient of determination. Conceptually, r² represents the proportion of the variance in one variable that is shared with, or statistically explained by, the other in a linear relationship. An r of 0.7, for instance, gives r² = 0.49, meaning about 49% of the variation is shared. Because squaring removes the sign, r² is always between 0 and 1 and says nothing about direction. It usefully tempers interpretation: a respectable-looking r of 0.5 corresponds to only r² = 0.25, so just a quarter of the variation is accounted for.
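The "proportion of variance explained" reading can be checked numerically. The sketch below, using made-up data, squares r and compares it with the share of variation in y that a least-squares straight line accounts for; for a simple two-variable linear fit the two quantities coincide.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]  # illustrative data only

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y

r = sxy / (sxx * syy) ** 0.5
r_squared = r ** 2

# Least-squares line: y = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x
ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Share of the variation in y that the fitted line accounts for
explained = 1 - ss_residual / syy

print(round(r, 2), round(r_squared, 2), round(explained, 2))  # prints 0.8 0.64 0.64
```

Even with r = 0.8, roughly a third of the variation in y is left unexplained by the line, which is the tempering effect the paragraph above describes.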

Linear only, and not causation

Two important limits govern the correlation coefficient. First, Pearson’s r captures only linear relationships. A strong but curved relationship can yield an r near zero, so a low coefficient does not prove the variables are unrelated — only that they are not linearly related. Second, and most fundamentally, correlation does not imply causation. A high correlation may arise because one variable causes the other, because both are driven by a confounding third variable, or by coincidence. Establishing cause requires controlled study designs, not a correlation coefficient alone, however large it is.
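The first limit is easy to demonstrate with a deliberately curved example. In the sketch below, y is completely determined by x (y = x²), yet Pearson’s r over a range symmetric around zero comes out as 0. The helper pearson_r and the data are illustrative assumptions, not a reference implementation.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviations around the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # U-shaped: y depends entirely on x

r_curved = pearson_r(xs, ys)              # whole U: falling and rising halves cancel
r_right_half = pearson_r(xs[3:], ys[3:])  # x >= 0 only: a steadily rising curve
```

Over the full range r_curved is 0, because the falling and rising halves cancel. Restricting the data to x ≥ 0 gives r_right_half above 0.9, even though the underlying relationship has not changed. A scatterplot would show the curve immediately in both cases.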

Key facts

At a glance

  • Definition: a measure of the strength and direction of a linear relationship
  • Common form: Pearson’s correlation coefficient, r
  • Range: −1 to +1, where 0 means no linear relationship
  • Sign: positive means variables rise together; negative means they move oppositely
  • r²: the proportion of shared (explained) variance, between 0 and 1
  • Key caveat: measures linear association only and does not imply causation

Common misconceptions

What people often get wrong

Often heard: A high correlation coefficient proves that one variable causes the other.

Actually: Correlation does not imply causation. A strong r can reflect a confounding variable or coincidence; establishing cause needs a controlled design, not a correlation alone.

Often heard: A correlation coefficient near zero means the two variables are unrelated.

Actually: It means no linear relationship. A strong non-linear (for example, U-shaped) relationship can still produce an r near zero, which is why a scatterplot should always be inspected.

Often heard: A negative correlation is weaker or worse than a positive one.

Actually: The sign only gives direction. An r of −0.8 indicates a stronger relationship than +0.4; strength is read from the magnitude, regardless of sign.
