Definition · Plain-language
Correlation coefficient
A correlation coefficient measures the strength and direction of the linear relationship between two variables; Pearson’s r ranges from −1 (perfect negative) through 0 (none) to +1 (perfect positive).
The −1 to +1 scale
A correlation coefficient is a single number summarising how two variables move together. Pearson’s r, the most widely used, is bounded between −1 and +1. The sign indicates direction: a positive value means the variables rise together, while a negative value means one falls as the other rises. The magnitude indicates strength: values near ±1 indicate a tight linear relationship in which the points lie close to a straight line, while values near 0 indicate little or no linear relationship. An r of exactly +1 or −1 means the points fall perfectly on a line; an r of 0 means no straight-line trend at all.
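The sign and magnitude described above can be checked with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `pearson_r` and the sample data are illustrative assumptions, and the calculation uses the standard definition of Pearson's r (covariance divided by the product of the two standard deviations).

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length samples (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]   # deviations of x from its mean
    dy = [y - mean_y for y in ys]   # deviations of y from its mean
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up data, chosen so the arithmetic is easy to follow by hand.
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2 * v + 1 for v in x])     # points exactly on a rising line
r_down = pearson_r(x, [10 - 3 * v for v in x])  # points exactly on a falling line
r_noisy = pearson_r(x, [2, 1, 4, 3, 5])         # rising trend with some scatter

print(r_up, r_down, r_noisy)
```

With this data the two exact lines give +1 and −1, and the scattered sample gives 0.8: a positive sign because the variables rise together, and a magnitude below 1 because the points do not all lie on one line.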
Judging strength and direction
Interpreting the magnitude depends on context and discipline, but a common informal guide treats an absolute value of about 0.1 as weak, about 0.3 as moderate and 0.5 or above as strong, with the social sciences often accepting smaller correlations as meaningful than the physical sciences do. The crucial point is that strength and direction are read separately: an r of −0.8 describes a strong relationship that happens to be negative, and it is stronger than an r of +0.4. Because r measures only the linear component, it should always be interpreted alongside a scatterplot, which reveals curvature, clusters or influential points that a single number conceals.
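Reading strength and direction separately can be sketched as a small helper. The function name `describe` is hypothetical, and turning the informal guide into hard cut-offs (including the "negligible" label for magnitudes below 0.1) is an assumption made for illustration; real thresholds depend on the field.

```python
def describe(r):
    """Return (strength, direction) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")

    # Strength comes from the magnitude only, so the sign is discarded here.
    magnitude = abs(r)
    if magnitude >= 0.5:
        strength = "strong"
    elif magnitude >= 0.3:
        strength = "moderate"
    elif magnitude >= 0.1:
        strength = "weak"
    else:
        strength = "negligible"  # assumed label; the informal guide stops at "weak"

    # Direction comes from the sign only.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction

print(describe(-0.8))  # ('strong', 'negative')
print(describe(0.4))   # ('moderate', 'positive')
```

Because the strength label depends only on `abs(r)`, −0.8 is reported as stronger than +0.4, which matches the point made above.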
The coefficient of determination, r²
Squaring the correlation coefficient gives r², the coefficient of determination. Conceptually, r² represents the proportion of the variance in one variable that is shared with, or statistically explained by, the other in a linear relationship. An r of 0.7, for instance, gives r² = 0.49, meaning 49% of the variation is shared. Because squaring removes the sign, r² is always between 0 and 1 and says nothing about direction. It is a useful check on interpretation: a respectable-looking r of 0.5 corresponds to only r² = 0.25, so just a quarter of the variation is accounted for.
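The "proportion of variance explained" reading can be verified numerically. The sketch below uses made-up data and illustrative variable names: it fits an ordinary least-squares line, measures how much of the variation in y the line fails to account for, and compares the explained share with r².

```python
import math

# Made-up sample with a rising but imperfect trend.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)   # total variation in y

r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Ordinary least-squares line: y = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variation in y left over after the line has been fitted.
ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
explained_share = 1 - ss_residual / syy

print(r, r_squared, explained_share)
```

For this sample r is 0.8, so r² is 0.64, and the fitted line accounts for the same 64% of the variation in y. The equality holds for a straight-line fit with one predictor, which is the setting the paragraph above describes.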
Linear only, and not causation
Two important limits govern the correlation coefficient. First, Pearson’s r captures only linear relationships. A strong but curved relationship can yield an r near zero, so a low coefficient does not prove the variables are unrelated, only that they are not linearly related. Second, and most fundamentally, correlation does not imply causation. A high correlation may arise because one variable causes the other, because both are driven by a confounding third variable, or simply because of coincidence. Establishing cause requires a controlled study design; a correlation coefficient alone cannot do it, however large it is.
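The first limit is easy to demonstrate. In the sketch below (illustrative names, made-up data), y is determined exactly by x through y = x², yet Pearson's r is zero because the relationship is symmetric and curved rather than linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length samples (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]       # perfect U-shaped dependence on x

r_curved = pearson_r(x, y)    # no linear trend despite the exact relationship
r_half = pearson_r(x[3:], y[3:])  # right half only (x >= 0), where y rises steadily

print(r_curved, r_half)
```

Over the full range the falling left arm and the rising right arm cancel, so r is 0. Restricting the data to x ≥ 0 gives a large positive r, which is one reason a scatterplot is worth inspecting before trusting the single number.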
Key facts
At a glance
- Definition: a measure of the strength and direction of a linear relationship
- Common form: Pearson’s correlation coefficient, r
- Range: −1 to +1, where 0 means no linear relationship
- Sign: positive means variables rise together; negative means they move oppositely
- r squared: the proportion of shared (explained) variance, between 0 and 1
- Key caveat: measures linear association only and does not imply causation
Common misconceptions
What people often get wrong
Often heard: A high correlation coefficient proves that one variable causes the other.
Actually: Correlation does not imply causation. A strong r can reflect a confounding variable or coincidence; establishing cause needs a controlled design, not a correlation alone.
Often heard: A correlation coefficient near zero means the two variables are unrelated.
Actually: It means no linear relationship. A strong non-linear (for example, U-shaped) relationship can still produce an r near zero, which is why a scatterplot should always be inspected.
Often heard: A negative correlation is weaker or worse than a positive one.
Actually: The sign only gives direction. An r of −0.8 indicates a stronger relationship than +0.4; strength is read from the magnitude, regardless of sign.







