Definition · Plain-language
Correlation
Correlation measures the strength and direction of the linear association between two variables, summarised by a correlation coefficient that ranges from −1 to +1.
Direction: positive, negative or zero
Correlation captures how two variables move together. In a positive correlation they move in the same direction: as one increases, so does the other, as with height and weight. In a negative (inverse) correlation they move in opposite directions: as one increases, the other decreases, as with price and quantity demanded. A correlation near zero means there is no consistent linear relationship, so a straight-line rule based on one variable tells you little about the other. The sign of the correlation coefficient encodes this direction directly.
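As a minimal sketch of how direction shows up in numbers, the snippet below computes the sample covariance, whose sign always matches the sign of Pearson's r. The helper name `covariance` and all data values are invented for illustration; they are not measurements from any real study.

```python
from statistics import mean


def covariance(xs, ys):
    """Sample covariance: summed products of deviations from the means, over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Invented illustrative data (not real measurements).
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 88]      # rises with height
prices = [1, 2, 3, 4, 5]
quantities = [50, 42, 33, 21, 10]      # falls as price rises

# A positive value means the pair tends to move together,
# a negative value means they tend to move in opposite directions.
print("height vs weight:", covariance(heights_cm, weights_kg))
print("price vs quantity:", covariance(prices, quantities))
```

Covariance shows direction only. Its size depends on the units of measurement, which is why strength is read from a standardised coefficient instead.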
Strength and the coefficient
The strength of a correlation is summarised by a coefficient, most commonly Pearson’s r, which ranges from −1 to +1. Values near +1 or −1 indicate a strong linear relationship, where points cluster tightly around a straight line; values near zero indicate a weak one. The sign shows direction and the magnitude shows strength, so r = −0.9 is a strong negative relationship and r = 0.2 a weak positive one. Pearson’s r measures only linear association — two variables can be strongly related in a curved pattern yet have an r near zero.
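The following sketch computes Pearson's r from its usual definition: the sum of products of deviations, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the toy data are assumptions made for illustration. The data are chosen so that one series lies exactly on a falling straight line and the other on a symmetric curve.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


xs = [-3, -2, -1, 0, 1, 2, 3]
falling_line = [-2 * x + 1 for x in xs]   # exact straight line, negative slope
parabola = [x * x for x in xs]            # exact curve, symmetric about x = 0

r_line = pearson_r(xs, falling_line)      # perfect negative linear relationship
r_curve = pearson_r(xs, parabola)         # strong dependence, but not linear

print(f"falling line: r = {r_line:.3f}")
print(f"parabola:     r = {r_curve:.3f}")
```

In the parabola case y is completely determined by x, yet r comes out at zero because the positive and negative deviations cancel. This is the curved-pattern caveat in concrete form.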
Correlation is not causation
A correlation between two variables does not establish that one causes the other. The relationship may run the other way, both may be driven by a third confounding variable, or the association may be coincidental. Ice-cream sales correlate with drowning deaths, but neither causes the other — hot weather drives both. Establishing causation requires more than correlation: a controlled experiment, or careful design that rules out alternative explanations. Treating correlation as proof of cause is one of the most common errors in interpreting data.
Key facts
At a glance
- Definition: strength and direction of a linear association between two variables
- Coefficient: Pearson’s r, ranging from −1 to +1
- Positive: variables rise and fall together (r > 0)
- Negative: one rises as the other falls (r < 0)
- Zero: no consistent linear relationship (r ≈ 0)
- Key caveat: correlation does not prove causation
Common misconceptions
What people often get wrong
Often heard: If two variables are correlated, one must cause the other.
Actually: No. Correlation does not imply causation. The link may be reversed, driven by a third confounding variable, or coincidental. Establishing cause requires a controlled experiment or a design that rules out alternatives.
Often heard: A correlation coefficient near zero means the variables are unrelated.
Actually: Not necessarily. Pearson’s r measures only linear association, so two variables related in a strong curved pattern can still have an r near zero. A scatter plot is the quickest way to spot such non-linear relationships.
Often heard: A negative correlation means there is no relationship or a weak one.
Actually: No. A negative correlation is a real relationship in which one variable rises as the other falls. Its strength depends on the magnitude of r, not its sign — r = −0.9 is a strong relationship.