Definition · Plain-language

Correlation

Correlation measures the strength and direction of the linear association between two variables, summarised by a correlation coefficient that ranges from −1 to +1.

CASRAI research-methods explainer — Correlation

Direction: positive, negative or zero

Correlation captures how two variables move together. In a positive correlation, they move in the same direction — as one increases, so does the other, such as height and weight. In a negative (inverse) correlation, they move in opposite directions — as one increases, the other decreases, such as price and quantity demanded. A correlation near zero means there is no consistent linear relationship: knowing one variable tells you little about the other. The sign of the correlation coefficient encodes this direction directly.
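As a minimal sketch of how the sign encodes direction, the Python snippet below computes the most common coefficient, Pearson's r, by hand for three small data sets. The numbers are made up for illustration, and the helper names `pearson_r` and `direction`, along with the 0.1 cutoff for "near zero", are assumptions of this sketch rather than part of any standard.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations.
    sq_x = sum((x - mean_x) ** 2 for x in xs)
    sq_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(sq_x * sq_y)

def direction(r, tol=0.1):
    """Label the direction of r; tol is an arbitrary illustrative cutoff."""
    if r > tol:
        return "positive"
    if r < -tol:
        return "negative"
    return "near zero"

# Made-up illustrative data, not real measurements.
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 77, 88]      # rises with height
price = [1, 2, 3, 4, 5]
quantity = [95, 80, 62, 45, 30]       # falls as price rises
x = [1, 2, 3, 4, 5]
y_unrelated = [2, 1, 3, 1, 2]         # no consistent linear trend

for label, a, b in [
    ("height vs weight", height_cm, weight_kg),
    ("price vs quantity", price, quantity),
    ("x vs y_unrelated", x, y_unrelated),
]:
    r = pearson_r(a, b)
    print(f"{label}: r = {r:+.2f} ({direction(r)})")
```

Only the sign of r is used to pick the label; how close the value sits to +1 or −1 is a separate question of strength.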

Strength and the coefficient

The strength of a correlation is summarised by a coefficient, most commonly Pearson’s r, which ranges from −1 to +1. Values near +1 or −1 indicate a strong linear relationship, where points cluster tightly around a straight line; values near zero indicate a weak one. The sign shows direction and the magnitude shows strength, so r = −0.9 is a strong negative relationship and r = 0.2 a weak positive one. Pearson’s r measures only linear association — two variables can be strongly related in a curved pattern yet have an r near zero.
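The linear-only limitation can be illustrated with a short Python sketch. It compares a perfectly straight-line relationship with a perfectly curved one over the same x values; the data and the helper name `pearson_r` are illustrative assumptions, not taken from any particular study or library.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_x = sum((x - mean_x) ** 2 for x in xs)
    sq_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(sq_x * sq_y)

xs = [-3, -2, -1, 0, 1, 2, 3]

# y is an exact straight-line function of x.
line = [2 * x + 1 for x in xs]

# y is an exact function of x too, but the pattern is a symmetric curve.
curve = [x ** 2 for x in xs]

r_line = pearson_r(xs, line)
r_curve = pearson_r(xs, curve)

print(f"straight line: r = {r_line:.2f}")
print(f"curve (x squared): r = {r_curve:.2f}")
```

In both cases y is completely determined by x, yet the curve yields an r of zero because the rising and falling halves cancel out. This is why a scatter plot is worth checking before reading a low r as "no relationship".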

Correlation is not causation

A correlation between two variables does not establish that one causes the other. The relationship may run the other way, both may be driven by a third confounding variable, or the association may be coincidental. Ice-cream sales correlate with drowning deaths, but neither causes the other — hot weather drives both. Establishing causation requires more than correlation: a controlled experiment, or careful design that rules out alternative explanations. Treating correlation as proof of cause is one of the most common errors in interpreting data.
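The confounding scenario can be sketched as a small simulation in Python. Everything here is an assumption made for illustration: the data are randomly generated, the slopes and noise levels are arbitrary, and the helper names `pearson_r` and `residuals` are hypothetical. Temperature is built to drive both outcomes, and neither outcome influences the other.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_x = sum((x - mean_x) ** 2 for x in xs)
    sq_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(sq_x * sq_y)

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated confounder: daily temperature in degrees Celsius.
temperature = [rng.uniform(10, 35) for _ in range(500)]

# Both outcomes depend on temperature plus independent noise.
# Neither outcome appears in the other's formula.
ice_cream_sales = [20 * t + rng.gauss(0, 60) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1.5) for t in temperature]

# Raw correlation between the two outcomes.
raw_r = pearson_r(ice_cream_sales, drownings)

# Correlation after removing the part of each outcome explained by temperature
# (a partial correlation computed from regression residuals).
partial_r = pearson_r(residuals(ice_cream_sales, temperature),
                      residuals(drownings, temperature))

print(f"raw r = {raw_r:.2f}")
print(f"r after adjusting for temperature = {partial_r:.2f}")
```

In this simulation the raw correlation is substantial, while the adjusted one falls close to zero. The adjustment works here only because the confounder is known and measured; with real data an unmeasured confounder cannot be removed this way, which is why a controlled experiment or careful design is still required.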

Key facts

At a glance

Definition: strength and direction of a linear association between two variables
Coefficient: Pearson’s r, ranging from −1 to +1
Positive: variables rise and fall together (r > 0)
Negative: one rises as the other falls (r < 0)
Zero: no consistent linear relationship (r ≈ 0)
Key caveat: correlation does not prove causation

Common misconceptions

What people often get wrong

Often heard: If two variables are correlated, one must cause the other.

Actually: No. Correlation does not imply causation. The link may be reversed, driven by a third confounding variable, or coincidental. Establishing cause requires a controlled experiment or a design that rules out alternatives.

Often heard: A correlation coefficient near zero means the variables are unrelated.

Actually: Not necessarily. A coefficient near zero rules out only a linear relationship. Pearson’s r measures linear association, so two variables related in a strong curved pattern can still have an r near zero. A scatter plot is needed to see non-linear relationships.

Often heard: A negative correlation means there is no relationship or a weak one.

Actually: No. A negative correlation is a real relationship in which one variable rises as the other falls. Its strength depends on the magnitude of r, not its sign — r = −0.9 is a strong relationship.
