Definition · Plain-language
Data minimisation
Data minimisation is the principle of collecting and retaining only the personal data that is adequate, relevant and limited to what is necessary for a specified purpose.
Adequate, relevant and limited
Under GDPR Article 5(1)(c), personal data must be adequate, relevant and limited to what is necessary in relation to the purposes for which it is processed. Each word does work: adequate means enough to fulfil the stated purpose; relevant means rationally linked to that purpose; and limited means no more than necessary. In practice, this means deciding in advance what data is genuinely needed and resisting the temptation to collect extra information simply because it might one day be useful. Collecting data just in case runs directly against the principle.
Why it reduces risk
Data minimisation lowers privacy risk in a straightforward way: data that is never collected cannot be breached, misused or re-identified. By holding less personal data, and holding it for no longer than needed, an organisation shrinks the harm a security incident could cause and simplifies its obligations. The principle works hand in hand with storage limitation, which requires data not to be kept longer than necessary, and with purpose limitation, which ties data to the reason it was gathered. Together these principles discipline how much personal data a project accumulates.
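As a rough illustration of holding data for no longer than needed, the sketch below splits records into those still within a retention period and those due for deletion or anonymisation. The 365-day period, the `collected_at` field and the `split_by_retention` helper are all assumptions made for this example; an actual retention period has to be derived from the purpose the data was collected for.

```python
from datetime import datetime, timedelta, timezone
from typing import Any

# Hypothetical retention period; the real value depends on the stated purpose.
RETENTION_PERIOD = timedelta(days=365)


def split_by_retention(
    records: list[dict[str, Any]],
    now: datetime,
    retention: timedelta = RETENTION_PERIOD,
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split records into (still needed, due for deletion or anonymisation)."""
    keep, expired = [], []
    for record in records:
        age = now - record["collected_at"]
        (expired if age > retention else keep).append(record)
    return keep, expired


# Illustrative data only: one record older than the period, one within it.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = [
    {"id": 1, "collected_at": datetime(2023, 6, 1, tzinfo=timezone.utc)},
    {"id": 2, "collected_at": datetime(2024, 9, 1, tzinfo=timezone.utc)},
]
keep, expired = split_by_retention(records, now)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check repeatable when a retention schedule is reviewed or audited.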
Minimisation in research practice
In research, data minimisation shapes study design: collecting only the variables needed to answer the research question, using identifiers only where essential, and considering whether de-identified or anonymised data would suffice. It complements privacy by design, since minimising data is one of the clearest ways to build protection in from the outset. Applied well, minimisation also supports FAIR and open-data goals, because datasets carrying only necessary, well-justified fields are easier to share responsibly and less likely to expose participants.
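One way to picture collecting only the variables needed is an allow-list applied at the point of intake. The following is a minimal sketch, not a prescribed method: the field names, the `REQUIRED_FIELDS` set and the `minimise` helper are hypothetical and would be replaced by whatever the research question actually requires.

```python
from typing import Any

# Hypothetical allow-list: the only variables the research question needs.
REQUIRED_FIELDS = {"age_band", "region", "response_score"}


def minimise(
    record: dict[str, Any], required: set[str] = REQUIRED_FIELDS
) -> dict[str, Any]:
    """Return a copy of the record that keeps only allow-listed fields."""
    return {key: value for key, value in record.items() if key in required}


# Illustrative submission containing identifiers the study does not need.
raw = {
    "name": "A. Participant",
    "email": "participant@example.org",
    "age_band": "30-39",
    "region": "North",
    "response_score": 4,
}
minimised = minimise(raw)
```

An allow-list is used here instead of a block-list so that any new, unreviewed field is dropped by default. Removing direct identifiers in this way does not by itself make a dataset anonymous, since the remaining fields may still allow re-identification in combination.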
Key facts
At a glance
- Definition: collecting only data adequate, relevant and limited to what is necessary
- Source: GDPR Article 5(1)(c)
- Three tests: adequate, relevant, and limited to the purpose
- Effect: data not collected cannot be breached or misused
- Related principles: purpose limitation and storage limitation
- Practice: avoid collecting personal data just in case
Common misconceptions
What people often get wrong
Often heard: Data minimisation means collecting as little data as physically possible.
Actually: Minimisation means collecting data that is adequate, relevant and limited to what is necessary for the purpose, not the smallest amount conceivable. The data must still be sufficient to achieve the stated aim; the test is necessity, not bare minimum.
Often heard: It is fine to collect extra personal data in case it proves useful later.
Actually: Collecting data just in case conflicts with the minimisation principle. Personal data should be limited to what is necessary for the current, specified purpose, not gathered speculatively for unspecified future use.
Often heard: Data minimisation only concerns how much data you collect at the start.
Actually: Minimisation also covers retention: personal data should be kept only for as long as the purpose requires, not indefinitely. It works with storage limitation, so organisations should delete or anonymise data once it is no longer necessary.