v2026.1714 entries · CC-BY 4.0
CASRAI

Definition · Plain-language

Data minimisation

Data minimisation is the principle of collecting and retaining only the personal data that is adequate, relevant and limited to what is necessary for a specified purpose.

CASRAI research-methods explainer — Data minimisation

Adequate, relevant and limited

Under GDPR Article 5(1)(c), personal data must be adequate, relevant and limited to what is necessary in relation to the purposes for which it is processed. Each word does work: adequate means enough to fulfil the stated purpose; relevant means having a rational link to that purpose; and limited means no more than necessary. In practice this means deciding in advance what data is genuinely needed and resisting the temptation to collect extra information simply because it might one day be useful. Collecting data "just in case" runs directly against the principle.

Why it reduces risk

Data minimisation lowers privacy risk in a straightforward way: data that is never collected cannot be breached, misused or re-identified. By holding less personal data, and holding it for no longer than needed, an organisation shrinks the harm a security incident could cause and simplifies its obligations. The principle works hand in hand with storage limitation, which requires data not to be kept longer than necessary, and with purpose limitation, which ties data to the reason it was gathered. Together these principles discipline how much personal data a project accumulates.

Minimisation in research practice

In research, data minimisation shapes study design: collecting only the variables needed to answer the research question, using identifiers only where essential, and considering whether de-identified or anonymised data would suffice. It complements privacy by design, since minimising data is one of the clearest ways to build protection in from the outset. Applied well, minimisation also supports FAIR and open-data goals, because datasets carrying only necessary, well-justified fields are easier to share responsibly and less likely to expose participants.
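One way to make this concrete at the point of data capture is an allow-list: every field the study keeps is named in advance with the reason it is needed, and everything else is dropped before storage. The sketch below is illustrative only. The field names, the `NEEDED_FIELDS` table and the `minimise` and `pseudonymise` helpers are assumptions made for the example, not part of any CASRAI or GDPR specification. Note also that a keyed hash is pseudonymisation, not anonymisation, so the output would still count as personal data.

```python
import hashlib
import hmac

# Hypothetical allow-list: each retained field is paired with the reason it is needed.
NEEDED_FIELDS = {
    "age_band": "primary exposure variable",
    "region": "stratification",
    "outcome_score": "primary outcome",
}


def pseudonymise(participant_id: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256).hexdigest()


def minimise(record: dict, secret_key: bytes) -> dict:
    """Keep only allow-listed fields plus a pseudonymous linkage key."""
    kept = {name: record[name] for name in NEEDED_FIELDS if name in record}
    kept["pid"] = pseudonymise(record["participant_id"], secret_key)
    return kept


# Made-up example record; name and email are not needed for the analysis.
raw = {
    "participant_id": "P-0042",
    "full_name": "Example Person",
    "email": "person@example.org",
    "age_band": "30-39",
    "region": "North",
    "outcome_score": 17,
}

minimal = minimise(raw, secret_key=b"demo-key-not-for-real-use")
print(sorted(minimal))  # ['age_band', 'outcome_score', 'pid', 'region']
```

Keeping the justification next to each field name means the allow-list doubles as a record of why each variable was collected, which helps when a dataset is later reviewed for sharing.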

Key facts

At a glance

  • Definition: collecting only data adequate, relevant and limited to what is necessary
  • Source: GDPR Article 5(1)(c)
  • Three tests: adequate, relevant, and limited to the purpose
  • Effect: data not collected cannot be breached or misused
  • Related principles: purpose limitation and storage limitation
  • Practice: avoid collecting personal data just in case

Common misconceptions

What people often get wrong

Often heard: Data minimisation means collecting as little data as physically possible.

Actually: Minimisation means collecting data that is adequate, relevant and limited to what is necessary for the purpose — not the absolute least conceivable. Data must still be sufficient to achieve the stated aim; the test is necessity, not bare minimum.

Often heard: It is fine to collect extra personal data in case it proves useful later.

Actually: Collecting data just in case conflicts with the minimisation principle. Personal data should be limited to what is necessary for the current, specified purpose, not gathered speculatively for unspecified future use.

Often heard: Data minimisation only concerns how much data you collect at the start.

Actually: Minimisation also covers retention: data should be limited over time, not kept indefinitely. It works with storage limitation, so organisations should delete or anonymise data once it is no longer necessary.
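The retention side can be sketched as a periodic check that drops records once they pass a retention period. This is a minimal illustration under stated assumptions: the 365-day `RETENTION` value, the `collected_at` field and the `apply_retention` helper are hypothetical, and a real retention period would come from the project's own documented purpose and legal basis.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention period; the right value depends on the stated purpose.
RETENTION = timedelta(days=365)


def is_expired(collected_at: datetime, now: datetime, retention: timedelta = RETENTION) -> bool:
    """Return True when a record has been held longer than the retention period."""
    return now - collected_at > retention


def apply_retention(records: list, now: datetime) -> tuple:
    """Return the records still within retention and the number removed."""
    kept = [r for r in records if not is_expired(r["collected_at"], now)]
    return kept, len(records) - len(kept)


# Made-up records and a fixed "now" so the example is reproducible.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = [
    {"pid": "a1", "collected_at": datetime(2023, 6, 1, tzinfo=timezone.utc)},
    {"pid": "b2", "collected_at": datetime(2024, 9, 1, tzinfo=timezone.utc)},
]

kept, removed = apply_retention(records, now)
print([r["pid"] for r in kept], removed)  # ['b2'] 1
```

Where deletion is not appropriate, the same check could instead route expired records to an anonymisation step.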
