v2026.1714 entries · CC-BY 4.0
CASRAI

Explainer · Plain-language

What is open data?

Open data is data that anyone can freely access, use, modify, and share for any purpose, subject at most to requirements that preserve provenance and openness. In research, open data means making the data underlying findings available so that results can be verified, reused, and built upon.

CASRAI plain-language explainers — clear answers to recurring research-administration questions

What makes data "open"

Open data, in the sense set out by the Open Definition, is data that anyone is free to access, use, modify, and share for any purpose, subject at most to measures that preserve provenance and openness — such as attribution or share-alike. In research this means depositing the data behind a study in a place where others can find and reuse it, usually with a clear open licence so the terms of reuse are unambiguous.

Open vs FAIR — not the same thing

Open and FAIR are related but distinct. FAIR (Findable, Accessible, Interoperable, Reusable) concerns how well data are described and made machine-actionable; the "A" is about retrieval through a standardised access protocol, not about being free of restrictions. Data can therefore be FAIR while access is governed, for example sensitive data behind a controlled-access process. Openness is a separate dimension: public access under a permissive reuse licence. Data that are both open and FAIR are the easiest to reuse, but neither property implies the other.
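The distinction can be sketched in a few lines of Python. This is an illustration only: the record fields, the two helper functions, and the short list of licences are assumptions made for the example, and the FAIR check is a deliberately coarse proxy that ignores interoperability and most FAIR sub-principles.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a small illustrative set of licences treated as "open" here.
OPEN_LICENCES = {"CC0-1.0", "CC-BY-4.0"}


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: Optional[str]        # persistent identifier, e.g. a DOI
    has_rich_metadata: bool          # described well enough to be found and understood
    access_protocol: Optional[str]   # standardised retrieval protocol, e.g. "https"
    access_level: str                # "public" or "controlled"
    licence: Optional[str]           # SPDX-style licence identifier, if any


def looks_fair(record: DatasetRecord) -> bool:
    """Coarse proxy for FAIR: identified, described, retrievable via a protocol."""
    return (
        record.identifier is not None
        and record.has_rich_metadata
        and record.access_protocol is not None
    )


def is_open(record: DatasetRecord) -> bool:
    """Open: publicly accessible and released under an explicit open licence."""
    return record.access_level == "public" and record.licence in OPEN_LICENCES


# "10.1234/example" is a made-up placeholder DOI, not a real dataset.
controlled = DatasetRecord("10.1234/example", True, "https", "controlled", None)
bare_file = DatasetRecord(None, False, "https", "public", "CC0-1.0")

print(looks_fair(controlled), is_open(controlled))  # True False
print(looks_fair(bare_file), is_open(bare_file))    # False True
```

The two checks never consult each other, which is the point: a controlled-access dataset can pass the FAIR-style check without being open, and an openly licensed file with no identifier or description can be open without being FAIR.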

Licences and repositories

To be reusable without ambiguity, open data needs an explicit licence. Creative Commons CC BY (attribution) and the CC0 public-domain dedication are the most common choices for research data; CC0 is often recommended for data because it avoids attribution-stacking (the burden of crediting every source when many datasets are combined) and reduces legal uncertainty. Data are typically deposited in a repository, either a general-purpose one such as Zenodo, Dryad, or Figshare, or a discipline-specific one. The repository mints a persistent identifier (typically a DataCite DOI) so the dataset is citable.
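As a rough sketch of how a licence and a DOI make a dataset citable, the Python below maps the two licences to their SPDX identifiers and canonical Creative Commons URLs, normalises a DOI to its resolver URL, and assembles a citation in the common "Creators (Year). Title. Publisher. Identifier" pattern. The function names, the sample metadata, and the DOI `10.1234/example` are placeholders invented for illustration, not a real record or a repository API.

```python
# SPDX identifiers and canonical URLs for the two licences discussed above.
LICENCE_URLS = {
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
}


def doi_to_url(doi: str) -> str:
    """Normalise a bare DOI, a 'doi:' form, or a doi.org URL to an https resolver URL."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + doi


def cite_dataset(creators, year, title, publisher, doi) -> str:
    """Build a citation of the form: Creators (Year). Title. Publisher. DOI URL"""
    names = "; ".join(creators)
    return f"{names} ({year}). {title}. {publisher}. {doi_to_url(doi)}"


# Made-up example metadata (hypothetical author, title, and DOI).
print(cite_dataset(["Doe, J."], 2024, "Example survey data", "Zenodo", "doi:10.1234/example"))
# Doe, J. (2024). Example survey data. Zenodo. https://doi.org/10.1234/example
print(LICENCE_URLS["CC0-1.0"])
# https://creativecommons.org/publicdomain/zero/1.0/
```

Recording the licence as a standard identifier and the DOI as a resolver URL keeps both machine-readable, so the reuse terms and the citation travel with the dataset.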

Funder mandates and the guiding principle

Many funders and publishers now expect research data to be shared openly where possible, supported by a data management plan and a data availability statement in the resulting paper. The widely used phrase "as open as possible, as closed as necessary" — prominent in European research-policy guidance — captures the balance: maximise openness for verification and reuse, while respecting legitimate constraints such as privacy, consent, commercial sensitivity, or Indigenous data rights.

Key facts

At a glance

  • Definition: Data anyone can freely access, use, modify and share
  • Licences: Commonly CC BY (attribution) or CC0 (public domain)
  • Open ≠ FAIR: FAIR is about description/access protocol, not permissions
  • Where: Repositories (Zenodo, Dryad, Figshare, domain repos) with a DOI
  • Principle: "As open as possible, as closed as necessary"
  • Drivers: Funder open-data mandates; publisher data policies

Common misconceptions

What people often get wrong

Often heard: Open data and FAIR data are the same thing.

Actually: No — FAIR concerns how findable, accessible (via a protocol), interoperable, and reusable data are. Data can be FAIR while access is restricted; open adds permissive reuse licensing.

Often heard: Open data means all research data must be public.

Actually: No — sensitive, personal, or legally constrained data should not be fully open. The principle is "as open as possible, as closed as necessary".

Often heard: Posting a file online makes it open data.

Actually: No — open data needs an explicit open licence and ideally a persistent identifier in a repository, so others know the reuse terms and can cite it reliably.
