FAIR Principles in Data Management: FAIR Does Not Mean Open

The FAIR principles in data management — Findable, Accessible, Interoperable, Reusable — do not require research data to be open access. FAIR governs how well data can be located, understood, and reused by humans and machines; openness governs who is permitted to view it. A dataset can be fully FAIR while remaining restricted, embargoed, or access-controlled, provided its existence, metadata, and reuse conditions are transparently discoverable.

FAIR data is data whose associated metadata and access conditions satisfy the Findable, Accessible, Interoperable, and Reusable criteria first published by Wilkinson et al. in the journal Scientific Data (2016), independent of whether the underlying data file itself is open, embargoed, or permission-gated.

This distinction matters for institutional research offices drafting data management plans (DMPs), because conflating FAIR with openness leads to one of two costly errors: unnecessarily forcing sensitive or proprietary data into open repositories, or wrongly assuming that keeping data closed exempts a project from FAIR compliance obligations.

What Do the FAIR Principles in Data Management Actually Require?

The FAIR guiding principles set out 15 sub-principles across four categories, published by Wilkinson et al. in Scientific Data in 2016 and documented in detail by the GO FAIR initiative. None of the 15 sub-principles requires the data to be “open”; the closest is A1.1, which requires the access protocol, not the data itself, to be open, free, and universally implementable.

| Principle | Sub-principles | Core requirement |
|---|---|---|
| Findable | F1–F4 | Globally unique persistent identifier (e.g. a DataCite DOI), rich metadata, and indexing in a searchable registry |
| Accessible | A1, A1.1, A1.2, A2 | Retrievable via a standardised, open protocol, with authentication/authorisation permitted where necessary; metadata remain accessible even if data are removed |
| Interoperable | I1–I3 | Machine-actionable, standardised vocabularies and qualified references between records |
| Reusable | R1, R1.1, R1.2, R1.3 | Rich attributes, a clear usage licence, documented provenance, and adherence to community standards |
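
The count of 15 follows directly from the identifiers listed in the table. A minimal sketch in Python that tallies them; the name `FAIR_SUBPRINCIPLES` is illustrative and not part of any official FAIR tooling.

```python
# Sub-principle identifiers as listed in the table above.
# The variable name is illustrative, not an official schema.
FAIR_SUBPRINCIPLES = {
    "Findable": ["F1", "F2", "F3", "F4"],
    "Accessible": ["A1", "A1.1", "A1.2", "A2"],
    "Interoperable": ["I1", "I2", "I3"],
    "Reusable": ["R1", "R1.1", "R1.2", "R1.3"],
}

# Sum the identifiers across the four categories.
total = sum(len(ids) for ids in FAIR_SUBPRINCIPLES.values())
print(total)  # 15
```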

Note the explicit wording of A1.2: the protocol “allows for an authentication and authorisation procedure, where necessary.” This is a built-in accommodation for controlled-access data. It is not an exception to FAIR, but part of its original design.
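
One way to picture how A1.2 and A2 fit together is a repository that always serves metadata but gates the data payload. The following is a simplified sketch under stated assumptions, not how any particular repository works: the `DatasetRecord` class, its fields, the two functions, and the placeholder DOI are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical repository record; field names are illustrative."""
    identifier: str            # persistent identifier, e.g. a DOI
    metadata: dict             # descriptive metadata, always public
    payload: Optional[bytes]   # None once the data have been removed
    authorised_users: frozenset = frozenset()  # empty means public download


def get_metadata(record: DatasetRecord) -> dict:
    # A2: metadata stay retrievable by anyone, even if the data are gone.
    return record.metadata


def get_data(record: DatasetRecord, user: Optional[str] = None) -> bytes:
    if record.payload is None:
        raise LookupError("data no longer available; metadata remain")
    # A1.2: authentication/authorisation is applied only where necessary.
    if record.authorised_users and user not in record.authorised_users:
        raise PermissionError("access requires approval; see metadata")
    return record.payload


# Example: a controlled-access record (placeholder DOI, not a real one).
trial = DatasetRecord(
    identifier="doi:10.1234/example",
    metadata={"title": "Example trial", "access": "data access committee"},
    payload=b"rows...",
    authorised_users=frozenset({"approved-researcher"}),
)
print(get_metadata(trial)["title"])   # readable by anyone
get_data(trial, "approved-researcher")  # succeeds for an approved user
```

An unapproved caller would get a `PermissionError` from `get_data`, yet could still read the metadata, which is the separation the Accessible principle describes.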

Why Does “Findable” Not Mean “Open”?

A dataset is findable when its persistent identifier and descriptive metadata are indexed in a searchable resource — a data repository, catalogue, or registry — regardless of who may subsequently access the underlying file. Findability is a property of the metadata record, not of the data payload.

This is why a controlled-access dataset — for example, a clinical trial dataset held under a data access committee — can score highly on FAIR maturity assessments while remaining entirely closed to the public. The metadata (title, authors, methodology, licence terms, and the process for requesting access) are openly indexed and machine-readable; only the raw data behind that record is gated.

  • FAIR-compliant but closed: a genomics dataset with a DOI, full metadata, and a documented data access committee procedure, but no public download.
  • Open but not FAIR: a spreadsheet uploaded to a general file-sharing link with no persistent identifier, no licence, and no standardised metadata — publicly downloadable, yet effectively unfindable and unreusable at scale.
  • FAIR and open: a dataset in a certified repository (e.g. one assigning DataCite DOIs) with a Creative Commons licence and full public download access.
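
The three cases above can be expressed as two independent checks, one for FAIR elements and one for openness. This is a rough sketch only: the dictionary keys (`persistent_id`, `standard_metadata`, `reuse_terms`, `public_download`) and the placeholder DOIs are assumptions made for illustration, not an official FAIR maturity metric.

```python
from typing import List


def fair_gaps(record: dict) -> List[str]:
    """Return the FAIR elements missing from a metadata record.

    The keys are illustrative stand-ins for the properties named in the
    three examples; a real assessment covers far more than this.
    """
    required = ("persistent_id", "standard_metadata", "reuse_terms")
    return [key for key in required if not record.get(key)]


def describe(record: dict) -> str:
    # FAIRness and openness are assessed independently of each other.
    fair = "FAIR" if not fair_gaps(record) else "not FAIR"
    access = "open" if record.get("public_download") else "closed"
    return f"{fair}, {access}"


genomics = {
    "persistent_id": "doi:10.1234/genomics-example",  # placeholder DOI
    "standard_metadata": True,
    "reuse_terms": "data access committee agreement",
    "public_download": False,
}
spreadsheet = {
    "persistent_id": None,
    "standard_metadata": False,
    "reuse_terms": None,
    "public_download": True,
}
open_dataset = {
    "persistent_id": "doi:10.1234/open-example",  # placeholder DOI
    "standard_metadata": True,
    "reuse_terms": "CC BY 4.0",
    "public_download": True,
}

print(describe(genomics))      # FAIR, closed
print(describe(spreadsheet))   # not FAIR, open
print(describe(open_dataset))  # FAIR, open
```

Because `public_download` never appears in `fair_gaps`, changing the access decision cannot change the FAIR result in this sketch.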

The Global Indigenous Data Alliance’s CARE Principles for Indigenous Data Governance (2019) — Collective Benefit, Authority to Control, Responsibility, Ethics — were published explicitly to complement FAIR, on the grounds that FAIR’s technical focus on findability and reuse does not, by itself, address who should decide whether data is shared at all. That governance question sits outside FAIR’s scope entirely, whether or not the data ends up open.

How Do NIH, UKRI, and Horizon Europe Apply FAIR Without Mandating Open Access?

Funder data policies consistently separate the FAIR requirement from any open-access requirement, even where both appear in the same data management plan template.

| Funder / policy | FAIR requirement | Open-access requirement |
|---|---|---|
| NIH: 2023 Data Management and Sharing (DMS) Policy (effective 25 January 2023) | DMPs must describe how data will be made findable and reusable via an established repository | Explicitly permits controlled or restricted access where privacy, legal, or ethical constraints apply |
| Horizon Europe: Grant Agreement DMP obligation (Article 17) | Data management plans must apply FAIR data principles to all research outputs | Follows the “as open as possible, as closed as necessary” standard; openness is a default preference, not an absolute rule |
| UKRI: Common Principles on Data Policy | Data outputs must be discoverable and citable with appropriate metadata | Allows embargo periods and access restrictions for sensitive, commercial, or third-party data |

The recurring pattern across all three frameworks: FAIR compliance is treated as a baseline stewardship obligation, while the openness decision is a separate, case-by-case judgement based on privacy, safety, intellectual property, or commercial sensitivity.

Common Questions on FAIR and Openness

What are FAIR data principles?

The FAIR data principles are a set of guidelines — Findable, Accessible, Interoperable, and Reusable — published by Wilkinson et al. in 2016 to improve how digital research data is described, indexed, and reused by both humans and machines, independent of any decision about public accessibility.

What are the four pillars of the FAIR data principles?

The four pillars are Findable (persistent identifiers and rich metadata), Accessible (retrievable via a standard protocol, with authentication permitted), Interoperable (machine-actionable, standardised vocabularies), and Reusable (clear licensing and documented provenance).

What are FAIR principles for data stewardship?

For data stewardship, FAIR principles function as an audit checklist covering identifier assignment, metadata richness, licensing clarity, and provenance documentation — the operational tasks a data steward performs regardless of whether the resulting dataset is ultimately published openly or held under access controls.

Does the NIH Data Management and Sharing Policy require FAIR data to be open?

No. The NIH’s 2023 DMS Policy requires plans to maximise appropriate sharing but explicitly accommodates restricted or controlled-access sharing for datasets involving privacy, legal, or proprietary constraints, while still expecting FAIR-aligned metadata and repository deposit.

What This Means for Research Administrators

Research offices reviewing data management plans should treat FAIR compliance and openness decisions as two separate checklist items, not one. A DMP that proposes controlled access is not automatically non-compliant with funder FAIR expectations, provided the metadata record, persistent identifier, licence terms, and access procedure are all documented.

  • Verify the plan assigns a persistent identifier (DOI or equivalent) regardless of the access tier chosen.
  • Confirm metadata will remain publicly indexed even if the dataset itself is embargoed or gated.
  • Check that licensing and provenance are documented for reuse — this is a Reusable-principle requirement, not an openness requirement.
  • Do not require a researcher to open sensitive data as a condition of “meeting FAIR” — the correct fix is usually better metadata, not looser access.
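
For offices that track DMP reviews in a structured form, the checklist above could be encoded roughly as follows. This is a hypothetical sketch: the function name `review_dmp` and every key in the `plan` dictionary are assumptions, and a real DMP template will use its own fields.

```python
from typing import List


def review_dmp(plan: dict) -> List[str]:
    """Return FAIR findings for a DMP, whatever access tier it chooses.

    Keys are hypothetical; adapt them to the DMP template in use.
    """
    findings = []
    if not plan.get("persistent_identifier"):
        findings.append("assign a persistent identifier (DOI or equivalent)")
    if not plan.get("metadata_publicly_indexed"):
        findings.append("keep metadata publicly indexed")
    if not plan.get("licence"):
        findings.append("document licence terms")
    if not plan.get("provenance"):
        findings.append("document provenance")
    # A non-open tier is never a finding by itself; only a missing
    # access procedure is.
    if plan.get("access_tier") != "open" and not plan.get("access_procedure"):
        findings.append("document the access procedure")
    return findings


controlled_plan = {
    "access_tier": "controlled",
    "persistent_identifier": True,
    "metadata_publicly_indexed": True,
    "licence": "data use agreement",
    "provenance": "collection protocol v2",
    "access_procedure": "data access committee request form",
}
print(review_dmp(controlled_plan))  # []
```

A plan with controlled access and complete documentation produces no findings, while an open plan with no licence or identifier still would.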

As funders sharpen their FAIR expectations in DMP review, this distinction will only become more consequential. Institutions that train reviewers to separate the two questions — is it findable and reusable, versus is it open — will avoid both unnecessary access disputes and genuine FAIR non-compliance.
