FAIR Principles for Research Data Explained

FAIR data refers to research data managed according to four guiding principles (Findable, Accessible, Interoperable and Reusable) designed to maximise the value of data for both humans and machines. The principles were set out by Mark Wilkinson and colleagues in a landmark 2016 paper in Scientific Data and have since been adopted widely by funders, publishers and research institutions as a benchmark for good data stewardship. They specify how data should be described, shared and preserved so that they can be discovered and reused long after a project ends.

A common misconception is that FAIR means “open”. It does not. FAIR is about good management and clear conditions of use; data can be FAIR while access remains controlled, which matters for sensitive or personal data.

What each principle means

The four principles work together; their order spells out the acronym rather than prescribing a strict sequence of steps. Each rests heavily on metadata and persistent identifiers.

Principle | Core idea | Key enablers
Findable | Data and metadata are easy to locate by humans and machines | Persistent identifiers (e.g. DOIs), rich metadata, indexing
Accessible | Once found, data can be retrieved by a clear, open protocol | Standard protocols; metadata stays available even if data are restricted
Interoperable | Data can be combined and used with other data and systems | Shared vocabularies, standard formats, controlled terminologies
Reusable | Data are richly described and licensed for reuse | Clear licences, provenance, community standards and metadata

Findable requires that data and metadata carry globally unique, persistent identifiers and are described well enough to be indexed and searched. Accessible means the data can be retrieved using a standardised, open communication protocol, with authentication where needed — and, importantly, that metadata remain accessible even when the underlying data are not. Interoperable calls for data to use shared, standard formats and vocabularies so they can be integrated with other datasets and processed by different systems. Reusable requires rich description, clear provenance and an explicit usage licence so others can confidently build on the data.
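As a rough illustration of how these requirements translate into something checkable, the sketch below flags gaps in a dataset's metadata record. The field names (identifier, access_protocol, data_format and so on), the fair_gaps helper and the example DOI are assumptions made up for this example; FAIR does not prescribe a schema, and a real assessment would be much richer than a presence check.

```python
from typing import Dict, List

# Hypothetical field names chosen for illustration; FAIR itself does not
# prescribe a schema, so map these to whatever metadata standard you use.
REQUIRED_FIELDS = [
    ("identifier", "Findable: no persistent identifier (e.g. a DOI)"),
    ("description", "Findable: no description to index and search"),
    ("access_protocol", "Accessible: no retrieval protocol stated"),
    ("data_format", "Interoperable: no standard format stated"),
    ("provenance", "Reusable: no provenance recorded"),
    ("licence", "Reusable: no usage licence stated"),
]


def fair_gaps(record: Dict[str, str]) -> List[str]:
    """Return one message per missing or empty field (empty list if none)."""
    return [message for name, message in REQUIRED_FIELDS if not record.get(name)]


example_record = {
    "identifier": "10.1234/example-dataset",  # made-up DOI, not a real dataset
    "description": "Hourly air temperature readings, 2020 to 2022",
    "access_protocol": "https",
    "data_format": "text/csv",
    "provenance": "Collected by a hypothetical sensor network",
    "licence": "CC-BY-4.0",
}

# A complete record produces no messages; a sparse one lists what is missing.
print(fair_gaps(example_record))
print(fair_gaps({"identifier": "10.1234/example-dataset"}))
```

A presence check like this only shows that a field has been filled in, not that the identifier resolves or that the licence suits the data, so it is best treated as a first-pass prompt rather than a FAIR assessment.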

The role of persistent identifiers and metadata

Two enablers run through all four principles: persistent identifiers and metadata. A persistent identifier — such as a DOI for a dataset or an ORCID for a researcher — provides a stable, resolvable reference that does not break when URLs change, underpinning findability and provenance. Metadata — structured information describing what the data are, how they were produced, and under what terms they may be used — is what makes data discoverable, interpretable and reusable. Crucially, FAIR treats metadata as valuable in its own right: rich, standardised metadata can remain open and findable even when the dataset itself is access-controlled. This is precisely the kind of standardised description that shared vocabularies, such as the CASRAI dictionary, and broader data infrastructure are built to support.
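To make the idea of a resolvable identifier concrete, here is a minimal sketch that turns a DOI into a link through the doi.org resolver and checks the final character of an ORCID iD, which is a checksum (ISO 7064 MOD 11-2) over the preceding fifteen digits. The function names are invented for this example. The DOI shown is that of the 2016 FAIR paper, and the ORCID iD is the well-known example identifier used in ORCID's own documentation.

```python
def doi_to_url(doi: str) -> str:
    """Build a resolvable link for a DOI using the doi.org resolver."""
    return "https://doi.org/" + doi.strip()


def orcid_check_character(first_15_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for an ORCID iD."""
    total = 0
    for ch in first_15_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the structure and checksum of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "")
    if len(compact) != 16:
        return False
    if not all(ch in "0123456789" for ch in compact[:15]):
        return False
    return orcid_check_character(compact[:15]) == compact[15].upper()


print(doi_to_url("10.1038/sdata.2016.18"))
print(is_valid_orcid("0000-0002-1825-0097"))
```

A passing checksum only shows that the identifier is well formed. Whether it resolves to the right person or dataset still has to be confirmed with the resolver or registry.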

FAIR versus open

FAIR and open are related but distinct. Open data is data anyone can freely access, use and redistribute. FAIR data is well-managed, well-described data with clear access conditions, which may or may not be open. The phrase “as open as possible, as closed as necessary”, popularised by the European Commission's Horizon 2020 guidance on research data management and now widely quoted alongside FAIR, captures the balance: maximise reuse while respecting legitimate constraints such as privacy, consent, commercial sensitivity or Indigenous data rights. A dataset of patient records can be made FAIR (richly described, identified, governed and licensed) without being openly downloadable. Conversely, dumping a file online makes it open but not necessarily FAIR if it lacks identifiers, metadata or a licence.
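The distinction can be sketched in a few lines: metadata are always returned, while the data location is released only when the dataset is open or the request has been approved. The Dataset class, its fields, the two functions, the example DOI and the example URL are all hypothetical, and a real repository would handle authentication and access requests through its own governance process.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical record pairing public metadata with possibly restricted data."""
    metadata: dict
    data_location: str
    open_access: bool = False


def get_metadata(dataset: Dataset) -> dict:
    """Metadata stay available to everyone, even for restricted data."""
    return dict(dataset.metadata)


def get_data_location(dataset: Dataset, approved: bool = False) -> str:
    """Release the data location only for open data or approved requests."""
    if dataset.open_access or approved:
        return dataset.data_location
    raise PermissionError("Access is controlled; see the access conditions in the metadata.")


patient_records = Dataset(
    metadata={
        "identifier": "10.1234/example-patient-records",  # made-up DOI
        "description": "De-identified patient records (illustrative)",
        "access_conditions": "Available on request, subject to ethics approval",
        "licence": "Custom data use agreement",
    },
    data_location="https://repository.example.org/restricted/records.csv",
)

# Anyone can read the description and the access conditions.
print(get_metadata(patient_records)["access_conditions"])

# The data themselves are released only after approval.
print(get_data_location(patient_records, approved=True))
```

The point of the design is that discovery does not depend on access: the record can be indexed, cited and assessed for suitability while the data remain governed.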

For researchers, adopting FAIR practice means assigning identifiers, writing good metadata, using standard formats and stating licences from the outset rather than at the end of a project. Guidance on preparing and describing data is available in our resources for authors, and FAIR data underpins the reproducibility goals discussed across our research-outputs coverage.

Frequently asked questions

What does FAIR stand for?

FAIR stands for Findable, Accessible, Interoperable and Reusable. The four principles, published by Wilkinson and colleagues in 2016, describe how research data and metadata should be managed so they can be discovered, retrieved, combined and reused effectively by both humans and machines.

Does FAIR mean the same as open data?

No. Open data can be freely accessed and reused by anyone, whereas FAIR data is well-described and well-managed with clear access conditions that may be restricted. A widely used guiding phrase is “as open as possible, as closed as necessary”, so sensitive data can still be FAIR.

Why are persistent identifiers important for FAIR data?

Persistent identifiers such as DOIs and ORCIDs provide stable, resolvable references that do not break when web addresses change. They underpin findability and provenance, letting data, researchers and outputs be reliably located and credited over the long term.

Can data be FAIR without being publicly downloadable?

Yes. FAIR requires clear access protocols and rich metadata, not unrestricted access. Metadata can remain findable and accessible even when the underlying dataset is controlled, so sensitive datasets can be made FAIR while access stays appropriately governed.
