Tag: five safes framework

  • Five Safes Framework: FAIR Access vs Privacy

    The Five Safes framework is a governance model — Safe People, Safe Projects, Safe Settings, Safe Data and Safe Outputs — that lets trusted research environments grant researchers FAIR access to sensitive data while keeping disclosure risk under continuous, auditable control. Rather than treating openness and privacy as opposing goals, it turns each into a checkable dimension, so a dataset can be findable and reusable in principle while remaining tightly access-controlled in practice.

    The Five Safes framework is a risk-management taxonomy that originated at the UK’s Office for National Statistics (ONS) and was formalised in the 2010s. It decomposes data-access decisions into five independent dimensions of risk rather than a single accept/reject gate. It is the governance logic underneath most UK trusted research environments (TREs), including the UK Data Service SecureLab, ONS’s Secure Research Service, Research Data Scotland, and the network of TREs coordinated by Health Data Research UK (HDR UK).

    What is the Five Safes framework?

    The Five Safes framework was set out formally by Tanvi Desai, Felix Ritchie and Richard Welpton in their 2016 working paper “Five Safes: designing data access for research”, which remains the primary methodological source. It reframes data access as five separable risk dimensions rather than a binary “share or withhold” decision.

    Each dimension is assessed independently, then combined. A weakness in one — for example, less rigorously screened outputs — can be offset by tightening another, such as restricting the setting to an air-gapped enclave. This modularity is what allows the same underlying dataset to support both a low-risk aggregate release and a high-risk record-level research project, governed by different combinations of the same five controls.
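
    The offsetting logic described above can be sketched in a few lines. This is an illustration only: the 0 to 3 scoring scale, the `min_total` threshold and the names `FiveSafesAssessment` and `access_is_acceptable` are assumptions made for this example, and the framework itself does not prescribe numeric scores.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FiveSafesAssessment:
    """Hypothetical 0-3 control-strength score per dimension (3 = strictest)."""
    people: int
    projects: int
    settings: int
    data: int
    outputs: int


def access_is_acceptable(a: FiveSafesAssessment, min_total: int = 10) -> bool:
    """Illustrative rule (an assumption, not part of the framework):
    no dimension may be left uncontrolled (score 0), and the combined
    strength must reach min_total, so a weaker dimension has to be
    offset by stronger controls elsewhere."""
    scores = [getattr(a, f.name) for f in fields(a)]
    if any(not 0 <= s <= 3 for s in scores):
        raise ValueError("scores must be between 0 and 3")
    return min(scores) > 0 and sum(scores) >= min_total


# Record-level data (weak Safe Data) offset by a strict setting and strict outputs.
record_level = FiveSafesAssessment(people=3, projects=2, settings=3, data=1, outputs=3)
# The same weak Safe Data, but with a loosely controlled setting as well.
loosely_held = FiveSafesAssessment(people=2, projects=2, settings=1, data=1, outputs=2)

print(access_is_acceptable(record_level))  # True  (total 12, no zero score)
print(access_is_acceptable(loosely_held))  # False (total 8)
```

    The point of the sketch is the shape of the decision, not the numbers: the same weak Safe Data score passes or fails depending on how tightly the other four dimensions are controlled.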

    The five dimensions explained

    Each “safe” answers a distinct governance question. Together they form the checklist that a trusted research environment applies before, during and after a project.

    Dimension | Core question | Typical TRE control
    Safe People | Is the researcher trustworthy and trained? | Accreditation, Safe Researcher Training, signed data-access agreements
    Safe Projects | Is the proposed use ethical, lawful and in the public interest? | Independent approvals panel, ethics review, public-benefit test
    Safe Settings | Is the technical environment controlled? | Air-gapped enclave, no local downloads, logged sessions
    Safe Data | Has disclosure risk in the dataset itself been reduced? | De-identification, pseudonymisation, statistical perturbation
    Safe Outputs | Could anything leaving the environment re-identify someone? | Manual/automated output-checking against small-cell disclosure rules

    No single safe carries the whole burden. Under the Five Safes model, a dataset that cannot be fully anonymised can still be used safely if the setting, people and outputs are controlled tightly enough to compensate — the logic that underwrites most modern TRE design.

    Five Safes in NHS secure data environments

    The 2022 Goldacre Review, Better, Broader, Safer: Using Health Data for Research and Analysis, recommended that NHS data for research move away from dissemination of pseudonymised extracts and into Five Safes-governed trusted research environments by default. NHS England’s subsequent secure data environment (SDE) policy, published as part of the Data Saves Lives strategy, requires that access to NHS health and care data for research and planning purposes take place inside approved SDEs rather than through bulk data transfers.

    This is Five Safes applied at national scale. Safe Settings replaces the old model of emailing or shipping extracts; Safe People and Safe Projects are enforced through SDE accreditation and project approval panels; and Safe Outputs is enforced through statistical disclosure control before any result leaves the environment. HDR UK’s federated TRE network and NHS England’s sub-national secure data environments both operate on this same five-dimension logic.
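
    One of the simplest statistical disclosure control checks applied to outputs is small-cell suppression. The sketch below assumes a threshold of 10 and a hypothetical function name `suppress_small_cells`; actual thresholds and rules vary between environments. It performs primary suppression only, so a real output check would also need to consider whether suppressed cells can be recovered from totals.

```python
from typing import Dict, Optional


def suppress_small_cells(
    counts: Dict[str, int], threshold: int = 10
) -> Dict[str, Optional[int]]:
    """Replace any non-zero count below `threshold` with None (suppressed).

    Zero counts are left as they are here; that is a simplifying
    assumption, since some rule sets treat zeros as disclosive too.
    """
    return {
        category: (None if 0 < n < threshold else n)
        for category, n in counts.items()
    }


# Made-up frequency table for illustration.
admissions_by_age_band = {"0-17": 3, "18-39": 42, "40-64": 118, "65+": 7}
checked = suppress_small_cells(admissions_by_age_band)
print(checked)
# {'0-17': None, '18-39': 42, '40-64': 118, '65+': None}
```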

    Reconciling FAIR access with disclosure control

    The FAIR principles — Findable, Accessible, Interoperable, Reusable — were published by Wilkinson et al. in Scientific Data (2016) to improve the value of research data for both humans and machines. FAIR’s “Accessible” criterion is frequently misread as “open”; the original principles explicitly state that access can require authentication and authorisation, provided the conditions are clearly documented.

    The Five Safes framework is the mechanism that satisfies that condition for sensitive data. It does not compete with FAIR — it operationalises the “A” in FAIR for data too sensitive to release openly.

    FAIR principle | Five Safes dimension that operationalises it | Practical mechanism
    Findable | Safe Data (metadata layer) | Catalogued metadata is public even when the underlying data is not
    Accessible | Safe People + Safe Projects | Documented accreditation and approval routes, not open download
    Interoperable | Safe Settings | Standardised formats and tooling inside the controlled enclave
    Reusable | Safe Outputs | Disclosure-checked results and code released for onward reuse
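
    The Findable and Accessible rows can be made concrete with a sketch of a public catalogue record for a dataset that itself stays closed. The field names, the `CatalogueEntry` class and the identifier value are hypothetical, chosen for illustration rather than taken from any real catalogue schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class CatalogueEntry:
    """Public metadata for a dataset whose records stay inside a TRE."""
    identifier: str        # persistent identifier (placeholder value below)
    title: str
    description: str
    access_route: str      # documented application process, not a download link
    access_conditions: List[str] = field(default_factory=list)
    data_publicly_downloadable: bool = False


entry = CatalogueEntry(
    identifier="doi:10.0000/example-dataset",  # hypothetical, not a real DOI
    title="Example linked admissions dataset",
    description="Pseudonymised, record-level; available only inside a TRE.",
    access_route="Apply to the data custodian's approvals panel",
    access_conditions=[
        "Researcher accreditation",
        "Approved project",
        "Output checking before release",
    ],
)

# The metadata can be published openly even though the records cannot.
print(json.dumps(asdict(entry), indent=2))
```

    The record documents who may get access and how, which is what the “Accessible” criterion asks for, while carrying no patient-level information.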

    Under Article 89 of the UK GDPR, processing personal data for research purposes must be subject to appropriate safeguards, and Article 9(2)(j) permits special-category data to be processed for research on that basis. In UK practice, a Five Safes-governed trusted research environment is a common way to evidence those safeguards: it lets institutions rely on the research provisions while still meeting data-protection obligations, which is why TREs, not open repositories, are now the default access route for identifiable or quasi-identifiable datasets.

    Assessing maturity: from principles to governance

    Because the five dimensions are qualitative by design, data custodians need a way to compare TREs consistently. Administrative Data Research UK (ADR UK) has developed a Five Safes maturity model that scores environments against each dimension, moving the framework from a descriptive checklist to an auditable governance standard. Many TREs also pursue ISO/IEC 27001 information-security certification to provide independent evidence for the Safe Settings dimension specifically.

    • ONS Secure Research Service — the original Five Safes implementation
    • UK Data Service SecureLab — Five Safes applied to social science and economic microdata
    • Research Data Scotland — devolved administrative-data TRE built on the same model
    • HDR UK’s TRE network and NHS England’s sub-national SDEs — Five Safes at health-data scale

    For research administrators negotiating data-sharing agreements, the maturity model matters more than the framework name: a self-declared “Five Safes-aligned” environment is not equivalent to one independently assessed against all five dimensions.

    Common questions about the Five Safes framework

    What are the five dimensions of the Five Safes framework?

    The five dimensions are Safe People, Safe Projects, Safe Settings, Safe Data and Safe Outputs. Each is assessed and controlled separately, so weaknesses in one dimension can be offset by stricter controls in another, rather than requiring every dimension to reach maximum safety independently.

    How does the Five Safes framework work in the NHS?

    NHS secure data environments apply Five Safes by requiring accredited researchers, approved projects, and controlled technical settings instead of releasing pseudonymised data extracts. Following the 2022 Goldacre Review, NHS England’s secure data environment policy makes this the default access route for NHS health and care data used in research.

    Is a trusted research environment the same as the Five Safes framework?

    No. A trusted research environment is the technical and organisational setting — the “Safe Setting” — while the Five Safes framework is the broader governance logic covering people, projects, data and outputs as well. A TRE is one implementation of the Safe Settings dimension, not the whole model.

    How does the Five Safes framework relate to the FAIR data principles?

    The Five Safes framework operationalises FAIR’s “Accessible” principle for sensitive data that cannot be openly released. Metadata stays findable and reusable outputs are disclosure-checked, while authorisation and accreditation, rather than open download, satisfy the accessibility requirement.

    Implications and outlook

    The direction of UK policy is unambiguous: dissemination of raw or lightly de-identified extracts is being phased out in favour of Five Safes-governed environments, first in health data and increasingly across administrative and social datasets held by ADR UK partners. For institutions, this means data-sharing agreements, ethics approvals and researcher training pathways increasingly need to be designed around the five dimensions from the outset, not retrofitted once a TRE is chosen.

    For publishers and funders assessing data-availability statements, understanding which of the five safes underpins a stated access route, rather than treating “available in a trusted research environment” as a single undifferentiated category, is becoming a necessary part of due diligence. The framework’s real value is not that it makes data open; it is that it makes the terms of controlled access explicit, auditable and consistent across institutions, which is the precondition FAIR access needs when the data itself cannot be open.

  • Trusted Research Environments Make NHS Data FAIR

    A trusted research environment (TRE) is a secure, access-controlled computing platform that lets approved researchers analyse sensitive data — such as NHS patient records — without ever copying, downloading, or exporting the underlying data. Analysts log in remotely, run their code against the data inside the environment, and only pre-checked, aggregated outputs leave the boundary. This is the mechanism that lets sensitive health datasets stay FAIR-findable and reusable while the data itself never crosses a governance line.

    A trusted research environment is a governed digital space in which pre-approved researchers query sensitive data under the Five Safes framework, with disclosure-checked outputs as the only route out. TREs are also known as secure data environments (SDEs), data safe havens, or secure research environments (SREs). The terms are functionally synonymous, though NHS England now prefers “secure data environment” in public-facing policy as more intuitive than the technical “TRE”.

    What is a trusted research environment and how does it work?

    A TRE inverts the traditional data-sharing model. Instead of sending a dataset to a researcher’s own machine, the researcher comes to the data. Code, statistical software, and disclosure-controlled outputs move; identifiable records do not.

    • No data extraction: raw records cannot be downloaded, copied, or emailed out of the environment.
    • Pre-installed analytical tooling: statistical packages and secure workspaces sit inside the perimeter, so researchers never need a local copy.
    • Output checking: a human or automated disclosure-control review screens every result before it is released, to confirm no individual can be re-identified.
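
    The output-checking step in the list above is often described as an airlock: results wait in a holding area until a checker approves them. A minimal sketch of that workflow follows. The `Airlock` class, its method names and the file names are hypothetical and do not describe any particular TRE’s software.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Airlock:
    """Hypothetical holding area between the TRE and the outside world."""
    _status: Dict[str, Status] = field(default_factory=dict)

    def submit(self, output_name: str) -> None:
        # Researchers can only request release; nothing leaves at this point.
        self._status[output_name] = Status.PENDING

    def review(self, output_name: str, approved: bool) -> None:
        # Called by an output checker (human or automated), not the researcher.
        if output_name not in self._status:
            raise KeyError(f"{output_name} was never submitted")
        self._status[output_name] = Status.APPROVED if approved else Status.REJECTED

    def release(self, output_name: str) -> str:
        # Release is refused unless the output has been explicitly approved.
        if self._status.get(output_name) is not Status.APPROVED:
            raise PermissionError(f"{output_name} has not passed output checking")
        return f"released:{output_name}"


airlock = Airlock()
airlock.submit("table1.csv")
airlock.submit("patient_rows.csv")
airlock.review("table1.csv", approved=True)
airlock.review("patient_rows.csv", approved=False)
print(airlock.release("table1.csv"))  # released:table1.csv
```

    Calling `airlock.release("patient_rows.csv")` raises `PermissionError`, which reflects the default-deny design: an output that is pending, rejected or never submitted cannot leave.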

    Peer-reviewed literature describes a TRE as “an environment supported by trained staff and agreed processes… to access sensitive data” — a framing echoed across UK academic TRE documentation.

    What is the Five Safes framework?

    The Five Safes framework is the governance model almost every UK TRE uses to structure access decisions — from the Office for National Statistics’ Secure Research Service to NHS regional secure data environments. It originated at the ONS and is now standard across the UK’s public-sector research data infrastructure.

    Safe | Question it answers | Typical control
    Safe Projects | Is the research in the public interest? | Independent research/ethics review of the proposal
    Safe People | Can this researcher be trusted? | Accreditation, training, background checks
    Safe Settings | Is the technical environment secure? | No internet egress, monitored virtual desktops, audit logging
    Safe Data | Is the data adequately de-identified? | Pseudonymisation, aggregation, statistical disclosure control
    Safe Outputs | Could the results re-identify anyone? | Manual or automated output review before release
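
    As an illustration of the Safe Data row, the sketch below pseudonymises an identifier with a keyed hash (HMAC-SHA256 from the Python standard library). Keyed hashing is one common technique, used here as an assumption; the source does not say which method any given TRE applies. The key and identifiers are made-up placeholders.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for `identifier` using HMAC-SHA256.

    The same identifier and key always give the same pseudonym, so records
    can still be linked, but the mapping cannot be recomputed without the key.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder key: in practice it would be held by the data custodian
# and never hard-coded or given to researchers.
key = b"example-key-held-by-the-data-custodian"

a = pseudonymise("EXAMPLE-0001", key)  # made-up identifiers
b = pseudonymise("EXAMPLE-0001", key)
c = pseudonymise("EXAMPLE-0002", key)
print(a == b, a == c)  # True False
```

    Pseudonymised records can still be re-identified by whoever holds the key or by linkage with other data, which is why this control sits alongside the other four safes rather than replacing them.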

    ADR UK notes that each of its national partnerships, as well as the ONS, operates a dedicated TRE built on Five Safes principles — the de facto standard, not one option among several.

    How does the NHS secure data environment programme work?

    NHS England’s SDE policy requires that access to NHS health and social care data for research and planning be provided through accredited secure data environments, rather than by disseminating extracted, pseudonymised datasets to individual organisations. This followed the 2022 Goldacre Review, “Better, Broader, Safer: Using Health Data for Research and Analysis,” which recommended TREs become the default route for accessing NHS data rather than the exception.

    The result is a two-tier structure now operating across England:

    • NHS England’s national SDE, holding national datasets for approved research uses.
    • Sub-national secure data environments (SNSDEs), regional environments aligned to Integrated Care Systems, giving researchers access to more granular, regionally linked data.

    Devolved nations run equivalent infrastructure: the Scottish National Safe Haven, Wales’ SAIL Databank at Swansea University, and Northern Ireland’s Honest Broker Service each function as a jurisdictional TRE under comparable governance.

    How do TREs make sensitive data FAIR without moving it?

    The FAIR data principles (Findable, Accessible, Interoperable, Reusable), formalised by Wilkinson et al. in Scientific Data (2016), are often read as if they applied only to open datasets that can be freely retrieved. Sensitive health data cannot satisfy that open-access reading; a TRE instead lets each principle apply to the metadata and governance layer rather than the raw record. The underlying point is that FAIR does not require open data: it requires a documented, machine-actionable pathway to reuse, and a TRE supplies exactly that for data which must stay closed.

    • Findable: TREs publish dataset-level metadata in public catalogues — for example, the HDR UK Innovation Gateway — with persistent identifiers, so a dataset’s existence, structure, and provenance are discoverable even though the records inside are never exposed.
    • Accessible: “accessible” is redefined as a documented, auditable application and accreditation process (Safe People, Safe Projects) rather than an open download link — the process itself is transparent even where the data is not.
    • Interoperable: common data models and coding standards (such as OMOP or SNOMED CT mappings used across NHS TREs) let approved analyses run consistently across multiple environments, enabling federated analysis without pooling raw data in one place.
    • Reusable: version-controlled analytical code, output logs, and data dictionaries are retained and, increasingly, shared openly by researchers even when the underlying data cannot be — supporting reproducibility and future reuse of the method, if not the dataset.
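
    The federated analysis mentioned under Interoperable can be sketched as follows: each environment releases only an aggregate, and the aggregates are combined outside. The function names and values are hypothetical, and in practice each released summary would itself pass through output checking first.

```python
from typing import Iterable, List, Tuple


def local_summary(values: Iterable[float]) -> Tuple[int, float]:
    """Runs inside one TRE: returns only (count, sum), never the records."""
    values = list(values)
    return len(values), float(sum(values))


def pooled_mean(summaries: List[Tuple[int, float]]) -> float:
    """Runs outside the TREs, on the released aggregates only."""
    total_n = sum(n for n, _ in summaries)
    if total_n == 0:
        raise ValueError("no observations")
    return sum(s for _, s in summaries) / total_n


# Made-up measurements held in two separate environments.
tre_a = local_summary([4.0, 6.0, 8.0])
tre_b = local_summary([10.0, 12.0])
print(pooled_mean([tre_a, tre_b]))  # 8.0
```

    This only works if both environments code the variable the same way, which is the practical reason common data models and coding standards matter for cross-TRE work.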

    This mapping is the load-bearing argument of the TRE model: sensitive data sharing and open FAIR data are not opposites. The TRE is the governance boundary that lets FAIR’s discovery and reuse guarantees operate at the metadata and code layer while Five Safes controls operate at the record layer.

    How does OpenSAFELY demonstrate the model in practice?

    OpenSAFELY, built by researchers at the Bennett Institute for Applied Data Science at the University of Oxford in response to the COVID-19 pandemic, is the most cited working example of this architecture. Rather than extracting GP records, OpenSAFELY runs analytical code inside the secure environments of the electronic health record software suppliers themselves, executing studies against the pseudonymised primary-care record for a very large proportion of England’s registered patients, without the data ever leaving NHS-contracted infrastructure.

    Its methods and code repositories are published openly, so the analytical logic is fully FAIR — reusable and auditable by anyone — even though the patient-level data it runs against never is. That split is the clearest public demonstration of “FAIR governance, closed data” in UK health research.

    Common questions about trusted research environments

    What is the difference between an SDE and a TRE?

    An SDE and a TRE describe the same underlying architecture. SDE is the term NHS England now favours as clearer for non-specialist audiences, while TRE remains standard in academic and technical documentation, including for workspace-level “research TREs” built for a single project inside a broader SDE.

    Is a data safe haven the same as a trusted research environment?

    Yes. Data safe haven is an earlier, still widely used UK term for the same model, applied to environments such as the Scottish National Safe Haven. All three terms (data safe haven, TRE and SDE) describe a controlled computing space governed by comparable accreditation, de-identification, and output-checking controls, typically under a Five Safes-style framework.

    What is required to build a trusted research environment?

    Building a compliant TRE requires an on-premises or cloud-hosted secure computing platform with no unmonitored internet egress, encrypted data at rest and in transit, role-based access controls, and a formal output-checking process. King’s College London’s CREATE TRE, for example, operates under ISO 27001 certification to evidence these controls externally.
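
    Role-based access control, one of the requirements listed above, can be illustrated with a small default-deny lookup. The role and action names are assumptions made for this sketch, not a standard TRE role model.

```python
from typing import Dict, Set

# Hypothetical roles and the actions each may perform inside the environment.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "researcher": {"run_analysis", "submit_output"},
    "output_checker": {"review_output", "release_output"},
    "data_manager": {"load_data"},
}


def is_allowed(role: str, action: str) -> bool:
    """Default-deny check: unknown roles and unlisted actions are refused."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("researcher", "run_analysis"))    # True
print(is_allowed("researcher", "release_output"))  # False
print(is_allowed("visitor", "load_data"))          # False
```

    Keeping `release_output` out of the researcher role is a separation-of-duties choice: the person who produces a result is not the person who decides whether it can leave.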

    What does “trusted research” mean in UK government usage?

    Separately from the TRE data-access model, the UK government’s “Trusted Research” guidance is a framework protecting intellectual property and research security in international collaborations, distinct from — but sometimes confused with — the data-governance meaning of “trusted research environment” discussed here.

    What this means for research administrators and funders

    For institutions handling sensitive datasets, FAIR compliance and data protection obligations are no longer competing priorities. A properly governed TRE lets a research office satisfy funder FAIR-data mandates — citing metadata, persistent identifiers, and documented reuse pathways — while meeting UK GDPR, common-law confidentiality, and NHS information-governance duties simultaneously. Research administrators evaluating data-access requests should treat “does this dataset sit behind an accredited TRE with Five Safes controls” as a first-order question, not an afterthought.

    As sub-national secure data environments mature across England’s Integrated Care Systems, and equivalent infrastructure federates across the devolved nations, the interoperability layer — common data models, shared metadata standards, cross-TRE federated analysis — is the area most likely to determine whether the FAIR promise of these environments is fully realised.