  • Trusted Research Environments Make NHS Data FAIR

    A trusted research environment (TRE) is a secure, access-controlled computing platform that lets approved researchers analyse sensitive data, such as NHS patient records, without ever copying, downloading, or exporting the underlying data. Analysts log in remotely, run their code against the data inside the environment, and only pre-checked, aggregated outputs leave the boundary. This is the mechanism that lets sensitive health datasets stay FAIR, meaning findable and reusable, while the data itself never crosses a governance line.

    Put more formally, a trusted research environment is a governed digital space in which pre-approved researchers query sensitive data under the Five Safes framework, with disclosure-checked outputs as the only route out. TREs are also known as secure data environments (SDEs), data safe havens, or secure research environments (SREs). The terms are functionally synonymous, though NHS England now prefers “secure data environment” in public-facing policy as more intuitive than the technical “TRE”.

    What is a trusted research environment and how does it work?

    A TRE inverts the traditional data-sharing model. Instead of sending a dataset to a researcher’s own machine, the researcher comes to the data. Code, statistical software, and disclosure-controlled outputs move; identifiable records do not.

    • No data extraction: raw records cannot be downloaded, copied, or emailed out of the environment.
    • Pre-installed analytical tooling: statistical packages and secure workspaces sit inside the perimeter, so researchers never need a local copy.
    • Output checking: a human or automated disclosure-control review screens every result before it is released, to confirm no individual can be re-identified.
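
    The output-checking step above can be sketched as a small-count suppression rule. This is a minimal illustration only: the threshold of 10, the function names, and the synthetic records are all assumptions, and each environment sets its own disclosure-control policy.

```python
from collections import Counter

# Hypothetical threshold: the real minimum cell count is set by each
# environment's own output-checking policy.
MIN_CELL_COUNT = 10


def aggregate_counts(records, key):
    """Count records per category; this step runs inside the environment."""
    return Counter(record[key] for record in records)


def disclosure_check(counts, threshold=MIN_CELL_COUNT):
    """Replace any count below the threshold with None (suppressed)."""
    return {
        category: (n if n >= threshold else None)
        for category, n in counts.items()
    }


# Synthetic rows standing in for record-level data that never leaves.
records = (
    [{"region": "North"}] * 42
    + [{"region": "South"}] * 17
    + [{"region": "Islands"}] * 3
)

# Only this checked, aggregated table would be eligible for release.
released = disclosure_check(aggregate_counts(records, "region"))
print(released)
```

    Here the three "Islands" records fall below the assumed threshold, so that cell is suppressed before release. A real review would likely also consider whether suppressed cells can be recovered from row or column totals, which this sketch does not attempt.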

    Peer-reviewed literature describes a TRE as “an environment supported by trained staff and agreed processes… to access sensitive data” — a framing echoed across UK academic TRE documentation.

    What is the Five Safes framework?

    The Five Safes framework is the governance model almost every UK TRE uses to structure access decisions — from the Office for National Statistics’ Secure Research Service to NHS regional secure data environments. It originated at the ONS and is now standard across the UK’s public-sector research data infrastructure.

    Safe           Question it answers                      Typical control
    -------------  ---------------------------------------  ------------------------------------------------------------
    Safe Projects  Is the research in the public interest?  Independent research/ethics review of the proposal
    Safe People    Can this researcher be trusted?          Accreditation, training, background checks
    Safe Settings  Is the technical environment secure?     No internet egress, monitored virtual desktops, audit logging
    Safe Data      Is the data adequately de-identified?    Pseudonymisation, aggregation, statistical disclosure control
    Safe Outputs   Could the results re-identify anyone?    Manual or automated output review before release
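
    One way to picture how the five dimensions combine is as a checklist in which every Safe must pass before access is granted. The sketch below is illustrative only: the `AccessRequest` class and its field names are hypothetical, and real access decisions rest on committee and accreditation processes rather than boolean flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical summary of one access application (illustrative fields)."""
    project_approved: bool        # Safe Projects: research/ethics review passed
    researcher_accredited: bool   # Safe People: accreditation and training done
    secure_setting: bool          # Safe Settings: work happens inside the TRE
    data_deidentified: bool       # Safe Data: pseudonymised or aggregated
    output_checking_agreed: bool  # Safe Outputs: results go through review


# Maps each Safe to the (assumed) field that records its outcome.
SAFE_TO_FIELD = {
    "Safe Projects": "project_approved",
    "Safe People": "researcher_accredited",
    "Safe Settings": "secure_setting",
    "Safe Data": "data_deidentified",
    "Safe Outputs": "output_checking_agreed",
}


def failed_safes(request):
    """Return the names of any Safes the request does not yet satisfy."""
    return [
        safe for safe, attr in SAFE_TO_FIELD.items()
        if not getattr(request, attr)
    ]


def grant_access(request):
    """Access is granted only when all five Safes are satisfied."""
    return not failed_safes(request)


request = AccessRequest(
    project_approved=True,
    researcher_accredited=False,  # e.g. training not yet completed
    secure_setting=True,
    data_deidentified=True,
    output_checking_agreed=True,
)
print(failed_safes(request))
print(grant_access(request))
```

    The design point is that the Safes are conjunctive: a strong score on one dimension does not compensate for a failure on another in this simplified model.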

    ADR UK notes that each of its national partnerships, as well as the ONS, operates a dedicated TRE built on Five Safes principles — the de facto standard, not one option among several.

    How does the NHS secure data environment programme work?

    NHS England’s SDE policy requires that access to NHS health and social care data for research and planning be provided through accredited secure data environments, rather than by disseminating extracted, pseudonymised datasets to individual organisations. This followed the 2022 Goldacre Review, “Better, Broader, Safer: Using Health Data for Research and Analysis,” which recommended TREs become the default route for accessing NHS data rather than the exception.

    The result is a two-tier structure now operating across England:

    • NHS England’s national SDE, holding national datasets for approved research uses.
    • Sub-national secure data environments (SNSDEs), regional environments aligned to Integrated Care Systems, giving researchers access to more granular, regionally linked data.

    Devolved nations run equivalent infrastructure: the Scottish National Safe Haven, Wales’ SAIL Databank at Swansea University, and Northern Ireland’s Honest Broker Service each function as a jurisdictional TRE under comparable governance.

    How do TREs make sensitive data FAIR without moving it?

    The FAIR data principles (Findable, Accessible, Interoperable, Reusable), formalised by Wilkinson et al. in Scientific Data (2016), are often read as applying to open datasets that can be freely retrieved. Sensitive health data cannot satisfy FAIR in that literal, open-access sense; a TRE lets each principle apply to the metadata and governance layer instead of the raw record. This is the key architectural point: FAIR does not require open data, it requires a documented, machine-actionable pathway to reuse, and a TRE supplies exactly that for data which must stay closed.

    • Findable: TREs publish dataset-level metadata in public catalogues — for example, the HDR UK Innovation Gateway — with persistent identifiers, so a dataset’s existence, structure, and provenance are discoverable even though the records inside are never exposed.
    • Accessible: “accessible” is redefined as a documented, auditable application and accreditation process (Safe People, Safe Projects) rather than an open download link — the process itself is transparent even where the data is not.
    • Interoperable: common data models and coding standards (such as OMOP or SNOMED CT mappings used across NHS TREs) let approved analyses run consistently across multiple environments, enabling federated analysis without pooling raw data in one place.
    • Reusable: version-controlled analytical code, output logs, and data dictionaries are retained and, increasingly, shared openly by researchers even when the underlying data cannot be — supporting reproducibility and future reuse of the method, if not the dataset.
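
    The "Findable" and "Accessible" points can be illustrated with a dataset-level metadata record that is public and searchable while containing no patient rows. This is a sketch under stated assumptions: the `DatasetMetadata` fields, the placeholder identifier, and the `search` helper are invented for illustration and do not follow any particular catalogue's schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical public catalogue entry; it holds no record-level data."""
    identifier: str    # persistent identifier (placeholder value below)
    title: str
    description: str
    access_route: str  # the documented application process, not a download
    data_model: str    # e.g. a common data model name
    variables: list = field(default_factory=list)


catalogue = [
    DatasetMetadata(
        identifier="doi:10.0000/example-dataset",  # placeholder, not a real DOI
        title="Example regional admissions dataset",
        description="Pseudonymised hospital admissions, held inside a TRE.",
        access_route="Apply via the environment's data access committee",
        data_model="OMOP",
        variables=["admission_date", "age_band", "region"],
    ),
]


def search(entries, term):
    """Find datasets whose public metadata mentions the search term."""
    term = term.lower()
    return [
        entry for entry in entries
        if term in entry.title.lower() or term in entry.description.lower()
    ]


matches = search(catalogue, "admissions")
print(json.dumps([asdict(entry) for entry in matches], indent=2))
```

    Anyone can discover that the dataset exists, what it contains, and how to apply, yet nothing in the record exposes an individual.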

    This mapping is the load-bearing argument of the TRE model: sensitive data sharing and open FAIR data are not opposites. The TRE is the governance boundary that lets FAIR’s discovery and reuse guarantees operate at the metadata and code layer while Five Safes controls operate at the record layer.
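
    Federated analysis without pooling raw data can be sketched as follows: each environment computes a local aggregate, and only those aggregates are combined. This is a deliberately simplified illustration with hypothetical function names and synthetic values, not a description of any specific platform's protocol.

```python
def local_summary(values):
    """Runs inside one environment: returns aggregates, never the rows."""
    return {"n": len(values), "total": sum(values)}


def pooled_mean(summaries):
    """Runs at the coordinating site: combines per-site aggregates."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / n


# Synthetic lengths of stay held in two separate environments.
site_a = [5, 7, 9, 11]
site_b = [4, 6]

# Only these small summary dictionaries cross each site's boundary.
summaries = [local_summary(site_a), local_summary(site_b)]
result = pooled_mean(summaries)
print(result)  # 7.0
```

    The pooled mean matches what a single combined dataset would give, which is why shared data models matter: the same `local_summary` logic has to mean the same thing in every environment. In practice each site's summary would presumably pass through its own output check before release.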

    How does OpenSAFELY demonstrate the model in practice?

    OpenSAFELY, built by researchers at the Bennett Institute for Applied Data Science at the University of Oxford in response to the COVID-19 pandemic, is the most cited working example of this architecture. Rather than extracting GP records, OpenSAFELY runs analytical code inside the secure environments of the electronic health record software suppliers themselves. Studies execute against the pseudonymised primary-care records of a very large proportion of England’s registered patients, and the data never leaves NHS-contracted infrastructure.

    Its methods and code repositories are published openly, so the analytical logic is fully FAIR — reusable and auditable by anyone — even though the patient-level data it runs against never is. That split is the clearest public demonstration of “FAIR governance, closed data” in UK health research.

    Common questions about trusted research environments

    What is the difference between an SDE and a TRE?

    An SDE and a TRE describe the same underlying architecture. SDE is the term NHS England now favours as clearer for non-specialist audiences, while TRE remains standard in academic and technical documentation, where it also covers workspace-level “research TREs” built for a single project inside a broader SDE.

    Is a data safe haven the same as a trusted research environment?

    Yes — data safe haven is an earlier, still widely used UK term for the same model, applied to environments such as the Scottish National Safe Haven. All three terms describe a controlled computing space governed by comparable accreditation, de-identification, and output-checking controls, typically under a Five Safes-style framework.

    What is required to build a trusted research environment?

    Building a compliant TRE requires an on-premises or cloud-hosted secure computing platform with no unmonitored internet egress, encryption of data at rest and in transit, role-based access controls, and a formal output-checking process. King’s College London’s CREATE TRE, for example, operates under ISO 27001 certification to evidence these controls externally.

    What does “trusted research” mean in UK government usage?

    Separately from the TRE data-access model, the UK government’s “Trusted Research” guidance is a framework for protecting intellectual property and research security in international collaborations. It is distinct from, but sometimes confused with, the data-governance meaning of “trusted research environment” discussed here.

    What this means for research administrators and funders

    For institutions handling sensitive datasets, FAIR compliance and data protection obligations are no longer competing priorities. A properly governed TRE lets a research office satisfy funder FAIR-data mandates — citing metadata, persistent identifiers, and documented reuse pathways — while meeting UK GDPR, common-law confidentiality, and NHS information-governance duties simultaneously. Research administrators evaluating data-access requests should treat “does this dataset sit behind an accredited TRE with Five Safes controls” as a first-order question, not an afterthought.

    As sub-national secure data environments mature across England’s Integrated Care Systems, and equivalent infrastructure federates across the devolved nations, the interoperability layer — common data models, shared metadata standards, cross-TRE federated analysis — is the area most likely to determine whether the FAIR promise of these environments is fully realised.