A data trust is a legal and technical framework in which an independent trustee, bound by fiduciary duty, makes decisions about a pool of data on behalf of the people or organisations who contributed it. For research data, this offers a genuine alternative to depositing datasets individually in a repository: instead of each contributor negotiating access terms alone, a trustee stewards shared data collectively, with accountability built into the governance structure itself.
The Open Data Institute (ODI), which has developed and refined the term since 2018, defines it more precisely: a data trust is an independent steward that holds data under a formal duty of impartiality, prudence, transparency and undivided loyalty to the beneficiaries whose data it manages.
- What is a data trust?
- How does a data trust govern research data differently from repository deposit?
- Data sharing agreement vs data processing agreement: where does a data trust fit?
- What does a data trust mean for FAIR data stewardship?
- Indigenous data sovereignty and the CARE Principles
- Answer-first Q&A
- Implications and outlook for research administrators
What is a data trust?
A data trust is a legal structure in which one party authorises an independent trustee to make decisions about data on their behalf, for the benefit of a defined group of stakeholders. The ODI, which published its first explainer on the concept in July 2018 and adopted a working definition later that year, models the idea on established asset trusts such as land trusts, transposing the same fiduciary logic onto data.
The clearest working example is UK Biobank, established in 2006 as a charitable company with trustees to steward genetic data and biological samples from around 500,000 participants. The ODI itself piloted the concept with the UK Government’s Office for AI, reporting its findings in April 2019, to test whether fiduciary stewardship could work as applied governance rather than theory alone. Separately, the University of Cambridge’s Data Trusts Initiative has examined data trusts as a mechanism for pooling individuals’ legal data rights into a single negotiating and stewardship entity.
How does a data trust govern research data differently from repository deposit?
Under the standard deposit model, a researcher or institution submits a dataset to a repository, which applies institutional policy and a licence to govern reuse — the repository itself owes no fiduciary duty to depositors. Under a data trust, an independent trustee holds ongoing decision-making authority over the pooled data and is legally obliged to act in the beneficiaries’ interests, not merely to apply a static licence at the point of deposit.
This distinction matters most for sensitive, re-identifiable, or commercially valuable research data, where a one-off licence cannot anticipate every future access request. A trust structure allows collective, ongoing renegotiation of terms as new uses arise, rather than requiring each depositor to individually vet every downstream request.
| Feature | Data trust | Repository deposit |
|---|---|---|
| Legal basis | Formal trust or fiduciary agreement | Institutional policy plus a data licence |
| Decision-maker | Independent trustee(s) with ongoing authority | Depositor sets terms once, at submission |
| Fiduciary duty | Yes — legally binding to beneficiaries | No — repository is a custodian, not a fiduciary |
| Best suited to | Sensitive, re-identifiable, or contested data | Open, low-risk, citation-ready datasets |
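The contrast in the table can be sketched in code. This is a minimal illustration, not a description of any real repository or trust platform: the class names `RepositoryDeposit` and `DataTrust`, the request fields, and the purposes are all hypothetical. It shows the structural difference only: a deposit's terms are fixed at submission, while a trustee can revise a trust's terms as new uses arise and keeps a record of each decision.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AccessRequest:
    """A hypothetical downstream request to reuse a dataset."""
    requester: str
    purpose: str


@dataclass(frozen=True)
class RepositoryDeposit:
    """Deposit model: permitted purposes are fixed once, at submission."""
    permitted_purposes: frozenset

    def allows(self, request: AccessRequest) -> bool:
        return request.purpose in self.permitted_purposes


@dataclass
class DataTrust:
    """Trust model: the trustee may revise terms on the beneficiaries' behalf."""
    permitted_purposes: set = field(default_factory=set)
    decisions: list = field(default_factory=list)

    def revise_terms(self, purpose: str, permit: bool) -> None:
        # Stands in for a trustee decision taken after consulting beneficiaries.
        if permit:
            self.permitted_purposes.add(purpose)
        else:
            self.permitted_purposes.discard(purpose)

    def allows(self, request: AccessRequest) -> bool:
        decision = request.purpose in self.permitted_purposes
        # Every decision is logged, so the trustee remains accountable for it.
        self.decisions.append((request.requester, request.purpose, decision))
        return decision


deposit = RepositoryDeposit(frozenset({"academic research"}))
trust = DataTrust({"academic research"})
request = AccessRequest("lab-a", "public-health surveillance")

print(deposit.allows(request))  # False: not anticipated at deposit
print(trust.allows(request))    # False: not yet permitted
trust.revise_terms("public-health surveillance", permit=True)
print(trust.allows(request))    # True: terms revised by the trustee
```

The frozen dataclass is a deliberate design choice in this sketch: it makes the deposit's terms immutable after creation, mirroring a licence applied once at submission.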
Data sharing agreement vs data processing agreement: where does a data trust fit?
A data sharing agreement sets out the terms under which two or more parties exchange data they each control. A data processing agreement is narrower: required under UK GDPR Article 28 wherever a processor handles personal data on a controller’s behalf, it binds the processor to act only on the controller’s documented instructions.
A data trust does not replace either instrument; it changes who holds the authority to agree them. Rather than each institution separately negotiating a data sharing agreement for every new research collaboration, the trustee negotiates and monitors compliance centrally, on behalf of all contributors, reducing duplicated legal effort across a research consortium.
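One way to see the reduction in duplicated effort is to count agreements. The sketch below rests on a simplifying assumption that is not taken from any source cited here: without a trust, every pair of institutions in a consortium negotiates its own data sharing agreement, whereas with a trust each institution signs a single agreement with the trustee. Real consortia often use multilateral agreements, so the gap shown is best read as an upper bound.

```python
def bilateral_agreements(institutions: int) -> int:
    """Agreements needed if every pair negotiates separately: n(n-1)/2."""
    return institutions * (institutions - 1) // 2


def trust_agreements(institutions: int) -> int:
    """Agreements needed if each institution contracts once with the trustee."""
    return institutions


for n in (3, 5, 10):
    print(n, bilateral_agreements(n), trust_agreements(n))
# 3 3 3
# 5 10 5
# 10 45 10
```

Under this assumption the two models cost the same for three partners, and the trust's advantage grows quadratically as the consortium gets larger.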
What does a data trust mean for FAIR data stewardship?
The FAIR Principles — Findable, Accessible, Interoperable, Reusable, formalised by Wilkinson and colleagues in Scientific Data in 2016 — govern how research data should be described and made available, but they do not specify who decides access terms. A data trust supplies exactly that missing governance layer.
- Findability and interoperability metadata can still be maintained in a conventional repository even where the trust governs access rights.
- Accessibility becomes a trustee decision rather than a fixed licence, allowing tiered or conditional access for sensitive datasets that would otherwise be withheld entirely.
- Reusability is strengthened where beneficiaries trust the stewardship arrangement enough to contribute richer, less redacted data in the first place.
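The division of labour in the list above can be sketched as follows. This is an illustrative model only: the field names, the tier labels (`open`, `conditional`, `restricted`), the approval flags, and the example identifier are hypothetical and are not drawn from the FAIR Principles or from any repository schema. The descriptive record stays public in the repository, while the access decision is delegated to a trustee function.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    OPEN = "open"
    CONDITIONAL = "conditional"
    RESTRICTED = "restricted"


@dataclass(frozen=True)
class DatasetRecord:
    """Public, repository-held description; findable whatever the tier."""
    identifier: str  # hypothetical persistent identifier
    title: str
    tier: Tier


def trustee_decision(record: DatasetRecord, ethics_approved: bool,
                     trustees_approved: bool) -> bool:
    """Return True if access is granted under this illustrative policy."""
    if record.tier is Tier.OPEN:
        return True
    if record.tier is Tier.CONDITIONAL:
        return ethics_approved
    # RESTRICTED: needs ethics approval and an explicit trustee decision.
    return ethics_approved and trustees_approved


cohort = DatasetRecord("example:cohort-001", "Example cohort survey", Tier.RESTRICTED)
print(cohort.title, "is described publicly; access tier:", cohort.tier.value)
print(trustee_decision(cohort, ethics_approved=True, trustees_approved=False))  # False
print(trustee_decision(cohort, ethics_approved=True, trustees_approved=True))   # True
```

A real trust would weigh far more than two flags, but the shape is the point: the metadata record never changes with the decision, so findability is unaffected by how restrictive the access tier is.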
Institutions bound by research data management policy obligations, including UKRI’s Common Principles on Data Policy, may be able to use a data trust to meet funder access requirements without forcing full open deposit of sensitive material. Whether a given arrangement satisfies a given funder is something to confirm against that funder’s own policy.
Indigenous data sovereignty and the CARE Principles
The Global Indigenous Data Alliance published the CARE Principles — Collective Benefit, Authority to Control, Responsibility, and Ethics — in 2019, explicitly to complement FAIR by centring people and purpose rather than data alone. CARE was developed in direct response to concerns that FAIR-only stewardship could enable extraction of Indigenous data without consent or benefit-sharing.
A data trust structure is one of the few governance mechanisms that can operationalise CARE’s “Authority to Control” principle in practice: it gives a defined community, rather than a repository operator, the standing to appoint trustees and set binding terms. This angle is rarely covered in generic data-trust explainers, most of which address corporate or civic data rather than research data sovereignty.
Answer-first Q&A
What is a data trust?
A data trust is a legal and technical structure that manages data on behalf of contributors through an independent trustee. The trustee holds a fiduciary duty — impartiality, prudence, transparency, and undivided loyalty — to the people or organisations whose data is pooled, rather than to any single commercial interest.
What is the data trust structure?
The structure places data under the control of a board of trustees who owe a fiduciary responsibility to the beneficiaries. Terms of access, use, and onward sharing are set collectively and can be renegotiated over time, unlike a fixed licence attached to a single dataset at deposit.
What is a public data trust?
A public data trust is governed by community, government, or non-profit board members committed to widening access to data affecting a defined population. In a research setting, this model supports population studies, public-health cohorts, and civic datasets where public benefit and consent are central governance concerns.
What is the role of a data trustee?
A data trustee manages, protects, and ensures the integrity and appropriate use of pooled data. Trustees identify sensitivity and risk, approve or decline access requests, and enforce the trust’s terms — a standing, ongoing role rather than a one-time licensing decision made at the point of deposit.
Implications and outlook for research administrators
For research administrators, the practical implication is that data trusts are not a substitute for repository infrastructure — findability, persistent identifiers, and metadata still depend on conventional deposit systems. What a trust adds is a governance layer above the infrastructure, suited to consortium data, population cohorts, and datasets involving Indigenous or otherwise sovereignty-sensitive communities.
Institutions weighing a data trust model should expect higher upfront legal cost than a standard repository licence, offset against lower recurring negotiation cost across a multi-year, multi-partner project. As FAIR-compliant infrastructure matures and CARE-aligned governance expectations grow, data trusts are likely to remain a minority but increasingly cited option for exactly the categories of research data — sensitive, collectively owned, or community-governed — that pure open deposit handles least well.
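The cost trade-off can be framed as a simple break-even calculation. The sketch below is a rough planning aid under stated assumptions, not a costing method from any source: it assumes the trust's extra set-up cost and the saving per negotiated agreement are both known and constant, and the figures in the example are made-up units chosen only to show the arithmetic.

```python
import math


def break_even_agreements(extra_upfront_cost: float,
                          saving_per_agreement: float) -> int:
    """Smallest number of agreements at which cumulative savings
    cover the trust's extra set-up cost."""
    if saving_per_agreement <= 0:
        raise ValueError("no break-even without a positive per-agreement saving")
    return math.ceil(extra_upfront_cost / saving_per_agreement)


# Hypothetical units: 40 extra up front, 3 saved on each agreement.
print(break_even_agreements(extra_upfront_cost=40, saving_per_agreement=3))  # 14
```

A project expecting fewer agreements than the break-even figure over its lifetime would, on cost alone, be better served by a standard repository licence.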