Indigenous Data Sovereignty: Why FAIR Needs CARE

Indigenous data sovereignty is the right of Indigenous peoples and nations to govern the collection, ownership, interpretation, and application of data about their own communities, lands, and knowledge. Blanket “open by default” research-data mandates built on the FAIR Data Principles can override that right when they treat findability and accessibility as unconditional. The fix is not to abandon FAIR, but to add a CARE-informed consent layer — tiered access controls, negotiated data-sharing agreements, and governance authority held by the originating community — that sits inside FAIR’s own accessibility principle rather than outside it.

As funders push open-data compliance deeper into grant conditions, research offices increasingly have to reconcile a mandate to publish with a community’s right to say no, say later, or say “only under these conditions.”

What is Indigenous data sovereignty?

Indigenous data sovereignty describes the inherent right of Indigenous peoples to govern data about their own communities, resources, and lands — a right that derives from tribal and national self-determination rather than from any single data-protection statute. The Global Indigenous Data Alliance (GIDA) traces the movement’s institutional roots to country-specific networks: the Aotearoa New Zealand-based Te Mana Raraunga (Māori Data Sovereignty Network, formed 2015), Australia’s Maiam nayri Wingara Aboriginal and Torres Strait Islander Data Sovereignty Collective (2017), Canada’s First Nations Information Governance Centre, and the US Indigenous Data Sovereignty Network.

These networks converged on a shared position: data collected about Indigenous peoples should remain subject to the governance of the nation or community it describes — including tribal law — not solely the policies of the funder, institution, or repository that hosts it. This is a governance claim, not merely a privacy preference, and it applies whether the data in question is health records, environmental monitoring, ceremonial knowledge, or genomic samples.

How do CARE principles relate to FAIR data principles?

The CARE Principles for Indigenous Data Governance — Collective Benefit, Authority to Control, Responsibility, and Ethics — were developed specifically to sit alongside the FAIR Data Principles (Findable, Accessible, Interoperable, Reusable), not to replace them. The Research Data Alliance’s International Indigenous Data Sovereignty Interest Group formalised CARE in 2019 to address what FAIR, on its own, does not: who benefits, who decides, and under what ethical obligations data circulates.

| Principle set | Primary question it answers | Governing focus |
| --- | --- | --- |
| FAIR (Findable, Accessible, Interoperable, Reusable) | How usable is the data, technically? | Data as an object |
| CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) | Who benefits, and who decides? | Data as a relationship |

Framing these as rivals misreads FAIR’s own text. FAIR principle A1.2 states that the access protocol “allows for an authentication and authorisation procedure, where necessary”, so FAIR was never a synonym for unconditional open access. Data can be fully findable, with rich metadata, a persistent identifier, and a documented access route, while the underlying content sits behind a governed permission gate. That gap between “discoverable” and “downloadable” is precisely where a CARE-informed consent layer belongs.
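The split between discoverable and downloadable can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any repository's API: `DatasetRecord`, `discover`, `download`, and the identifier and access-route values are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    identifier: str    # persistent identifier (placeholder value below)
    title: str
    access_route: str  # documented route for requesting access
    restricted: bool   # True when the content sits behind a permission gate


def discover(record: DatasetRecord) -> dict:
    """Return the public metadata. Always available, so the record stays findable."""
    return {
        "identifier": record.identifier,
        "title": record.title,
        "access_route": record.access_route,
        "restricted": record.restricted,
    }


def download(record: DatasetRecord, authorised: bool) -> str:
    """Release content only once authorisation has been granted, where required."""
    if record.restricted and not authorised:
        raise PermissionError(f"Request access via: {record.access_route}")
    return f"<contents of {record.identifier}>"


# Hypothetical record: findable by anyone, downloadable only when authorised.
record = DatasetRecord(
    identifier="example-id-0001",
    title="Example environmental monitoring dataset",
    access_route="contact the named governance body",
    restricted=True,
)
print(discover(record)["title"])  # metadata is visible without authorisation
```

Calling `download(record, authorised=False)` raises `PermissionError` and points the requester at the documented access route, while `discover(record)` keeps working for everyone.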

Do open data mandates override Indigenous data sovereignty?

Open data mandates do not automatically override Indigenous data sovereignty, but poorly designed ones can function that way in practice. Funder policies such as UKRI’s research data policy and cOAlition S’s Plan S commitments frame data availability in “as open as possible, as restricted as necessary” terms. That formulation already anticipates legitimate restriction, yet institutions frequently implement it as a default push toward maximal openness.

PLOS’s own editorial position, published in its EveryONE blog in October 2023, states plainly that Indigenous Data Sovereignty is the right of Indigenous peoples to own and govern data about their communities, resources, and lands — and that open-access publishing policies must accommodate, not override, that right through mechanisms such as data-access statements that explain restrictions rather than force disclosure. The Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) Code of Ethics for Aboriginal and Torres Strait Islander Research similarly requires researcher agreements on data ownership, access, and storage to be negotiated with communities before collection begins, not retrofitted at publication.

  • Where mandates and sovereignty align: both frameworks require documented data-management plans, clear provenance, and persistent identifiers.
  • Where friction emerges: “open by default” clauses that treat non-disclosure as an exception requiring justification, rather than a governance decision requiring respect.
  • The resolvable middle: metadata and access statements can be fully open even when the underlying dataset is access-controlled.

A consent layer is a set of governance and technical controls — inserted between data creation and data reuse — that lets a community set the terms under which its data is discovered, accessed, and re-used, without removing that data from the research record entirely. In practice this combines four elements research administrators already have tools for:

  1. Tiered metadata: a public, FAIR-compliant record (title, abstract, provenance, persistent identifier via DataCite or Crossref) that is fully findable even when the dataset itself is restricted.
  2. Governance-holder sign-off: a named Indigenous governance body (tribal council, iwi authority, data sovereignty collective) with authority to approve, condition, or decline each reuse request — not a one-time blanket consent captured at initial collection.
  3. A trusted research environment (TRE): a controlled-access computing environment where approved researchers can analyse restricted data without exporting raw records, satisfying reusability without unconditional distribution.
  4. Biocultural or Traditional Knowledge labels: machine-readable metadata tags (the Local Contexts initiative’s TK and BC Labels) that travel with a dataset to signal provenance, cultural protocols, and permitted uses wherever it is indexed or mirrored.

None of these four elements blocks findability. They condition access, which is exactly what FAIR’s accessibility principle already permits.
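The four elements above can be sketched as a small data model. This is a hedged illustration only: `GovernedDataset`, `ReuseRequest`, `Decision`, the method names, and the label and council strings are hypothetical and do not come from any real repository, TRE product, or the Local Contexts tooling.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Decision(Enum):
    APPROVED = "approved"
    APPROVED_WITH_CONDITIONS = "approved with conditions"
    DECLINED = "declined"


@dataclass
class ReuseRequest:
    requester: str
    purpose: str
    decision: Optional[Decision] = None  # stays None until the governance body rules
    conditions: List[str] = field(default_factory=list)


@dataclass
class GovernedDataset:
    public_metadata: dict  # element 1: the always-findable record
    governance_body: str   # element 2: who decides each request
    labels: List[str]      # element 4: labels that travel with the record
    requests: List[ReuseRequest] = field(default_factory=list)

    def request_reuse(self, requester: str, purpose: str) -> ReuseRequest:
        """Log a new request. Nothing is granted at this point."""
        request = ReuseRequest(requester, purpose)
        self.requests.append(request)
        return request

    def record_decision(self, request, decided_by, decision, conditions=()):
        """Only the named governance body may approve, condition, or decline."""
        if decided_by != self.governance_body:
            raise PermissionError("Decision must come from the governance body")
        request.decision = decision
        request.conditions = list(conditions)

    def may_analyse_in_tre(self, request: ReuseRequest) -> bool:
        """Element 3: approved requests get analysis access, not raw export."""
        return request.decision in (
            Decision.APPROVED,
            Decision.APPROVED_WITH_CONDITIONS,
        )

    def export_record(self) -> dict:
        """What an index or mirror receives: metadata plus labels, no content."""
        return {**self.public_metadata, "labels": list(self.labels)}


dataset = GovernedDataset(
    public_metadata={"title": "Example dataset", "identifier": "example-id-0002"},
    governance_body="Example Governance Council",
    labels=["example-label"],
)
request = dataset.request_reuse("Researcher A", "water quality trend analysis")
print(dataset.may_analyse_in_tre(request))  # False: no decision recorded yet
dataset.record_decision(
    request,
    decided_by="Example Governance Council",
    decision=Decision.APPROVED_WITH_CONDITIONS,
    conditions=["results reviewed by the council before publication"],
)
print(dataset.may_analyse_in_tre(request))  # True
```

The design choice worth noting is that a decision attaches to each `ReuseRequest` rather than to the dataset as a whole, which mirrors the point that consent is ongoing rather than a one-time blanket approval.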

Data sharing agreement vs data processing agreement — which applies?

A data sharing agreement (DSA) and a data processing agreement (DPA) serve different legal functions, and conflating them is a common source of failure in Indigenous data governance. A DSA governs the transfer of data between two parties who each have independent authority over how it is subsequently used — the correct instrument for Indigenous data sovereignty, because it lets the originating community retain and exercise ongoing authority to control, per CARE’s second principle.

A DPA, by contrast, is used when one party (a processor) handles data strictly on behalf of another (the controller) with no independent decision-making rights — the model built into contract templates under UK GDPR. Using a DPA where a DSA is required strips the originating community of ongoing authority.

| Instrument | Who holds decision authority | Fit for Indigenous data sovereignty |
| --- | --- | --- |
| Data Sharing Agreement (DSA) | Both parties, independently | Appropriate: preserves community authority to control |
| Data Processing Agreement (DPA) | Controller only; processor has none | Inappropriate as a standalone instrument: reduces community to data subject |

Implications for research administrators

Research data management (RDM) policy templates written purely around funder compliance checklists will systematically under-serve Indigenous data governance unless they build in a consent layer as a standard clause, not an exception process. Institutions should require, at the data-management-plan stage, an explicit question: does this dataset describe an Indigenous community, and if so, has a governance body with authority to control been identified and consulted before collection?
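One way to make that question a standard check rather than an exception process is to validate it when the plan is submitted. The sketch below assumes a simplified plan record; `DataManagementPlan`, its fields, and `consent_layer_issues` are hypothetical names, not part of any funder template or DMP tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataManagementPlan:
    dataset_title: str
    describes_indigenous_community: bool
    governance_body: Optional[str] = None      # body with authority to control
    consulted_before_collection: bool = False  # consultation precedes collection


def consent_layer_issues(plan: DataManagementPlan) -> List[str]:
    """Return the unresolved governance questions for a plan (empty if none)."""
    issues: List[str] = []
    if not plan.describes_indigenous_community:
        return issues  # the consent-layer question does not apply
    if not plan.governance_body:
        issues.append("No governance body with authority to control is identified.")
    if not plan.consulted_before_collection:
        issues.append("The governance body has not been consulted before collection.")
    return issues


plan = DataManagementPlan(
    dataset_title="Example community health survey",
    describes_indigenous_community=True,
)
for issue in consent_layer_issues(plan):
    print(issue)
```

A plan that names a governance body and records prior consultation returns an empty list, so the check adds no friction where the governance work has already been done.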

Research data repositories that host Indigenous-derived datasets should support tiered access controls and TK/BC Label metadata natively, rather than treating restricted access as a bespoke workaround bolted onto an open-by-default platform. Institutions building or procuring a trusted research environment for sensitive data should evaluate whether it can enforce community-set reuse conditions per dataset, not merely per project.

Conclusion: consent is compatible with findability

Indigenous data sovereignty and the FAIR Data Principles are not opposed frameworks competing for the same ground — FAIR governs how data is described and discovered, while CARE and a CARE-informed consent layer govern who decides what happens next. A research data management policy that hard-codes this distinction, uses the right agreement type for the right relationship, and gives Indigenous governance bodies a standing role rather than a one-off consultation, satisfies funder open-data requirements and Indigenous data sovereignty at the same time. The two are compatible by design; the mandates just need to stop assuming otherwise.
