  • Research Data Management Policy: €10.2bn Case

    A research data management policy that treats FAIR compliance as a line-item cost, rather than a reuse and reputation asset, rests on the wrong accounting model. PwC estimated in a 2018 study for the European Commission that the absence of FAIR (Findable, Accessible, Interoperable, Reusable) research data costs the European economy at least €10.2 billion a year, largely through duplicated data collection and wasted researcher time. That figure is the strongest evidence available that under-investment in research data management (RDM) infrastructure is a false economy, not a saving.

    A research data management policy is an institutional document setting out the responsibilities of researchers and the institution for planning, storing, securing, sharing and preserving research data across its lifecycle. Most UK universities — Southampton, Birmingham, Manchester, Edinburgh and others — already publish one. The argument here is narrower and more contentious: most are drafted, funded and governed as compliance paperwork, when the evidence says they should be funded as reuse and reputation infrastructure.

    Why RDM policy gets treated as a cost centre

    Institutional budgets typically classify research data management as overhead: storage costs, repository subscriptions, a data steward’s salary, training time. Each appears as a debit with no offsetting credit line, because savings from avoided duplication and faster reuse accrue diffusely, across future researchers and grants, not to the budget holder who paid for the infrastructure.

    This accounting mismatch is compounded by how the data management plan (DMP) requirement is handled in practice. Most funders now mandate one, but research offices frequently treat it as a box-ticking exercise completed at proposal stage and never revisited, rather than a live operational document. That framing under-serves the researcher, who gets no practical reuse benefit, and the institution, which under-recovers the true cost of good RDM from grants that would pay for it.

    UK Research and Innovation (UKRI) explicitly states that costs associated with research data management — storage, curation, repository deposit — are eligible for recovery under its funding. Institutions treating RDM as unfunded overhead are frequently leaving recoverable grant money unclaimed rather than avoiding a cost.

    What the evidence actually says about FAIR and avoided cost

    The FAIR data principles were formalised in 2016 by Wilkinson et al. in Scientific Data as a guide for making digital assets Findable, Accessible, Interoperable and Reusable by both humans and machines. FAIR data is not a compliance checkbox; it is a design standard for making data usable by someone who was not present when it was collected.

    The clearest attributed cost estimate comes from PwC’s 2018 cost-benefit analysis for the European Commission, which put the annual cost of non-FAIR research data to the European economy at a minimum of €10.2 billion, driven by researcher time lost searching for data, recreation of data that already exists, and lost interdisciplinary reuse. A separate, frequently cited illustration is a University of Minnesota diet study whose original data sat in storage for decades and nearly disappeared before being recovered and reanalysed: a reminder that data loss is a recurring, avoidable event when retention and documentation are afterthoughts.

    Three mechanisms explain where the savings actually come from:

    • Avoided duplication. Findable, well-described data lets a second researcher build on an existing dataset instead of re-running a costly collection exercise.
    • Faster reuse cycles. Interoperable data in standard formats with persistent identifiers can be integrated into new analyses without reformatting or re-negotiating access.
    • Preserved institutional memory. Deposit in a certified repository protects data against the most common loss vectors: staff turnover and undocumented local storage.

    None of this shows up as a saving on a university’s annual accounts, which is precisely why RDM investment is chronically under-prioritised relative to its documented return.

    How funder compliance requirements are changing the calculus

    Funder mandates are steadily converting FAIR data from voluntary good practice into a hard compliance gate, which changes the institutional risk calculus even for leaders unconvinced by the reuse argument. UKRI’s Common Principles on Research Data, and the underlying Concordat on Open Research Data, require a data management plan for funded research and state that data should be made openly available with as few restrictions as necessary. Horizon Europe applies comparable requirements, and cOAlition S’s Plan S pushes the same expectations into journal-level open-access policy.

    A comparison of how three major funders and funder coalitions state the requirement illustrates the convergence:

    Funder / framework | Core RDM requirement | FAIR reference
    UKRI | Data management plan for funded research; RDM costs eligible for recovery | Endorses FAIR via the Concordat on Open Research Data
    Horizon Europe | DMP required within six months of project start, updated across lifecycle | “As open as possible, as closed as necessary,” explicitly FAIR-aligned
    cOAlition S (Plan S) | Underlying data should accompany open-access publications | References FAIR principles for supporting data

    Institutions that fund RDM only to the minimum needed for a single grant’s DMP template are exposed twice: to duplicated administrative cost when infrastructure is rebuilt project by project, and to compliance risk as funders move toward auditing DMP adherence rather than merely requiring its submission.
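
    The six-month Horizon Europe deadline in the comparison above is the kind of rule a research office can track mechanically rather than by memory. The following is a minimal sketch, not a funder-endorsed tool: the function names are hypothetical, and it assumes "six months" means six calendar months from the project start date, which should be checked against the actual grant agreement.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 31 August plus six months lands on the last day of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def dmp_status(project_start: date,
               dmp_submitted: Optional[date],
               today: date,
               months_allowed: int = 6) -> str:
    """Classify a project's first DMP as 'submitted', 'due' or 'overdue'.

    Assumption: any recorded submission date counts as 'submitted';
    this sketch does not check whether the submission itself was late.
    """
    if dmp_submitted is not None:
        return "submitted"
    deadline = add_months(project_start, months_allowed)
    return "overdue" if today > deadline else "due"
```

    Run across a list of active grants, a check like this turns the DMP from a proposal-stage form into something the institution revisits on a schedule, which is the shift the funders are moving toward.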

    The case for investing in data stewardship, not just policy text

    A policy document alone does not create FAIR data. That requires people: a data steward function — a dedicated role, a network of disciplinary data champions, or a research data service embedded in the library — able to advise researchers on repository choice, metadata standards and licensing at the point where those decisions are actually made, not after the fact.

    Institutions that fund this role tend to route researchers toward standards-based infrastructure rather than ad hoc local storage: a research data repository registered in re3data.org, ideally holding CoreTrustSeal certification, with persistent identifiers (DOIs) and standard metadata attached to every deposit. This is the practical, unglamorous mechanism by which the cost behind the €10.2 billion estimate above is actually avoided: not through a policy PDF, but through a person and a repository that make FAIR operational.
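
    What "a persistent identifier and standard metadata attached to every deposit" means in practice can be illustrated with a small completeness check of the sort a data steward might run before accepting a deposit. This is a sketch under stated assumptions: the field names are illustrative and only loosely modelled on the mandatory properties of the DataCite Metadata Schema, the licence requirement is an added local assumption, and the DOI test is syntactic only.

```python
import re
from typing import Dict, List

# Illustrative field names, loosely modelled on DataCite's mandatory properties.
# "licence" is an extra assumption here, not a DataCite-mandatory property.
REQUIRED_FIELDS = (
    "doi", "creators", "title", "publisher",
    "publication_year", "resource_type", "licence",
)

# Syntactic check only: it does not confirm the DOI is registered or resolves.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def deposit_gaps(record: Dict[str, object]) -> List[str]:
    """Return human-readable problems with a deposit record (empty if none)."""
    gaps = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    doi = record.get("doi")
    if doi and not DOI_PATTERN.match(str(doi)):
        gaps.append("doi is not syntactically valid")
    return gaps


# Hypothetical record for illustration; the DOI is a made-up example value.
example = {
    "doi": "10.1234/example.5678",
    "creators": ["A. Researcher"],
    "title": "Survey responses, wave 1",
    "publisher": "Example University Research Data Repository",
    "publication_year": 2024,
    "resource_type": "Dataset",
    "licence": "CC-BY-4.0",
}
print(deposit_gaps(example))  # prints []
```

    The point of the sketch is the timing: gaps are cheapest to fix at deposit, while the person who collected the data is still available to fill them.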

    CASRAI’s relevance here is provenance and interoperability, not ownership. CASRAI originated the CRediT contributor role taxonomy in 2014, now stewarded by NISO as ANSI/NISO Z39.104-2022. CRediT makes the same underlying argument in a different domain: standardising who-did-what reduces duplicated verification effort just as standardising data description reduces duplicated data collection. Institutions weighing their research administration infrastructure should treat RDM policy, contributor attribution and open data reuse as one reputational and efficiency system, not separate obligations.

    Frequently asked questions

    What is a research data management policy?

    A research data management policy is an institutional document defining responsibilities for planning, storing, securing, sharing, and archiving research data across its lifecycle. UK universities including Edinburgh and Manchester publish theirs publicly, typically requiring a data management plan at proposal stage and deposit in an approved repository after project completion.

    What are the FAIR data principles?

    The FAIR data principles — Findable, Accessible, Interoperable, Reusable — were published by Wilkinson et al. in 2016 in Scientific Data as guidance for making digital research assets usable by both humans and machines, through persistent identifiers, standard metadata, and clear licensing.

    Do UK and EU funders require a data management plan?

    Yes. UKRI requires a data management plan for funded research and treats RDM costs as eligible for recovery, while Horizon Europe requires a DMP within six months of project start under its “as open as possible, as closed as necessary” principle.

    How much does poor research data management actually cost?

    PwC’s 2018 analysis for the European Commission put the annual cost of non-FAIR research data to the European economy at €10.2 billion, driven primarily by duplicated data collection and researcher time lost searching for data that already exists elsewhere.

    Implications for institutional leaders

    The practical implication is a reframing exercise, not necessarily a large new budget line. Research offices should cost RDM infrastructure — repositories, data steward time, metadata training — against the funder-eligible recovery already available through DMP-linked grants, rather than absorbing it as unfunded overhead. Leaders reviewing their research data management policy should ask whether it funds a data steward with real authority over repository choice and metadata quality, or whether it is a document that satisfies a compliance checklist and stops there.
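
    The costing exercise described above is simple arithmetic once the inputs are gathered. The sketch below is one way to lay it out, assuming an institution can separate its annual RDM spend into the three categories named here and knows how much was actually charged to grants. The class name, field names and all figures are hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RdmBudget:
    repository_cost: float        # annual repository and storage spend
    steward_cost: float           # annual data steward staff cost
    training_cost: float          # annual metadata and RDM training spend
    recovered_from_grants: float  # RDM costs actually charged to grants

    def gross_cost(self) -> float:
        """Total annual RDM spend before any grant recovery."""
        return self.repository_cost + self.steward_cost + self.training_cost

    def net_overhead(self) -> float:
        """Cost left as unfunded overhead after recovery (never below zero)."""
        return max(self.gross_cost() - self.recovered_from_grants, 0.0)

    def recovery_rate(self) -> float:
        """Share of gross RDM cost recovered from grants, capped at 1.0."""
        gross = self.gross_cost()
        if gross == 0:
            return 0.0
        return min(self.recovered_from_grants / gross, 1.0)


# Hypothetical figures, not drawn from any real institution.
budget = RdmBudget(
    repository_cost=40_000,
    steward_cost=55_000,
    training_cost=5_000,
    recovered_from_grants=25_000,
)
print(budget.gross_cost())     # prints 100000
print(budget.net_overhead())   # prints 75000
print(budget.recovery_rate())  # prints 0.25
```

    A low recovery rate in a model like this does not mean RDM is expensive; it may mean eligible costs are not being written into grant applications.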

    The evidence (a €10.2 billion EU-wide cost estimate, UKRI’s funding eligibility for RDM costs, and Horizon Europe’s escalating DMP requirements) points in one direction: institutions that keep treating FAIR compliance as a cost centre are choosing to keep paying the duplication tax FAIR data was designed to eliminate.