Tag: cihr data management plan

  • Research Data Management Policy: €10.2bn Case

    A research data management policy that treats FAIR compliance as a line-item cost, rather than a reuse and reputation asset, rests on the wrong accounting model. PwC estimated in a 2018 study for the European Commission that the absence of FAIR (Findable, Accessible, Interoperable, Reusable) research data costs the European economy at least €10.2 billion a year, largely through duplicated data collection and wasted researcher time. That figure is the strongest evidence cited here that under-investment in research data management (RDM) infrastructure is a false economy, not a saving.

    A research data management policy is an institutional document setting out the responsibilities of researchers and the institution for planning, storing, securing, sharing and preserving research data across its lifecycle. Most UK universities — Southampton, Birmingham, Manchester, Edinburgh and others — already publish one. The argument here is narrower and more contentious: most are drafted, funded and governed as compliance paperwork, when the evidence says they should be funded as reuse and reputation infrastructure.

    Why RDM policy gets treated as a cost centre

    Institutional budgets typically classify research data management as overhead: storage costs, repository subscriptions, a data steward’s salary, training time. Each appears as a debit with no offsetting credit line, because savings from avoided duplication and faster reuse accrue diffusely, across future researchers and grants, not to the budget holder who paid for the infrastructure.

    This accounting mismatch is compounded by how the data management plan (DMP) requirement is handled in practice. Most funders now mandate one, but research offices frequently treat it as a box-ticking exercise completed at proposal stage and never revisited, rather than a live operational document. That framing under-serves the researcher, who gets no practical reuse benefit, and the institution, which under-recovers the true cost of good RDM from grants that would pay for it.

    UK Research and Innovation (UKRI) explicitly states that costs associated with research data management — storage, curation, repository deposit — are eligible for recovery under its funding. Institutions treating RDM as unfunded overhead are frequently leaving recoverable grant money unclaimed rather than avoiding a cost.

    What the evidence actually says about FAIR and avoided cost

    The FAIR data principles were formalised in 2016 by Wilkinson et al. in Scientific Data as a guide for making digital assets Findable, Accessible, Interoperable and Reusable by both humans and machines. FAIR data is not a compliance checkbox; it is a design standard for making data usable by someone who was not present when it was collected.

    The clearest attributed cost estimate comes from PwC’s 2018 cost-benefit analysis for the European Commission, which put the annual cost of non-FAIR research data to the European economy at €10.2 billion, driven by researcher time lost searching for data, recreation of data that already exists, and lost interdisciplinary reuse. A separate, frequently cited illustration is the University of Minnesota’s decades-old diet study, whose original data nearly disappeared into storage before being recovered and reanalysed: a reminder that data loss is a recurring, avoidable event when retention and documentation are afterthoughts.

    Three mechanisms explain where the savings actually come from:

    • Avoided duplication. Findable, well-described data lets a second researcher build on an existing dataset instead of re-running a costly collection exercise.
    • Faster reuse cycles. Interoperable data in standard formats with persistent identifiers can be integrated into new analyses without reformatting or re-negotiating access.
    • Preserved institutional memory. Deposit in a certified repository protects data against two of the most common loss vectors: staff turnover and undocumented local storage.

    None of this shows up as a saving on a university’s annual accounts, which is precisely why RDM investment is chronically under-prioritised relative to its documented return.

    How funder compliance requirements are changing the calculus

    Funder mandates are steadily converting FAIR data from voluntary good practice into a hard compliance gate, which changes the institutional risk calculus even for leaders unconvinced by the reuse argument. UKRI’s Common Principles on Research Data, and the underlying Concordat on Open Research Data, require a data management plan for funded research and state that data should be made openly available with as few restrictions as necessary. Horizon Europe applies comparable requirements, and cOAlition S’s Plan S pushes the same expectations into journal-level open-access policy.

    A comparison of how three major funders frame the requirement illustrates the convergence:

    | Funder / framework | Core RDM requirement | FAIR reference |
    | --- | --- | --- |
    | UKRI | Data management plan for funded research; RDM costs eligible for recovery | Endorses FAIR via the Concordat on Open Research Data |
    | Horizon Europe | DMP required within six months of project start, updated across lifecycle | “As open as possible, as closed as necessary,” explicitly FAIR-aligned |
    | cOAlition S (Plan S) | Underlying data should accompany open-access publications | References FAIR principles for supporting data |

    Institutions that fund RDM only to the minimum needed for a single grant’s DMP template are exposed twice: to duplicated administrative cost when infrastructure is rebuilt project by project, and to compliance risk as funders move toward auditing DMP adherence rather than merely requiring its submission.

    The case for investing in data stewardship, not just policy text

    A policy document alone does not create FAIR data. That requires people: a data steward function — a dedicated role, a network of disciplinary data champions, or a research data service embedded in the library — able to advise researchers on repository choice, metadata standards and licensing at the point where those decisions are actually made, not after the fact.

    Institutions that fund this role tend to route researchers toward standards-based infrastructure rather than ad hoc local storage: a research data repository registered in re3data.org, ideally holding CoreTrustSeal certification, with persistent identifiers (DOIs) and standard metadata attached to every deposit. This is the practical, unglamorous mechanism by which the cost behind the €10.2 billion estimate above is actually avoided: not through a policy PDF, but through a person and a repository that make FAIR operational.

    CASRAI’s relevance here is provenance and interoperability, not ownership. CASRAI originated the CRediT contributor role taxonomy in 2014, now stewarded by NISO as ANSI/NISO Z39.104-2022 — the same underlying argument in a different domain: standardising who-did-what reduces duplicated verification effort just as standardising data description reduces duplicated data collection. Institutions weighing their research administration infrastructure should treat RDM policy, contributor attribution and open data reuse as one reputational and efficiency system, not separate obligations.

    Common questions on research data management policy

    What is a research data management policy?

    A research data management policy is an institutional document defining responsibilities for planning, storing, securing, sharing, and archiving research data across its lifecycle. UK universities including Edinburgh and Manchester publish theirs publicly, typically requiring a data management plan at proposal stage and deposit in an approved repository after project completion.

    What are the FAIR data principles?

    The FAIR data principles — Findable, Accessible, Interoperable, Reusable — were published by Wilkinson et al. in 2016 in Scientific Data as guidance for making digital research assets usable by both humans and machines, through persistent identifiers, standard metadata, and clear licensing.

    Do UK and EU funders require a data management plan?

    Yes. UKRI requires a data management plan for funded research and treats RDM costs as eligible for recovery, while Horizon Europe requires a DMP within six months of project start under its “as open as possible, as closed as necessary” principle.

    How much does poor research data management actually cost?

    PwC’s 2018 analysis for the European Commission put the annual cost of non-FAIR research data to the European economy at €10.2 billion, driven primarily by duplicated data collection and researcher time lost searching for data that already exists elsewhere.

    Implications for institutional leaders

    The practical implication is a reframing exercise, not necessarily a large new budget line. Research offices should cost RDM infrastructure — repositories, data steward time, metadata training — against the funder-eligible recovery already available through DMP-linked grants, rather than absorbing it as unfunded overhead. Leaders reviewing their research data management policy should ask whether it funds a data steward with real authority over repository choice and metadata quality, or whether it is a document that satisfies a compliance checklist and stops there.

    The evidence (a €10.2 billion EU-wide cost estimate, UKRI’s funding eligibility for RDM costs, and Horizon Europe’s escalating DMP requirements) points in one direction: institutions that keep treating FAIR compliance as a cost centre are choosing to keep paying the duplication tax FAIR data was designed to eliminate.

  • National Data Repository Mandates: UK, US, EU

    National data repository requirements now differ sharply by jurisdiction: the UK coordinates through UKRI’s Concordat on Open Research Data and a planned National Data Library, the US relies on agency-specific mandates such as the NIH Data Management and Sharing Policy layered on the OPEN Government Data Act, and the EU binds Horizon Europe funding to mandatory FAIR data management plans routed through the European Open Science Cloud. All three converge on the FAIR principles as the technical baseline, but they diverge on enforcement, centralisation and what “as open as possible” means in practice.

    A national data repository is a government- or funder-endorsed infrastructure (or federated network of infrastructures) for depositing, curating and providing persistent access to datasets produced by publicly funded research, so that they meet the FAIR standard of being Findable, Accessible, Interoperable and Reusable. No single global rulebook defines what such a repository must look like — which is precisely why the UK, US and EU have built three structurally different systems around the same FAIR foundation.

    What counts as a national data repository?

    A national data repository is infrastructure, endorsed at government or funder level, that stores research datasets with persistent identifiers, standardised metadata and defined reuse licences. The FAIR data principles — first formalised in Scientific Data in 2016 — define the technical bar: data and metadata must be findable via persistent identifiers, accessible over open protocols, interoperable through shared vocabularies, and reusable under clear provenance and licensing.

    Crucially, FAIR does not mean unconditionally open. The dominant policy language across all three jurisdictions is some variant of “as open as possible, as closed as necessary” — datasets with legitimate privacy, security or intellectual-property constraints can remain FAIR while access to the raw data itself stays restricted, provided the metadata is still discoverable.
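
    The distinction between FAIR and open can be made concrete with a minimal sketch. The record structure, field names and checks below are illustrative assumptions, not a schema defined by any funder or repository; the identifier, URL and vocabulary values are placeholders. The point is only that the four FAIR checks look at the metadata, while the access flag is deliberately left out of them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: Optional[str]    # persistent identifier, e.g. a DOI
    metadata_url: Optional[str]  # where the description can be retrieved
    vocabulary: Optional[str]    # shared vocabulary or schema in use
    licence: Optional[str]       # reuse licence or access terms
    provenance: Optional[str]    # who produced the data and how
    access: str = "open"         # "open" or "restricted"; not a FAIR check


def fair_gaps(record: DatasetRecord) -> list[str]:
    """Return the FAIR criteria the metadata fails, as plain-text notes."""
    gaps = []
    if not record.identifier:
        gaps.append("findable: no persistent identifier")
    if not (record.metadata_url or "").startswith(("http://", "https://")):
        gaps.append("accessible: metadata not retrievable over an open protocol")
    if not record.vocabulary:
        gaps.append("interoperable: no shared vocabulary declared")
    if not (record.licence and record.provenance):
        gaps.append("reusable: licence or provenance missing")
    return gaps


# Placeholder values: a restricted dataset whose metadata is still complete.
restricted = DatasetRecord(
    identifier="10.1234/placeholder",
    metadata_url="https://repository.example/records/1",
    vocabulary="example-vocabulary",
    licence="End User Licence",
    provenance="Survey team, documented collection protocol",
    access="restricted",
)
print(fair_gaps(restricted))  # []
```

    Under these assumptions a restricted dataset returns no gaps, whereas an openly downloadable file with no identifier or licence would fail two of the four checks.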

    How does the UK mandate research data repositories?

    The UK’s approach is coordinated centrally through UK Research and Innovation (UKRI) rather than fragmented across individual funders. The Concordat on Open Research Data, agreed by UK funders and sector bodies, sets the expectation that publicly funded research data should be made openly available with as few restrictions as possible, in a timely and responsible manner.

    UKRI has been developing a harmonised open research data policy to replace the varying requirements previously set by its individual research councils, with a more explicit alignment to FAIR principles than the original Concordat text. The UK does not run one single mandatory repository for all disciplines; instead it combines a cross-disciplinary resource — the UK Data Service, holding the country’s largest collection of economic, population and social research data — with discipline-specific data centres. A National Data Library initiative is also under development. Enforcement runs through grant conditions rather than statute.

    How does the US enforce data-sharing requirements?

    The US combines a government-wide legal baseline with agency-specific enforcement, producing a federated rather than centralised system. The OPEN Government Data Act codifies the principle that federal government data — including federally funded research outputs captured by agencies — should be open and machine-readable by default, operationalised through the Data.gov catalogue.

    The sharpest enforcement sits with individual funding agencies. Under the NIH Data Management and Sharing (DMS) Policy, effective since January 2023, NIH-funded researchers must submit a DMS Plan describing how scientific data will be managed and shared, with FAIR principles strongly encouraged. The National Science Foundation requires a Data Management Plan for all proposals and supports deposit through disciplinary repositories and its own NSF Public Access Repository (NSF-PAR). This gives communities flexibility to choose repositories that fit their discipline, at the cost of having no single unified national research-data repository.

    How does the EU mandate FAIR data through Horizon Europe?

    The EU operates the most centrally binding framework of the three. The Directive on open data and the re-use of public sector information requires member states to establish national policies for open access to publicly funded research data on an “open by default” basis, explicitly aligned with FAIR principles. For research funded under Horizon Europe, making data FAIR is a mandatory grant condition, not a recommendation: funded projects must produce a Data Management Plan and comply with FAIR requirements as a condition of the award, under the same “as open as possible, as closed as necessary” test used elsewhere.

    Infrastructure is built around the European Open Science Cloud (EOSC), described by the European Commission as a federated environment intended to become a “web of FAIR data and services” spanning all scientific disciplines. Within that federation, researchers commonly deposit through the general-purpose repository Zenodo — built and operated with CERN — while the Community Research and Development Information Service (CORDIS) serves as the EU’s public repository of record for funded project information.

    Where do the three approaches converge and diverge?

    All three jurisdictions treat FAIR as the technical baseline and all three qualify openness with a “necessary restriction” clause. The differences lie in enforcement mechanism, degree of centralisation, and whether a single flagship repository exists.

    | Feature | UK | US | EU |
    | --- | --- | --- | --- |
    | Primary instrument | UKRI Concordat on Open Research Data (evolving to a harmonised FAIR-explicit policy) | OPEN Government Data Act; NIH DMS Policy; NSF Public Access Policy | EU Open Data Directive; Horizon Europe grant conditions |
    | Legal basis | Funder policy condition | Federal statute plus agency policy | Legally binding directive plus grant condition |
    | FAIR status | Increasingly explicit in new UKRI policy | Encouraged, embedded in agency plans | Mandatory for Horizon Europe-funded projects |
    | Data management plan required | Yes, for UKRI funding | Yes, for NIH and NSF funding | Yes, mandatory for Horizon Europe |
    | Repository model | Centralised flagship (UK Data Service) plus disciplinary centres | Federated (Data.gov, NSF-PAR, disciplinary repositories) | Federated supranational (EOSC, Zenodo, CORDIS) |

    Common questions on national data repository mandates

    What are the FAIR data principles required by UKRI?

    UKRI requires funded researchers to make outputs Findable, Accessible, Interoperable and Reusable, aligned with its Concordat on Open Research Data. UKRI councils frame this as maximising the impact, visibility and citation of research while applying the “as open as possible, as restricted as necessary” test to data with legitimate sensitivities.

    Does the NIH require a data management and sharing plan?

    Yes. Since 25 January 2023, the NIH Data Management and Sharing (DMS) Policy requires funded researchers to submit a DMS Plan describing how scientific data will be preserved and shared. NIH strongly encourages applying FAIR principles when selecting repositories and structuring metadata for that plan.

    Is FAIR data mandatory under Horizon Europe?

    Yes. Unlike the UK’s evolving policy and the US’s encouraged-but-agency-specific approach, Horizon Europe makes FAIR data management a binding grant condition. Funded projects must submit a Data Management Plan and comply with FAIR requirements, subject to the same necessary-restriction exceptions used across all three jurisdictions.

    Is there one single national data repository researchers must use?

    No jurisdiction mandates a single universal repository. The UK combines a flagship service (UK Data Service) with disciplinary centres; the US runs a federated system across Data.gov and agency repositories such as NSF-PAR; the EU federates access through EOSC, Zenodo and CORDIS. Researchers typically choose the repository matching their discipline and funder requirements.

    What this means for institutions and researchers

    For research administrators managing multi-jurisdictional funding, a single data management plan template cannot satisfy all three regimes. Compliance teams must map deposit requirements per funder rather than assume FAIR-labelled data automatically meets every mandate’s specific repository, licensing and metadata conditions.
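
    One way a compliance team might hold that per-funder mapping is a simple lookup table. The sketch below transcribes the values from the comparison table in this article; the dictionary layout, the function names and the strictness ranking are assumptions made for illustration, not funder terminology.

```python
# Values transcribed from the comparison table in this article. The layout
# and the ranking of FAIR statuses are illustrative assumptions.
FUNDER_REQUIREMENTS = {
    "UKRI": {
        "jurisdiction": "UK",
        "legal_basis": "funder policy condition",
        "fair_status": "increasingly explicit",
        "dmp_required": True,
    },
    "NIH": {
        "jurisdiction": "US",
        "legal_basis": "federal statute plus agency policy",
        "fair_status": "encouraged",
        "dmp_required": True,
    },
    "NSF": {
        "jurisdiction": "US",
        "legal_basis": "federal statute plus agency policy",
        "fair_status": "encouraged",
        "dmp_required": True,
    },
    "Horizon Europe": {
        "jurisdiction": "EU",
        "legal_basis": "legally binding directive plus grant condition",
        "fair_status": "mandatory",
        "dmp_required": True,
    },
}

# Assumed ordering from weakest to strongest FAIR expectation.
FAIR_RANK = {"encouraged": 0, "increasingly explicit": 1, "mandatory": 2}


def requirements_for(funders: list[str]) -> dict[str, dict]:
    """Look up each funder, failing loudly on any that are not recorded."""
    missing = [name for name in funders if name not in FUNDER_REQUIREMENTS]
    if missing:
        raise KeyError(f"No requirements recorded for: {', '.join(missing)}")
    return {name: FUNDER_REQUIREMENTS[name] for name in funders}


def strictest_fair_status(funders: list[str]) -> str:
    """Return the strongest FAIR expectation among a project's funders."""
    statuses = [req["fair_status"] for req in requirements_for(funders).values()]
    return max(statuses, key=FAIR_RANK.__getitem__)


print(strictest_fair_status(["NIH", "Horizon Europe"]))  # mandatory
```

    A table like this only records the headline position; the specific repository, licensing and metadata conditions still need checking against each funder’s own text.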

    The trend line points toward convergence. The UK’s move to a harmonised, more explicitly FAIR-aligned UKRI policy and the EU’s EOSC federation both signal a shift from fragmented rules toward unified infrastructure. The US remains the outlier: its federal open-data statute operates largely independently of agency-specific mandates from NIH and NSF.

    Institutions should treat “FAIR” and “open” as related but distinct compliance targets. A dataset can be fully FAIR — persistently identified, well-described, licensed — while remaining access-restricted for legitimate reasons in every jurisdiction covered here. Repository choice and data management plan content should be checked against the specific funder mandate, not a generic FAIR checklist.

  • Research Data Manager Job Description, Skills and Career Path

    A research data manager plans, organises and safeguards the data a research project produces — from collection through documentation, storage, sharing and long-term archiving — and is distinct from a data steward (governance-focused) or a research administrator (grants and compliance-focused). The role sits at the intersection of research support, information management and IT, typically inside a university’s library, research office or a funded project team.

    This guide sets out the research data manager job description, the skills and qualifications employers ask for, how the role differs from adjacent titles, and the realistic career path from entry-level data support through to strategic data leadership.

    What is a research data manager?

    A research data manager is the named individual responsible for a project’s or department’s data management plan, metadata standards and repository deposits. The role exists because funders increasingly require a documented, reusable dataset alongside every publication, not just the paper itself.

    The task is not new: it maps closely to the Data Curation contributor role in the CRediT taxonomy, defined as “Management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later re-use.” CASRAI originated the CRediT contributor role taxonomy in 2014; the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022, and Data Curation remains one of its 14 defined roles, evidence that the function research data managers perform has been formally recognised in scholarly attribution for over a decade.

    What does a research data manager do day to day?

    Day-to-day work centres on making a project’s data findable, well-documented and safely stored, and on making that process repeatable for the next study. Typical duties, drawn from published UK university and NHS job descriptions, include:

    • Drafting and reviewing data management plans (DMPs) for grant applications
    • Setting up and maintaining databases, spreadsheets and case report forms for a study
    • Applying metadata standards so datasets are discoverable in institutional or subject repositories
    • Coordinating deposit of datasets with DataCite-registered DOIs for citation and reuse
    • Running data quality checks, version control and access permissions across a research team
    • Training researchers and doctoral students in good data management practice
    • Advising on compliance with funder data policies and data protection legislation

    Research data manager vs data steward vs research administrator

    These three titles are frequently confused in job adverts because responsibilities overlap, but their primary focus and reporting line differ. The table below distinguishes the three roles as they typically appear in UK higher education and research institutions.

    | Dimension | Research Data Manager | Data Steward | Research Administrator |
    | --- | --- | --- | --- |
    | Primary focus | Lifecycle management of a specific project’s or department’s datasets | Institution-wide data governance, quality rules and ownership policy | Grant administration, compliance and researcher support |
    | Typical base | Research office, library or funded project team | IT services, information governance or central data office | Research office, faculty or funder-facing team |
    | Core output | Data management plans, metadata, repository deposits | Data policies, classification schemes, access controls | Grant applications, contracts, financial and ethics reporting |
    | Professional body | Often affiliated with library/data-curation networks | Information governance and data protection networks | ARMA (UK/Ireland), EARMA, INORMS, NCURA |
    | Typical entry route | Data science, library/information studies, life sciences degree | IT governance, information management background | Any discipline plus research administration training |

    What skills, qualifications and training are required?

    Employers combine technical data skills with domain and communication skills, since the role requires translating funder and disciplinary requirements into practical workflows researchers will actually follow.

    • Data handling: spreadsheet and database competence; SQL, Python or R are increasingly listed as desirable
    • Standards knowledge: metadata schemas, DataCite, ORCID identifiers, and repository deposit workflows
    • Policy literacy: UK GDPR, funder data policies, and institutional research governance frameworks
    • Communication: training researchers, writing plain-English guidance, negotiating with study sponsors
    • Project management: running parallel studies to funder deadlines with limited resource

    Formal training routes include postgraduate qualifications in library and information science or data science, plus shorter dedicated courses. The Digital Curation Centre (DCC), funded by Jisc, has provided UK universities with research data management guidance and training resources since 2004 and remains the primary UK reference point for RDM practice. Institutional RDM obligations trace back to funder policy: EPSRC’s research data expectations, effective from 1 May 2015, require UK institutions receiving its funding to publish a research data management policy and a roadmap for compliance. The 2016 Concordat on Open Research Data — jointly published by Research Councils UK, Universities UK, Wellcome Trust and HEFCE — set out ten principles establishing that data management planning should be integral to research design, reinforcing why institutions now hire dedicated staff for this function rather than leaving it to individual researchers.

    What is the typical career path and salary range?

    Entry typically begins in a data assistant or data curator post supporting a research team’s day-to-day data handling, often on a fixed-term contract tied to a specific study. Real UK job postings illustrate the entry tier clearly: an NHS Research Data Manager post advertised in May 2025 by Midlands Partnership NHS Foundation Trust was graded at Agenda for Change Band 4, with a salary of £26,530 to £29,114 a year.

    Progression moves through Research Data Manager (owning DMPs and repository workflows for a department or portfolio of studies) to Senior/Lead Research Data Manager, where the postholder sets institutional RDM policy and may supervise a small team. The most senior tier — Director of Research Data Services or equivalent — sets strategic direction for an institution’s entire research data infrastructure and reports into the research office or library leadership. Unlike research administration, a PhD is not a standard requirement at any tier, though it is common among staff who progress from a research role into data management.

    Common questions about the role

    What are the responsibilities of a data manager?

    A data manager is responsible for the entire data lifecycle: collection, quality control, storage, security, documentation and eventual archiving or disposal. In a research context this extends to writing data management plans, applying metadata standards, and coordinating repository deposit so datasets remain reusable after a project ends.

    What does a research data manager do?

    A research data manager develops and implements the policies, workflows and documentation that keep a project’s or department’s datasets organised, secure and discoverable. Duties include drafting data management plans, training researchers, running quality checks, and depositing data with persistent identifiers such as DataCite DOIs for citation and reuse.

    What is the salary of a data manager?

    Salaries vary widely by sector and seniority. A UK NHS-graded entry-level research data manager post advertised in 2025 sat at Agenda for Change Band 4, paying £26,530–£29,114 a year; senior and director-level research data roles in universities and industry command substantially higher salaries, reflecting added strategic and line-management responsibility.

    What are the 4 types of research data?

    Research data is commonly grouped along two axes: by origin, into primary data (collected directly for the study) and secondary data (reused from existing sources); and by form, into quantitative and qualitative data. A research data manager must apply appropriate metadata, storage and sharing rules to each type, since funder and ethical requirements differ across them.

    What this means for institutions and job seekers

    For institutions, the job description confusion between research data manager, data steward and research administrator is itself a risk: unclear scoping leads to duplicated effort or gaps in funder compliance. Writing role descriptions that reference recognised frameworks — the CRediT Data Curation role, DCC guidance, and funder RDM policy — gives hiring managers a defensible, standards-aligned specification rather than an ad hoc list of duties.

    For job seekers, the clearest differentiator to lead with on an application is lifecycle ownership of data, not general IT or administrative competence. As funders continue tightening open-data mandates, demand for staff who can demonstrate metadata standards knowledge, repository deposit experience and DMP authorship is likely to keep outpacing supply, making this one of the more durable specialisms within the broader research administration and support ecosystem.

    For related roles and standards context, see CASRAI’s CRediT contributor roles hub, the research administration dictionary, and the research administration pillar.

  • UK Data Service vs ICPSR: Choosing an Archive

    The UK Data Service and ICPSR are the two largest social-science data archives in the English-speaking research world, and the right choice usually depends on jurisdiction and funder mandate rather than feature parity. The UK Data Service is the ESRC-funded national repository for UK social, economic and population data, while ICPSR is a US-based, membership-funded consortium archive at the University of Michigan. Researchers outside the biomedical repository ecosystem — where PubMed-linked mandates dominate — need to weigh deposit workflow, restricted-access tiers and citation practice before picking either as a home for a dataset.

    The UK Data Service is the largest digital repository for quantitative and qualitative social science and humanities research data in the United Kingdom, formed in October 2012 when the Economic and Social Research Council (ESRC) consolidated the UK Data Archive — established at the University of Essex in 1967 — with several university partners. ICPSR, by contrast, is a membership consortium of academic and research institutions that has archived social and behavioural science data since 1962. Both are listed in re3data.org, the global Registry of Research Data Repositories, and both hold CoreTrustSeal certification for trustworthy digital repositories.

    What Are the UK Data Service and ICPSR?

    The UK Data Service is a national data repository funded through UKRI’s Economic and Social Research Council (ESRC) and led by the UK Data Archive at the University of Essex, in partnership with the University of Manchester, Jisc, EDINA and University College London. It holds more than 6,000 datasets, including UK Census data, the Labour Force Survey, the Millennium Cohort Study and cross-national surveys such as the European Social Survey.

    ICPSR — the Inter-university Consortium for Political and Social Research — is a membership-funded archive based at the University of Michigan, serving several hundred member institutions worldwide alongside non-member depositors and users. Its holdings span large-scale US and international surveys, criminal justice, education and ageing data, and it runs openICPSR as a self-publishing companion repository for rapid dissemination.

    How Do Deposit Workflows Compare?

    Both archives run a curated deposit model rather than an unmediated upload box: staff review documentation, check disclosure risk and enhance metadata before release. The UK Data Service’s ESRC funding creates a contractual hook (grant holders are required to offer their data for archiving as a condition of the ESRC Research Data Policy), which ICPSR’s membership model does not replicate for non-US funders.

    • UK Data Service: two routes — the main curated collection for large, complex or sensitive studies, and ReShare, a lighter self-deposit repository for smaller datasets, code and syntax files.
    • ICPSR: two routes — the standard curated deposit process, and openICPSR, a self-publishing repository for researchers who want faster turnaround with lighter-touch review.

    Depositors submitting to either service should expect a documentation checklist covering variable-level metadata, consent and ethics evidence, and a data management plan — the same categories UKRI and NSF grant terms typically require regardless of which archive receives the deposit.

    How Do Restricted-Access Tiers Differ?

    Access tiering is where the two services diverge most for researchers working with confidential or disclosive social-science data. The UK Data Service operates a published three-tier model; ICPSR uses a comparable but differently named structure built around its Virtual Data Enclave.

    Access dimension | UK Data Service | ICPSR
    Open tier | No registration; Open Government Licence data | Public-use files via free MyData account
    Standard tier | Safeguarded: registration plus End User Licence | Member-institution access under consortium terms
    Restricted tier | Controlled: SecureLab, requiring accredited-researcher training under the Five Safes Framework | Restricted-use data via secure Virtual Data Enclave or encrypted physical media, subject to a data security plan
    Governance standard | Accredited under the Digital Economy Act 2017 by the UK Statistics Authority (2020) | Institutional Review Board and data-use-agreement based review

    The UK Data Service’s Five Safes Framework — safe people, projects, settings, data and outputs — was developed with HMRC DataLab and the Office for National Statistics Secure Research Services, and now underpins the SafePod Network launched in 2021 for wider geographical access to sensitive data. ICPSR’s restricted-data pathway achieves an equivalent security outcome through its enclave model but does not use the Five Safes terminology, which matters for UK researchers writing data management plans against ESRC or UKRI templates that reference it explicitly.
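    The tier comparison above can be read as a simple lookup when drafting the access section of a data management plan. The sketch below is illustrative only: the names `ACCESS_TIERS` and `requirements_for` are hypothetical, and the requirement strings paraphrase the table rather than reflecting any API offered by either archive.

```python
# Hypothetical lookup of access requirements, paraphrasing the comparison table.
ACCESS_TIERS = {
    "UK Data Service": {
        "open": "No registration; Open Government Licence data",
        "standard": "Safeguarded: registration plus End User Licence",
        "restricted": "Controlled: SecureLab with accredited-researcher training",
    },
    "ICPSR": {
        "open": "Public-use files via free MyData account",
        "standard": "Member-institution access under consortium terms",
        "restricted": "Virtual Data Enclave or encrypted media, with a data security plan",
    },
}


def requirements_for(archive: str, tier: str) -> str:
    """Return the access requirement for an archive and tier (KeyError if unknown)."""
    return ACCESS_TIERS[archive][tier.lower()]


print(requirements_for("ICPSR", "restricted"))
```

    Keeping both archives under the same three generic tier keys makes the terminology gap explicit: the controls are comparable, but only the UK Data Service entry maps onto Five Safes wording.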

    How Do Citation Practices Compare?

    Both archives assign persistent identifiers and expect formal data citation, but their machinery differs. The UK Data Service works with DataCite and the British Library to issue DOIs and promotes an easy-to-use citation tool. It frames its approach around the FAIR data principles (Findable, Accessible, Interoperable, Reusable) and offers the open-source QAMyData tool, which gives depositors a health check for numeric data before release.

    ICPSR similarly issues persistent identifiers for deposited studies and expects citation in publications that reuse its data, but its emphasis sits more on bibliography-style study citations tied to its own numbering system than on a dedicated public FAIR-compliance tool. For researchers publishing in journals that enforce data-availability statements (a growing requirement under funder open-science mandates), the practical difference is smaller than the access-tier gap: both produce a citable, resolvable record, but only the UK Data Service publishes a named QA tool for checking data quality before release.
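    As a rough illustration of what a citable, resolvable record looks like in practice, the sketch below assembles a data citation from a handful of metadata fields. The element order (creator, year, title, publisher, identifier) follows the general pattern DataCite recommends. The function name and every value in the example record, including the DOI, are invented placeholders rather than real archive entries.

```python
def format_data_citation(creator, year, title, publisher, doi):
    """Build a citation string that ends in a resolvable DOI URL."""
    return f"{creator} ({year}). {title}. {publisher}. https://doi.org/{doi}"


citation = format_data_citation(
    creator="Example Research Team",
    year=2020,
    title="Example Household Survey",
    publisher="Example Data Archive",
    doi="10.1234/example-doi",  # placeholder, not a real DOI
)
print(citation)
```

    Either archive's own citation tool should take precedence over a hand-built string; the point is only that the DOI, not the archive's internal study number, is what makes the record resolvable.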

    Which Archive Should Researchers Outside Biomedicine Choose?

    For most projects the decision is jurisdictional rather than qualitative. Where a funder mandates a repository, the choice is already made: ESRC-funded UK researchers must offer data to the UK Data Service, while US social-science grants funded by or aligned with NSF or NIH more commonly point toward ICPSR or openICPSR.

    • Choose the UK Data Service if your funder is UKRI/ESRC, your data concerns UK administrative, census or longitudinal panel data, or you need SecureLab/Five Safes access to controlled government microdata.
    • Choose ICPSR if your institution is a consortium member, your data is US-focused or cross-national with US partners, or you want the faster openICPSR self-publishing route.
    • Consult both catalogues before depositing internationally comparable survey data (e.g. European Social Survey, Eurobarometer) — coverage overlaps, and the UK Data Service can facilitate UK-based access to ICPSR holdings.
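    The three rules of thumb above can be sketched as a first-pass decision function. This is a simplification under stated assumptions: the function name, the parameter names and the two funder sets are hypothetical, and a real decision should start from the wording of the grant terms.

```python
# Hypothetical funder groupings drawn from the bullets above.
UK_FUNDERS = {"UKRI", "ESRC"}
US_FUNDERS = {"NSF", "NIH"}


def suggest_archive(funder: str, consortium_member: bool = False,
                    needs_securelab: bool = False) -> str:
    """Return a first-pass archive suggestion; a UK funder mandate is checked first."""
    funder = funder.upper()
    if funder in UK_FUNDERS or needs_securelab:
        return "UK Data Service"
    if funder in US_FUNDERS or consortium_member:
        return "ICPSR"
    return "Consult both catalogues"


print(suggest_archive("ESRC"))
print(suggest_archive("other", consortium_member=True))
```

    Checking the UK funder mandate before consortium membership reflects the compliance-first ordering argued for here: an ESRC grant holder at an ICPSR member institution still has to offer data to the UK Data Service.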

    Institutions building or reviewing a data management plan should treat repository choice as a compliance question first and a discoverability question second: a technically excellent dataset deposited in the wrong repository for its funder mandate creates avoidable rework at grant closeout.

    Answer-First Questions Researchers Ask

    What Is the UK Data Service?

    The UK Data Service is the ESRC-funded national repository for UK economic, population and social research data, led by the UK Data Archive at the University of Essex. It holds over 6,000 datasets, including census, survey and longitudinal study data, and operates under the OAIS digital-preservation reference model.

    How Do You Access Data on the UK Data Service?

    Access runs through three published tiers: Open data requiring no registration, Safeguarded data requiring registration and an End User Licence, and Controlled data requiring SecureLab accreditation under the Five Safes Framework. Most researchers start with the free data catalogue and register once they identify a specific study.

    Is the UK Data Service Free?

    Yes — the service is free to data owners depositing studies and free at the point of use for non-commercial research and teaching. Commercial users may incur administrative fees, and controlled-tier access requires accredited-researcher training rather than a monetary charge.

    Implications for Research Administrators

    Institutional research offices, ARMA- and INORMS-aligned research administrators, and funder compliance teams increasingly treat repository choice in a data management plan as an auditable field, not a footnote. A UK-funded study archived outside the UK Data Service without documented justification can trigger ESRC compliance queries at final reporting; a US consortium study left undeposited with ICPSR can weaken an institution’s case for renewed membership funding. Neither archive competes with domain-specific biomedical repositories governed by NISO, ICMJE or COPE norms: this comparison sits squarely in the national data repository space for social science, distinct from that ecosystem.

    As open-science mandates from UKRI, cOAlition S and equivalent US funders converge on FAIR-by-default expectations, the operational gap between the UK Data Service and ICPSR is narrowing to jurisdiction, access-tier terminology and citation tooling rather than underlying trustworthiness — both hold CoreTrustSeal certification and both sit inside the CESSDA/re3data recognised-repository landscape that funders now check by default.

  • Research Data Management Policy: Not Just a DMP

    A research data management policy is an institution-wide governance document that sets ownership, retention, storage and researcher-responsibility rules for all research data an organisation produces — distinct from a data management plan (DMP), which is a project-specific document written for a single grant. Confusing the two leaves institutions with fragmented practice: strong per-grant DMPs but no consistent rule for what happens to data once a project, or a researcher, moves on.

    A research data management policy is the institutional framework; the DMP is one project’s implementation of it. This article sets out the structural difference and gives a template for writing the institutional-level document, covering ownership, retention tiers, storage classes and researcher obligations.

    What is a research data management policy?

    A research data management (RDM) policy is a formally approved institutional document — typically ratified by a university executive, senate or research committee — that defines how all research data created, collected or reused at that institution must be handled across its lifecycle: creation, active use, retention, sharing and disposal.

    Unlike guidance notes or web pages, a policy carries institutional authority: it assigns accountability, sets minimum retention periods, and states what happens by default when a researcher leaves or a grant closes. The UKRI Concordat on Open Research Data (2016, updated 2020), signed by UK Research and Innovation, Universities UK and the Wellcome Trust among others, sets out common principles — including that research data are a public good and that costs of good data management are legitimate, fundable research costs. Most UK institutional RDM policies, including those at Edinburgh, Southampton and Manchester, cite the Concordat directly as their basis.

    Research data management policy vs a data management plan

    The policy and the DMP operate at different scopes and answer different questions. The policy answers “what does this institution require of everyone, always?” The DMP answers “how will this specific project handle its specific data?” A DMP written for a UKRI or Horizon Europe grant should reference and comply with the institutional policy, not substitute for it.

    Dimension | Institutional RDM policy | Data management plan (DMP)
    Scope | Whole institution, all research | Single project or grant
    Author | Research office, library, IT, governance committee | Principal investigator / research team
    Trigger | Approved once, reviewed periodically | Written at proposal stage, revised through project life
    Contains | Ownership defaults, retention minimums, storage tiers, roles | Dataset types, volumes, specific repositories, embargo dates
    Enforcement | Institutional compliance / disciplinary framework | Funder compliance check at reporting/audit
    Review cycle | Every 3-5 years (Edinburgh’s policy specifies five) | Reviewed and updated within the life of one project

    A well-run institution needs both, in that order: the policy first, so every subsequent DMP inherits a consistent set of defaults — retention minimums, approved repositories, data protection procedures — rather than each research team inventing its own.
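    One way to picture that inheritance is as a dictionary merge in which the project supplies specifics and the policy supplies the floor. The sketch below is a minimal illustration: `POLICY_DEFAULTS`, `build_dmp` and the field names are hypothetical, and the three-year figure is only the baseline quoted elsewhere in this article.

```python
# Hypothetical institutional defaults; values are illustrative only.
POLICY_DEFAULTS = {
    "retention_years": 3,
    "approved_repositories": ["institutional repository"],
    "dpia_required_for_personal_data": True,
}


def build_dmp(project_fields: dict, policy: dict = POLICY_DEFAULTS) -> dict:
    """Merge project choices over policy defaults, never dropping below the retention minimum."""
    dmp = {**policy, **project_fields}
    dmp["retention_years"] = max(dmp["retention_years"], policy["retention_years"])
    return dmp


dmp = build_dmp({"title": "Example project", "retention_years": 1})
print(dmp["retention_years"], dmp["approved_repositories"])
```

    A project can lengthen retention or add detail, but it cannot undercut the institutional minimum, which is the practical meaning of "reference and comply with the policy, not substitute for it".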

    Template structure for an institutional RDM policy

    Reviewing current UK institutional policies (Edinburgh, Southampton, Manchester, Birmingham, Cambridge) shows a consistent structural skeleton. A new or revised policy should include, in order:

    • Purpose and scope — why the policy exists, and which staff, students and data types it covers.
    • Definition of research data — the institution’s own working definition (the UKRI Concordat’s is a common starting point: digital or analogue information collected, observed or created to validate research findings).
    • Roles and responsibilities — who is the data owner by default (usually the institution), who is the data steward (usually the principal investigator), and what the research office, IT services and library each provide.
    • Data management planning requirement — a mandate that a DMP must exist for every funded (and, ideally, every unfunded) research project, and where that requirement sits relative to ethics approval.
    • Storage and security tiers — approved storage classes mapped to data sensitivity.
    • Retention and disposal — minimum retention period, and the trigger for review or deletion.
    • Sharing, access and FAIR compliance — the institution’s default position on open data, exceptions for confidentiality, and adherence to the FAIR principles (Findable, Accessible, Interoperable, Reusable), as defined by Wilkinson et al. in Scientific Data (2016).
    • Legal and ethical compliance — UK GDPR and Data Protection Act 2018 obligations for personal data, plus any sector-specific requirements.
    • Review cycle and ownership of the policy itself — who revises it and how often.

    This ordering matters: policies that lead with storage and IT detail before establishing roles tend to read as IT documents rather than governance ones, which weakens researcher buy-in.
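    A draft policy can be checked against the nine sections listed above with a few lines of code. The sketch assumes the draft's headings are available as a list of strings that match the template wording exactly; the function name is hypothetical, and a real check would need fuzzier matching.

```python
REQUIRED_SECTIONS = [
    "Purpose and scope",
    "Definition of research data",
    "Roles and responsibilities",
    "Data management planning requirement",
    "Storage and security tiers",
    "Retention and disposal",
    "Sharing, access and FAIR compliance",
    "Legal and ethical compliance",
    "Review cycle and ownership of the policy itself",
]


def check_policy_outline(headings):
    """Return (missing sections, True if the sections present follow the template order)."""
    missing = [s for s in REQUIRED_SECTIONS if s not in headings]
    present = [h for h in headings if h in REQUIRED_SECTIONS]
    expected = [s for s in REQUIRED_SECTIONS if s in present]
    return missing, present == expected


draft = ["Purpose and scope", "Storage and security tiers", "Roles and responsibilities"]
missing, in_order = check_policy_outline(draft)
print(len(missing), in_order)
```

    The example draft puts storage ahead of roles, so the order check fails: exactly the IT-first structure the paragraph above warns against.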

    Retention, ownership and storage tiers

    Retention should be set as a minimum, not a target. A commonly cited UK baseline is three years from project end or publication, with the caveat that funder, sponsor or disciplinary requirements specifying longer periods take precedence — clinical and health-related data, for example, routinely requires 10-15 year retention under separate regulatory regimes.
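    The "minimum, not target" rule reduces to taking the longer of the institutional baseline and any funder or regulatory period. A minimal sketch, assuming the three-year baseline quoted above and hypothetical names (`BASELINE_YEARS`, `retention_end`):

```python
from datetime import date
from typing import Optional

BASELINE_YEARS = 3  # baseline quoted in the text; substitute your own policy's figure


def retention_end(project_end: date, funder_years: Optional[int] = None) -> date:
    """Earliest permitted disposal date: the longer of the baseline and the funder period."""
    years = max(BASELINE_YEARS, funder_years or 0)
    try:
        return project_end.replace(year=project_end.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year; fall back to the 28th.
        return project_end.replace(year=project_end.year + years, day=28)


print(retention_end(date(2024, 6, 30)))
print(retention_end(date(2024, 6, 30), funder_years=15))
```

    The date returned is the earliest point at which review or disposal may be considered, not a deletion deadline.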

    Ownership defaults matter because researchers move institutions far more often than data does. Most UK institutional policies assign underlying ownership of research data to the institution as the legal entity that employed the researcher and typically held the grant, while the principal investigator retains stewardship responsibility — the practical duty of care — during and after the project. This split must be stated explicitly, not left implicit, because it is the clause institutions rely on when a departing researcher wants to take data with them.

    Storage tiers should be mapped to data sensitivity rather than treated as one undifferentiated pool. A workable minimum is three tiers:

    • Tier 1 — open/shareable: deposited in a Re3data-listed, CoreTrustSeal-certified repository with a DOI via DataCite.
    • Tier 2 — restricted/sensitive: access-controlled institutional storage under a data sharing agreement.
    • Tier 3 — confidential/personal: encrypted storage meeting UK GDPR requirements, with a Data Protection Impact Assessment on file.
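    The three tiers above can be expressed as a classification in which the strictest applicable condition wins. This is a sketch under simplifying assumptions: the two boolean flags and the names `TIER_CONTROLS` and `storage_tier` are hypothetical, and real classification usually involves more than two questions.

```python
# Short labels paraphrasing the three tiers described above.
TIER_CONTROLS = {
    1: "open/shareable: certified repository with a DOI",
    2: "restricted/sensitive: access-controlled institutional storage",
    3: "confidential/personal: encrypted storage with a DPIA on file",
}


def storage_tier(contains_personal_data: bool, is_restricted: bool) -> int:
    """Map two sensitivity flags to a tier number; the strictest match wins."""
    if contains_personal_data:
        return 3
    if is_restricted:
        return 2
    return 1


print(TIER_CONTROLS[storage_tier(contains_personal_data=False, is_restricted=True)])
```

    Testing for personal data first means a dataset that is both restricted and personal lands in Tier 3 rather than Tier 2.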

    Researcher obligations and governance roles

    The policy should state researcher obligations as directives, not suggestions. At minimum, researchers are required to: complete a DMP before data collection begins; store active data only in institutionally approved systems; register externally held datasets with the institution; and provide a data access statement or citation in any publication when the underlying data are not directly deposited.

    Governance sits across three functions the policy must name individually: the research office (grant compliance, costing RDM into proposals — UKRI states that RDM costs are eligible under its funding), IT services (approved storage infrastructure and security), and the library or research data service (repository operation, metadata standards, researcher training). ARMA and INORMS provide sector benchmarking for how these research administration roles are typically distributed across institutions.

    Common questions

    What is the difference between a research data management policy and a data management plan?

    A research data management policy is an institution-wide governance document setting defaults for ownership, retention and storage. A data management plan is a project-specific document, usually required by a funder at proposal stage, that details how one project’s data will be collected, stored and shared within those institutional defaults.

    Who is responsible for research data management at an institution?

    Responsibility is shared but must be explicitly assigned. The principal investigator is typically the data steward for a given project; the institution holds underlying ownership; and the research office, IT services and library provide the supporting infrastructure, costing advice and repository services the policy commits to.

    How long should institutions retain research data?

    Most UK institutional policies set a minimum retention period of three years from project end or publication, deferring to longer funder-, sponsor- or discipline-specific requirements where they apply — for example, clinical research data typically requires substantially longer retention under separate regulatory regimes.

    What does FAIR data mean in a research data management policy?

    FAIR stands for Findable, Accessible, Interoperable and Reusable — principles defined by Wilkinson et al. (2016) that a policy should require researchers to apply when depositing data, typically through persistent identifiers, standard metadata and appropriate licensing. See the CASRAI research data dictionary for related term definitions.

    Implications for research administrators

    Institutions that only mandate DMPs at grant stage, without an underlying institutional policy, end up with inconsistent retention practice, ambiguous ownership when staff leave, and duplicated storage costs across departments running incompatible systems. Writing the institutional policy first — using the structure above — gives every subsequent DMP a consistent, auditable baseline, and gives research offices a defensible answer when a funder, ethics committee, or departing researcher asks who owns what and for how long.

    As RDM costs are increasingly built into grants and UK institutions face growing FOI and audit scrutiny of data retention, the institutional policy is the operational backbone that per-project DMPs are supposed to inherit from, not replace.

  • DMPonline vs DMPTool vs Argos: DMP Tool Guide

    DMPonline, DMPTool and Argos are the three leading platforms for writing a data management plan (DMP): DMPonline (Digital Curation Centre, UK) and DMPTool (California Digital Library, US) share the same open-source DMP Roadmap codebase, while Argos (OpenAIRE) is built for machine-actionable, European open-science workflows. The right choice depends on your funder’s templates, whether your institution offers a branded instance, and whether you need structured API export.

    A data management plan tool is software that walks a researcher through funder- and institution-specific questions, stores the resulting answers as a structured document, and — increasingly — exports that document in a machine-readable format rather than as static prose. DMPonline is the Digital Curation Centre’s web-based DMP-writing service, built on the open-source DMP Roadmap platform it co-develops with the California Digital Library. This guide compares it against DMPTool and Argos on the three factors that actually decide adoption: funder-template coverage, institutional branding, and API export.

    What is DMPonline, and who runs it?

    DMPonline is a free web application, developed and hosted by the Digital Curation Centre (DCC), based at the University of Edinburgh. It supports researchers in producing a data management plan against a specific funder or institutional template, with embedded guidance text at each question. It is the standard reference tool for UK Research and Innovation (UKRI) grant-holders and is widely adopted across UK and European universities.

    Many institutions run their own branded instance rather than sending researchers to the generic service — the University of Manchester, University of Sheffield, University of Plymouth and University of Exeter all operate dedicated DMPonline subdomains with local templates and guidance layered on top of the shared DCC platform.

    DMPonline vs DMPTool: same codebase, different communities

    DMPonline and DMPTool are not separate products built by rival teams — they run on the same open-source codebase, DMP Roadmap, jointly developed by the DCC and the California Digital Library (CDL). The practical difference is community and funder coverage, not underlying functionality.

    DMPTool, operated by the CDL (part of the University of California system), is the default choice for US-based researchers, carrying templates for agencies such as the National Science Foundation (NSF) and National Institutes of Health (NIH). DMPonline carries the equivalent depth for UK and European funders, including UKRI’s constituent research councils and Wellcome Trust. Because both draw on the same codebase, a plan exported from either tool follows a broadly comparable data model — the divergence sits in which templates, guidance text and institutional branding are pre-loaded, not in the software itself.

    What is Argos, and how does it differ?

    Argos is a DMP-writing platform developed within OpenAIRE, the European open-science infrastructure, rather than from the DMP Roadmap lineage. Argos was designed from the outset around machine-actionable output, producing plans as structured objects intended to connect into the wider European research-information graph rather than sit as a standalone PDF.

    Its templates lean towards Horizon Europe and European Research Council (ERC) requirements, and its architecture emphasises linking a DMP’s contents — datasets, repositories, funders, organisations — to persistent identifiers already circulating in the OpenAIRE Research Graph. For a European-funded project embedded in that ecosystem, this integration is a genuine functional difference, not just a branding one.

    Funder-template coverage: which tool fits your funder

    Template coverage is usually the deciding factor, since a funder-specific template determines exactly which questions a plan must answer. The table below summarises where each platform’s template strength lies.

    Platform | Steward | Strongest funder coverage | Typical user base
    DMPonline | Digital Curation Centre | UKRI councils, Wellcome Trust, UK institutional templates | UK and European universities
    DMPTool | California Digital Library | NSF, NIH, US federal agency templates | US universities and research institutes
    Argos | OpenAIRE | Horizon Europe, ERC, EOSC-aligned funders | European open-science projects

    None of the three restricts researchers to their “home” funder templates — DMPonline hosts non-UK institutional templates, and DMPTool lists non-US funders too — but the depth of guidance and the freshness of template maintenance concentrate where each tool’s steward organisation has direct funder relationships.

    Institutional branding and API export compared

    Beyond templates, two practical factors distinguish the tools for an institution deciding which one to adopt.

    • Institutional branding. Both DMPonline and DMPTool support institution-specific branded sub-sites — a university can present its own logo, guidance text and curated template list under its own subdomain while the underlying platform remains centrally maintained. Argos, built for the OpenAIRE/EOSC ecosystem, is more typically deployed as a shared service with organisation profiles rather than fully white-labelled institutional instances.
    • API and machine-actionable export. All three platforms are converging on the RDA DMP Common Standard, developed by the Research Data Alliance’s working group on machine-actionable DMPs, which defines a shared JSON structure for exporting plan content. This is what allows a plan written in one tool to be read, in principle, by a funder system, a repository, or a research-information system rather than only by a human reader.
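    To make "machine-actionable" concrete, the sketch below builds a very small plan as a Python dictionary, serialises it to JSON and reads it back. The field names follow the RDA DMP Common Standard as best understood here and should be verified against the published schema before use. All values, including the example.org identifiers, are invented placeholders, and none of the three tools is claimed to emit exactly this document.

```python
import json

# Field names are an assumption based on the RDA DMP Common Standard;
# check them against the published schema. All values are placeholders.
madmp = {
    "dmp": {
        "title": "Example project DMP",
        "created": "2024-01-15T09:00:00Z",
        "modified": "2024-03-01T14:30:00Z",
        "language": "eng",
        "ethical_issues_exist": "no",
        "dmp_id": {"identifier": "https://example.org/dmp/1", "type": "url"},
        "contact": {
            "name": "Example Researcher",
            "mbox": "researcher@example.org",
            "contact_id": {"identifier": "https://example.org/people/1", "type": "other"},
        },
        "dataset": [
            {
                "title": "Example survey responses",
                "dataset_id": {"identifier": "https://example.org/data/1", "type": "url"},
                "personal_data": "yes",
                "sensitive_data": "no",
            }
        ],
    }
}

exported = json.dumps(madmp, indent=2)  # what an export step would write out
reloaded = json.loads(exported)         # what a downstream system would read
print(reloaded["dmp"]["dataset"][0]["title"])
```

    Because the plan is structured data rather than prose, a repository or research-information system can pick out the dataset list or the personal-data flag without a human reading the document.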

    For research administrators evaluating tools as part of broader research administration workflows, the practical question is less “which tool is best” and more “which tool’s export format and branding options integrate with our existing repository, CRIS and grants-management systems”.

    Common questions about choosing a DMP tool

    Do I need a data management plan?

    Most major funders, including UKRI, Wellcome Trust, the NSF, the NIH and Horizon Europe, require a data management plan as a condition of funding. If your grant application names one of these funders, you need a DMP, and DMPonline, DMPTool or Argos is usually the quickest route to a compliant one.

    How do I write a data management plan?

    Writing a DMP means working through a funder-specific template — covering what data you will create, how it will be documented, where it will be stored, and how it will be shared or preserved. DMPonline, DMPTool and Argos each provide the relevant template with embedded guidance, rather than requiring you to draft one from a blank page.

    What is included in a data management plan?

    A DMP typically covers the types of data to be produced, the metadata and documentation standards used, access and sharing policies, and the plan for long-term archiving and preservation. Machine-actionable tools structure these elements so they can be exported and reused by other systems, not just read once.

    Choosing a tool: what the decision actually hinges on

    Because DMPonline, DMPTool and Argos are all converging on the same RDA DMP Common Standard for export, the choice between them is rarely a compatibility question. It comes down to fit: which platform already carries deep templates for your funder, whether your institution operates a branded instance you are expected to use, and whether your downstream systems consume RDA-conformant JSON export.

    For a UK or European researcher working with UKRI or Wellcome funding, DMPonline is the default starting point. For a US researcher working with NSF or NIH funding, DMPTool serves the equivalent role. For a Horizon Europe or ERC-funded project deeply embedded in the EOSC ecosystem, Argos’s machine-actionability and graph integration make it the stronger fit. As the RDA Common Standard matures further, expect the practical differences between the three to narrow to templates and branding alone, with export interoperability becoming a solved problem rather than a selection criterion.

  • Research Data Governance: Where DMPs, FAIR and Institutional Policy Meet

    Research data governance is the institution-wide framework of policies, roles and standards that determines how research data is created, stored, protected, shared and retained across its lifecycle — distinct from the project-level task of managing a single dataset. It sits above data management plans (DMPs) and FAIR practice, translating funder and institutional policy into assigned accountability. The most common failure point is not the policy itself but the gap between what a DMP promises and what a principal investigator (PI) or data steward is actually resourced and empowered to deliver.

    Put simply: research data governance is the system of institutional authority, roles and control that determines who is accountable for a dataset at every stage of its life, from collection to eventual disposal or archiving.

    What is research data governance?

    Research data governance establishes the policies, roles and standards dictating how research data is ethically collected, stored, secured and shared, applied at the level of the whole institution rather than a single grant. It differs from research data management in scope: management is what a researcher does with one dataset; governance is how an organisation ensures every dataset is handled consistently and lawfully.

    Andrea Chiarelli’s 2023 analysis for Force11’s Upstream describes this as a shift “from individual projects or datasets to the way the organisation as a whole thinks and operates when it comes to research data.” A 2025 Data Science Journal paper by Odebrecht et al. argues governance requires a “system of cross-organisational” accountability, since ownership, stewardship and compliance obligations rarely sit with one office.

    In practice, governance frameworks typically assign roles across several functions:

    • Senior leadership — sets institutional strategy and secures infrastructure budget.
    • Data stewards or data champions — provide discipline-specific guidance and training.
    • Librarians and information professionals — curate data and advocate for open sharing.
    • Ethics and compliance officers — verify adherence to regulatory and funder requirements.
    • IT and information security teams — manage storage, backup and access control.
    • Principal investigators — remain directly responsible for their project’s data day to day.

    How do data management plans fit into research data governance?

    A data management plan is the project-level instrument; research data governance is the institutional context that shapes it. Governance sets the rules of the road — the DMP is the trip plan for a specific project, describing what data will be generated, how it will be stored, and what happens to it once funding ends. Most UK and EU funders now require a DMP at application stage, per the Digital Curation Centre’s funder-policy overview.

    UKRI’s Guidance on Best Practice in the Management of Research Data (2020) states research data should be “easily discoverable, accessible, assessable, intelligible, useable” — language drawn from the G8 Open Data Charter. That expectation only becomes operational once a governance framework specifies which repository, metadata schema and retention period satisfy it. Without that translation layer, a PI can write a technically compliant DMP the institution has no infrastructure to support.

    Where personal or sensitive data is involved, governance also requires a Data Protection Impact Assessment (DPIA) under UK GDPR before collection begins — a step outside most DMP templates, and frequently where research ethics and governance approval stalls.

    Where do FAIR principles sit in the governance stack?

    The FAIR Guiding Principles — Findable, Accessible, Interoperable and Reusable — were formally published in Scientific Data in 2016 (Wilkinson et al.) and have since become the default technical standard governance frameworks use to operationalise “good data practice.” FAIR is a set of design criteria for datasets; governance is the accountability structure that ensures those criteria are met at scale, not just described in policy.

    A governance policy might mandate persistent identifiers, controlled-vocabulary metadata and an approved repository — the mechanisms that make a dataset FAIR in practice. Funder mandates reinforce this: cOAlition S’s Plan S requires data underlying publications be made available in a FAIR-compliant repository, converting a technical principle into a compliance condition an institution’s governance office must monitor.

    Layer | What it governs | Primary owner
    Institutional research policy | Ownership, retention, ethical boundaries | Senior leadership / research office
    Research data governance framework | Roles, accountability, infrastructure standards | Data governance committee
    FAIR principles | Technical findability/reuse criteria for datasets | Data stewards, repository managers
    Data management plan | Project-specific application of the above | Principal investigator

    Where do responsibility gaps appear between data stewards and PIs?

    The most persistent governance failure is not absent policy but an accountability vacuum between those who write institutional standards and those who generate the data day to day. Force11’s Upstream analysis notes “research cultures value autonomy and independence,” making a standardised framework structurally difficult to enforce against individual research groups — a cultural, not merely technical, obstacle.

    The gap tends to open at predictable points:

    • Departure events: what happens to a dataset when a researcher leaves is, per Upstream, “one of the most common difficulties,” since ownership and access rights are rarely settled in advance.
    • Metadata quality: without an assigned data steward, a PI defaults to whatever documentation is fastest, not what is FAIR-reusable.
    • Sensitive data handling: a DPIA is approved at the outset, but ongoing access-control enforcement typically falls back to the PI’s lab, unsupported by IT.
    • Retention beyond project end: a retention period is set, but archiving budget and ownership after a grant closes are frequently unassigned.

    The University of Oxford’s data governance framework addresses this by “establishing roles, definitions, standards and procedures to help keep data accurate and fit for purpose” — an explicit attempt to move responsibility off the individual researcher and onto a named institutional function. Institutions without an equivalent role map leave every gap to default to the PI, regardless of whether they have the time, training or authority to close it.
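    A role map of this kind can be audited mechanically: list each governance obligation with its named owners, then flag any obligation that has no owner or that falls to the PI alone. The sketch below is illustrative. The function name, the obligation labels and the example assignments are hypothetical and do not describe Oxford's or any other institution's framework.

```python
def unassigned_obligations(role_map: dict) -> list:
    """List obligations with no named owner, or with the PI as the only owner."""
    gaps = []
    for obligation, owners in role_map.items():
        if not owners or set(owners) == {"PI"}:
            gaps.append(obligation)
    return gaps


# Hypothetical role map for one institution.
role_map = {
    "departure protocol": [],
    "metadata quality": ["data steward", "PI"],
    "access-control enforcement": ["PI"],
    "post-grant archiving budget": ["research office"],
}
print(unassigned_obligations(role_map))
```

    Treating "PI only" as a gap is a deliberate design choice: it encodes the argument that an obligation which defaults to the PI has not really been assigned.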

    Frequently asked questions

    What is data governance in research?

    Data governance in research is the exercise of institutional authority and control over how research data is created, secured, shared and retained. Its aim is to increase the value of research data while minimising risk, and it covers ownership, quality, ethical compliance and long-term stewardship across every supported project.

    What are the four pillars of research data governance?

    Most frameworks converge on four pillars: policy (rules for ownership, access and retention), roles (stewards, ethics officers, IT, PIs), infrastructure (repositories, metadata standards, storage) and compliance monitoring (audits against funder and legal requirements). No pillar works in isolation: each fails if the others are absent.

    What are the 5 C’s of data governance?

    The 5 C’s — clear vision, leadership commitment, collaboration, communication and continuous improvement — describe the cultural conditions a governance programme needs to survive contact with autonomous research groups. Without leadership commitment specifically, governance policy tends to remain aspirational rather than enforced.

    Will AI replace research data governance?

    No. AI tools can automate metadata tagging, anomaly detection and compliance checks, but they cannot assign accountability or resolve the ethical judgement calls that research ethics and governance committees make. AI changes the tooling of governance, not the underlying need for named, human-accountable roles.

    Implications for institutions

    For research administrators, the practical implication is that a DMP template or FAIR-compliance checklist is necessary but not sufficient. An institution needs a named governance owner — a research data governance committee or chief data steward function — whose remit spans the full lifecycle, not just the application stage a DMP covers.

    The Royal Society and British Academy’s joint review, Data Management and Use: Governance in the 21st Century, argued data governance should be treated as an organisational capability comparable to financial or ethical governance, not a bolt-on exercise assigned to whichever office has spare capacity. That framing is increasingly reflected in how EARMA, ARMA and INORMS member institutions structure research administration functions, positioning data governance alongside grants management and research integrity rather than beneath IT.

    Conclusion: closing the gap

    Research data governance, DMPs and FAIR practice describe the same problem from three altitudes: institutional accountability, project-level planning, and technical dataset design. The responsibility gaps undermining all three consistently form where policy assigns an outcome — FAIR metadata, secure retention, a departure protocol — without assigning a person. Institutions that name an accountable role for every governance obligation, rather than defaulting to the PI, close that gap before it becomes a compliance failure. For broader context on these roles within the wider research administration function, see CASRAI’s research administration standards resources.

  • What Is a Data Management Plan? UKRI, NIH and EU Essentials

    A data management plan (DMP) is a formal document that sets out how a research project will collect, document, store, secure, share, and preserve its data, from the first data point to long-term archiving. Most major funders — including UKRI, the US National Institutes of Health (NIH), and Horizon Europe — now require a DMP at application stage, and increasingly expect it to be aligned with the FAIR principles. This explainer defines a data management plan, sets out why funders mandate one, breaks down its core components, and maps each section onto FAIR.

    In one sentence, a data management plan is a written commitment describing what data a project will generate, how that data will be organised and protected while the project runs, and how, or whether, it will be shared and preserved once the project ends.

    What is a data management plan?

    A data management plan is a structured, funder- or institution-facing document describing how a project will handle its data across the full research lifecycle. It is drafted at proposal stage, before data collection begins, and treated as a living document revisited as the project evolves.

    A DMP is not a policy statement bolted onto a grant application. Reviewers use it to check that an applicant has thought through data volumes, storage costs, ethical constraints, and sharing obligations before funding is committed. Institutions use it to assign responsibility for storage and eventual deposit; funders use it to enforce open-data commitments after the award is made.

    Why do funders require a data management plan?

    Funders require a DMP because public and charitable research funding carries an expectation that resulting data — not just the resulting publication — is managed responsibly and, where possible, made available for verification and reuse. A DMP is the mechanism funders use to check this before they pay for the research, and to hold grantees to it afterwards.

    The three funders named in this explainer take slightly different approaches to timing and enforcement:

    | Funder | Governing policy | When the DMP is due | How it is enforced |
    |---|---|---|---|
    | UKRI | UKRI Common Principles on Data Policy, implemented per council (e.g. MRC, NERC) | At proposal stage | Assessed during peer review; council-specific detail expected in proportion to data volume |
    | NIH | NIH Data Management and Sharing (DMS) Policy, effective 25 January 2023 | At application stage, for all NIH grants that generate scientific data | Assessed by NIH programme staff rather than scored in peer review; compliance with the approved plan is a condition of the award |
    | Horizon Europe | Horizon Europe Data Management Plan requirement under the Model Grant Agreement | A summary at proposal stage; the full DMP is due by month six and updated through the project | Grant-agreement condition, monitored through periodic and final reporting |

    The NIH policy is a useful marker of where funder expectations are heading. Before January 2023, under NIH’s 2003 Data Sharing Policy, only applications requesting $500,000 or more in direct costs in any single year needed a data sharing plan. Since that date, a Data Management and Sharing Plan is required for essentially all NIH-funded research that produces scientific data, replacing the earlier, narrower requirement. Horizon Europe applies the principle “as open as possible, as closed as necessary”: data defaults to open, typically via deposit in European Open Science Cloud (EOSC)-federated infrastructure, and any restriction must be justified in the plan itself.

    What are the core components of a data management plan?

    The contents of a data management plan vary slightly by funder template, but nearly every DMP, whether UK, US, or EU, covers the same five areas:

    • Data types and volume: what kinds of data the project will generate or reuse (numerical, image, text, biological samples, code), in what formats, and at roughly what scale.
    • Documentation and metadata: how the data will be described so a third party — or the researcher, eighteen months later — can understand and reuse it without asking the original team.
    • Storage and security: where data will live during the project, how it is backed up, and who has access, particularly for sensitive or identifiable data.
    • Sharing and preservation: which data will be shared, through which repository, on what timeline, and which data will not be shared, with a stated justification.
    • Ethics, consent, and legal compliance: how personal, sensitive, or Indigenous data will be handled under relevant data-protection law and participant consent terms, and how intellectual-property or commercial-sensitivity constraints are addressed.

    A sixth element, often folded into the above, is roles and responsibilities: naming who on the project team is accountable for each of these tasks, since a DMP with no named owner tends not to get implemented.
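
    The five areas and the roles element can be pictured as a simple record with a completeness check. The following is a minimal Python sketch under stated assumptions: the field names, the `DataManagementPlan` class and the `incomplete_sections` helper are hypothetical and do not come from any funder template.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DataManagementPlan:
    """Illustrative DMP record: five core areas plus roles (field names are assumptions)."""
    data_types_and_volume: str = ""
    documentation_and_metadata: str = ""
    storage_and_security: str = ""
    sharing_and_preservation: str = ""
    ethics_consent_and_legal: str = ""
    # Sixth element: section name -> named person accountable for it.
    roles: dict = field(default_factory=dict)


def incomplete_sections(plan):
    """List core sections that are empty or that have no named owner."""
    problems = []
    for f in fields(plan):
        if f.name == "roles":
            continue
        if not getattr(plan, f.name).strip():
            problems.append(f"{f.name}: empty")
        elif f.name not in plan.roles:
            problems.append(f"{f.name}: no named owner")
    return problems


if __name__ == "__main__":
    draft = DataManagementPlan(
        data_types_and_volume="Survey responses, CSV, about 2 GB",
        storage_and_security="Institutional encrypted storage, nightly backup",
        roles={"data_types_and_volume": "A. Researcher"},
    )
    for problem in incomplete_sections(draft):
        print(problem)
```

    Treating a section with no named owner as incomplete reflects the point above: a section that is written but unowned tends not to get implemented.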

    How do FAIR principles map onto a data management plan?

    The FAIR principles — Findable, Accessible, Interoperable, Reusable, published in Scientific Data in 2016 — are now the reference framework funders use to judge whether a DMP’s sharing commitments are substantive rather than nominal. Each FAIR letter corresponds to a specific, checkable DMP section:

    | FAIR principle | DMP section it governs | What a reviewer checks for |
    |---|---|---|
    | Findable | Documentation and metadata | A persistent identifier (e.g. a DOI) and rich, indexed metadata assigned at deposit |
    | Accessible | Storage and sharing | A stated repository and access protocol, plus clear conditions where access is restricted |
    | Interoperable | Data types and formats | Use of standard, non-proprietary formats and controlled vocabularies rather than bespoke formats |
    | Reusable | Preservation and licensing | A clear usage licence, provenance information, and community data standards followed at deposit |

    This mapping is why a DMP written purely as a compliance checklist tends to fail review: a plan can name a repository (satisfying Accessible) while leaving metadata and licensing (Findable and Reusable) unaddressed, and a funder assessor trained on FAIR criteria will flag the gap.
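
    The gap described here can be made concrete with a small check. This is a rough Python sketch, not a funder's assessment procedure: the groupings follow the FAIR mapping above, but the field names and the `fair_gaps` helper are hypothetical.

```python
# Each FAIR letter -> the DMP fields a reviewer would expect to see filled in.
# Field names are hypothetical; the groupings follow the mapping above.
FAIR_REQUIREMENTS = {
    "Findable": ["persistent_identifier", "metadata_standard"],
    "Accessible": ["repository", "access_conditions"],
    "Interoperable": ["file_formats", "controlled_vocabulary"],
    "Reusable": ["licence", "provenance"],
}


def fair_gaps(dmp):
    """Map each FAIR letter to the fields that are missing or blank in the DMP."""
    gaps = {}
    for principle, required in FAIR_REQUIREMENTS.items():
        missing = [name for name in required if not dmp.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


if __name__ == "__main__":
    # A plan that names a repository but says nothing on metadata or licensing.
    draft = {
        "repository": "institutional repository",
        "access_conditions": "open on publication",
        "file_formats": "CSV",
        "controlled_vocabulary": "discipline thesaurus",
        "provenance": "processing log kept with the dataset",
    }
    print(fair_gaps(draft))
```

    For the draft shown, the check reports gaps under Findable and Reusable only, which is the pattern an assessor would be expected to flag.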

    Common questions about data management plans

    What is in a data management plan?

    A DMP typically sets out the types of data to be produced, the metadata standards used to describe them, the storage and backup arrangements during the project, the access and sharing policy, and the plan for long-term archiving so the data remains usable after the project ends.

    How do you write a data management plan?

    Start from the funder’s own template rather than a blank page, since UKRI, NIH, and Horizon Europe each specify required headings. Describe data types and volumes first, then storage, ethics, and sharing, and be explicit about what will not be shared and why — a stated exception is stronger than a silent gap.

    Do I need a data management plan?

    If the project is funded by a body with a research-data policy — which now includes most major UK, US, and EU funders — a DMP is mandatory at application stage, not optional. Institutions increasingly also require one for internally funded or unfunded projects that handle sensitive data, as a matter of good practice.

    What does a good data management plan look like?

    A strong DMP is specific rather than generic: it names an actual repository rather than “a suitable repository,” gives a realistic storage volume, and assigns a named person to each task. It is written to be checked against, not filed and forgotten — funders increasingly audit compliance with the plan they approved, not just its existence.
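
    One way to catch generic wording before submission is a simple text check. The sketch below is illustrative only: the phrase list is an assumption and not a funder criterion, and `generic_phrases` is a hypothetical helper.

```python
import re

# Placeholder wording that signals a generic rather than specific plan.
# The phrase list is an assumption for illustration; extend it to suit local guidance.
GENERIC_PATTERNS = [
    r"\ba suitable repository\b",
    r"\ban appropriate repository\b",
    r"\bto be (decided|determined|confirmed)\b",
    r"\btbc\b",
]


def generic_phrases(text):
    """Return the placeholder phrases found in a DMP draft, lower-cased, in pattern order."""
    found = []
    lowered = text.lower()
    for pattern in GENERIC_PATTERNS:
        found.extend(match.group(0) for match in re.finditer(pattern, lowered))
    return found


if __name__ == "__main__":
    draft = "Data will be deposited in a suitable repository. Volume: to be confirmed."
    print(generic_phrases(draft))
```

    A check like this only finds vague wording; whether the named repository, volume and owner are realistic still needs a human reviewer.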

    What this means for researchers and institutions

    Data management plans are shifting from a compliance formality to an operational document. NIH’s move to require a plan for essentially all data-generating awards, rather than a narrow subset, signals broadening rather than narrowing scrutiny. Horizon Europe’s mid-project update requirement means the document cannot be written once and ignored; it is checked against actual practice at reporting milestones.

    For institutions, this means DMP-writing guidance, repository access, and named data stewards are becoming a baseline service rather than a specialist offering — mirroring how research-administration functions increasingly treat authorship, funding acknowledgement, and data policy as connected obligations. For individual researchers, treating the DMP as a working document rather than a one-off application formality is now the defensible position across every major funder covered here.

    For funder-specific DMP templates and requirements, consult the relevant funder’s own guidance pages; for the broader compliance context these plans sit within, see CASRAI’s research administration resources and research-terminology dictionary.

  • Tri-Agency Research Data Management Policy 2026

    Canada’s Tri-Agency Research Data Management Policy requires postsecondary institutions and research hospitals that administer funds from CIHR, NSERC and SSHRC to publish an institutional research data management (RDM) strategy, to attach data management plans (DMPs) to specified grant applications, and to prepare for a phased-in data deposit requirement. Launched in March 2021, it is Canada’s first cross-agency RDM mandate.

    The Tri-Agency Research Data Management Policy is a joint funder mandate issued by the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Social Sciences and Humanities Research Council of Canada (SSHRC) that sets out institutional and researcher obligations for managing publicly funded research data.

    What is the Tri-Agency Research Data Management Policy?

    The policy was announced jointly by CIHR, NSERC and SSHRC on 18 March 2021, following consultation on a 2018 draft. Its stated objective is to support Canadian research excellence by promoting sound RDM and data stewardship practices across the postsecondary and hospital-based research sectors that receive federal funding.

    Unlike a single-agency requirement, it applies uniformly across all three funding councils, making it the first unified Canadian federal RDM mandate. The agencies chose incremental implementation rather than a single compliance date, phasing obligations in over several years to give institutions and researchers time to build capacity.

    What are the policy’s three pillars?

    The policy rests on three distinct requirements, each with its own timeline and audience. Together they move Canadian-funded research toward the FAIR principles (Findable, Accessible, Interoperable and Reusable), which SSHRC’s guidance for applicants directs researchers to follow where ethically and legally possible.

    • Institutional strategies: eligible institutions had to develop and publicly post an RDM strategy, then formally notify the agencies of completion by 1 March 2023.
    • Data management plans (DMPs): researchers applying to an initial, agency-specified set of funding opportunities must submit a DMP describing how project data will be collected, documented, stored and shared.
    • Data deposit: grant recipients must eventually deposit digital research data, metadata and code that directly support published conclusions into a recognised repository; the agencies are phasing this in based on the sector’s readiness rather than enforcing it on a fixed date.

    The policy also embeds Indigenous data governance explicitly: institutional strategies must recognise the data sovereignty of First Nations, Métis and Inuit communities, with SSHRC pointing applicants to the First Nations OCAP® principles and the CARE Principles for Indigenous Data Governance.

    How do CIHR, NSERC and SSHRC requirements differ?

    All three agencies operate under one shared policy text, but they do not require DMPs on the same grants. Each agency independently designates which of its own funding opportunities carry a mandatory DMP, published on the shared Science.gc.ca research data management page rather than in a single combined list.

    • CIHR has applied DMP requirements to specific health-research competitions, reflecting added sensitivity around personal health information and research ethics board obligations.
    • NSERC has targeted DMPs at select discovery and strategic programmes in the natural sciences and engineering.
    • SSHRC requires DMPs for designated social sciences and humanities opportunities and publishes the most detailed applicant-facing guidance, including a section-by-section drafting template covering data collection, documentation, storage, sharing, responsibilities and legal compliance.

    Institutions are expected to track which of their researchers’ target competitions are in scope, since the DMP obligation is opportunity-specific rather than blanket across every Tri-Agency grant.

    How does it compare with UKRI, NSF and Horizon Europe?

    Canada’s approach sits between the narrower, opportunity-specific model used historically in the UK and the near-universal mandates now standard in the United States and the European Union. The table below sets out the structural differences institutions moving between these funding systems need to track.

    | Framework | Steward | DMP scope | Data deposit approach |
    |---|---|---|---|
    | Tri-Agency RDM Policy | CIHR / NSERC / SSHRC (Canada) | Required only for agency-specified funding opportunities | Phased in based on sector readiness; not yet universal |
    | UKRI Common Principles on Data Policy | UK Research and Innovation, across its constituent councils | Expected for research councils such as MRC, NERC and EPSRC, per council-specific policy | Data expected to be made available and accessible at the point of publication |
    | NSF Data Management Plan requirement | US National Science Foundation | A DMP has been mandatory for every NSF proposal, across all directorates, since 2011 | Sharing plan required; no single universal deposit mandate across directorates |
    | Horizon Europe Model Grant Agreement | European Commission | DMP mandatory for participating projects, typically due by month 6 | Open access to research data “as open as possible, as closed as necessary” |

    The practical distinction for institutions with international collaborators is scope: NSF and Horizon Europe treat the DMP as a near-default project requirement, while the Tri-Agency policy and UKRI’s council-by-council approach both still gate the DMP requirement to specific competitions rather than every grant.

    What must institutions do to comply?

    Institutional research offices carry most of the compliance burden, since the policy places the strategy obligation on the institution rather than the individual researcher. Compliance work typically covers four areas.

    • Publish and maintain an institutional RDM strategy on a publicly accessible page, with a named contact for enquiries.
    • Build institutional capacity: training, data storage infrastructure, and support for researchers drafting DMPs, often via the DMP Assistant tool operated by the Digital Research Alliance of Canada.
    • Track which specific CIHR, NSERC and SSHRC funding opportunities carry a mandatory DMP so applicants are not caught unprepared at submission.
    • Prepare repository infrastructure and researcher guidance ahead of the phased data deposit requirement, including institutional or national options such as the Federated Research Data Repository.

    Institutions that have not yet published a strategy remain out of step with a requirement the agencies set for 1 March 2023, which is a governance gap research offices should treat as a priority remediation item.

    Frequently asked questions

    When did institutions have to publish their Tri-Agency RDM strategy?

    Institutions eligible to administer CIHR, NSERC or SSHRC funds were required to develop, publicly post and notify the agencies of their institutional RDM strategy by 1 March 2023. This is the only fixed compliance date within the otherwise incrementally phased policy.

    Does every Tri-Agency grant require a data management plan?

    No. Each agency designates its own initial set of funding opportunities that require a DMP at application; the requirement is not blanket across all CIHR, NSERC or SSHRC competitions, so applicants must check the specific programme guidelines before submitting.

    What do institutions need to know about FAIR data under the policy?

    SSHRC’s applicant guidance directs researchers to manage data, where ethically and legally possible, according to the FAIR principles — Findable, Accessible, Interoperable and Reusable — while explicitly noting that grant recipients are not required to openly share data if legal, ethical or Indigenous data sovereignty obligations prevent it.

    How does the policy treat Indigenous research data?

    Institutional strategies must recognise Indigenous data sovereignty, and SSHRC points applicants to the First Nations OCAP® principles and the CARE Principles for Indigenous Data Governance when data involves First Nations, Métis or Inuit communities and their collections.

    Implications and outlook

    For institutions, the Tri-Agency policy converts research data management from a discretionary practice into a governance obligation with a named public strategy, a training mandate and eventual deposit infrastructure requirements. Research offices that treat the 2023 strategy as a finished deliverable, rather than as a living document, risk falling behind as the agencies phase in data deposit.

    For researchers collaborating internationally, the comparison with UKRI, NSF and Horizon Europe matters operationally: a DMP built for an NSF-funded partner project, where a plan is mandatory for every proposal, will not automatically satisfy a Tri-Agency opportunity where the DMP requirement is competition-specific — and vice versa. Institutions running multi-funder projects should map DMP and deposit obligations per funder rather than assuming one plan transfers across systems.
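
    A per-funder map of this kind can start as a small lookup table. The Python below is a minimal sketch: the scope notes paraphrase the comparison earlier in this article, while the `dmp_required_by_default` flag and the `obligations_for` helper are hypothetical names, not funder terminology.

```python
# DMP scope per funder, paraphrased from the comparison earlier in this article.
# The boolean flag and the function name are illustrative, not funder terminology.
DMP_SCOPE = {
    "Tri-Agency": {
        "dmp_required_by_default": False,
        "note": "agency-specified funding opportunities only",
    },
    "UKRI": {
        "dmp_required_by_default": False,
        "note": "per council-specific policy",
    },
    "NSF": {
        "dmp_required_by_default": True,
        "note": "mandatory for every proposal since 2011",
    },
    "Horizon Europe": {
        "dmp_required_by_default": True,
        "note": "mandatory for participating projects, typically due by month 6",
    },
}


def obligations_for(funders):
    """Return one line per funder saying whether a DMP is assumed or must be checked."""
    lines = []
    for name in funders:
        scope = DMP_SCOPE[name]
        if scope["dmp_required_by_default"]:
            action = "DMP required"
        else:
            action = "check the specific opportunity"
        lines.append(f"{name}: {action} ({scope['note']})")
    return lines


if __name__ == "__main__":
    # Example: a project co-funded through NSF and a Tri-Agency competition.
    for line in obligations_for(["NSF", "Tri-Agency"]):
        print(line)
```

    Keeping the lookup per funder, rather than per project, makes it harder to assume that a plan accepted in one system satisfies another.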

    As Canada’s data deposit requirement moves from phased design toward implementation, institutions with mature repository infrastructure and clear researcher guidance will be better positioned than those still relying solely on their 2023 strategy document. This sits within the wider discipline of research administration, where funder RDM mandates increasingly intersect with data governance, ethics review and open-access policy.