Tag: FAIR

  • Reading the RDA DMP Common Standard v2 changelog

    The Research Data Alliance’s DMP Common Standard, originally published as v1 in 2019, has been the canonical schema for machine-actionable data management plans. The v2 revision, released early in 2026 after a two-year community consultation, reorganises the schema, tightens validation, and adds explicit support for software, samples, and instruments alongside data. This post is a field-by-field walkthrough for integrators.

    Why v2 was needed

    v1 was designed in the FAIR-data heyday and is fit for that purpose. Three forces produced the v2 revision. First, the scope of “things to be managed” expanded: software is increasingly part of the planning conversation, samples are increasingly recognised as research outputs that need governance, and instruments have their own management lifecycle. v1 could be coaxed into representing these, but the structure was uncomfortable. Second, validation: v1 was permissive in ways that mature standards usually tighten over time. Optional fields proliferated, the meaning of several fields was under-specified, and tools produced mutually incompatible serialisations while remaining nominally compliant. Third, FAIR-implementation maturity: v2 needed to carry the metadata that the FAIR assessment and registry ecosystem (FAIRsharing, F-UJI, FAIR Implementation Profiles) had converged on requiring.

    Structural changes

    The top-level structure of v2 is reorganised. v1 had dmp as the root with sub-resources for dataset, contact, contributor, project, cost. v2 keeps dmp as the root but moves to a managed_resources array that contains heterogeneous resource types: dataset, software, sample, instrument, other. Each resource type has its own schema with shared common fields (identifier, title, description, distribution, license, contributors with CRediT roles) and type-specific fields.

    This is the biggest structural change and the one that will require migration effort. v1 DMPs with only datasets can be projected into v2 cleanly; v1 DMPs that used the dataset structure to represent software or samples (the common workaround) will need restructuring on migration.
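
    As a rough illustration of the reorganised structure, the sketch below models a v2-style DMP as a plain Python dictionary and groups its managed resources by type. The resource-type names come from this post; the exact key spelling and nesting (`managed_resources`, `type`, `title`) and all sample values are assumptions, not a copy of the published schema.

```python
from collections import defaultdict

# Hypothetical v2-style DMP; key names and values are illustrative only.
dmp_v2 = {
    "dmp": {
        "title": "Example project DMP",
        "managed_resources": [
            {"type": "dataset", "title": "Survey responses"},
            {"type": "software", "title": "Cleaning pipeline"},
            {"type": "sample", "title": "Core sample batch A"},
            {"type": "dataset", "title": "Interview transcripts"},
        ],
    }
}

# Resource types named in the post.
RESOURCE_TYPES = {"dataset", "software", "sample", "instrument", "other"}


def group_by_type(dmp):
    """Return {resource type: [titles]}, rejecting unknown types."""
    grouped = defaultdict(list)
    for resource in dmp["dmp"]["managed_resources"]:
        if resource["type"] not in RESOURCE_TYPES:
            raise ValueError(f"unknown resource type: {resource['type']!r}")
        grouped[resource["type"]].append(resource["title"])
    return dict(grouped)


grouped = group_by_type(dmp_v2)
print(grouped)
```

    A v1 DMP that only ever described datasets would land entirely under the `dataset` key here, which is why that case projects cleanly.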

    Identifier requirements

    v2 tightens identifier requirements. Every managed resource must declare its identifier with a type (DOI, Handle, URL, ARK, IGSN for samples, RAiD for projects); the identifier validation runs against the type. The contributor structure requires ORCID iDs for individuals; the affiliation structure requires ROR IDs for organisations. The project structure requires either a RAiD or a Funder Registry grant identifier or both.

    The validation tightening is the integration-relevant change. v1 tools that accepted arbitrary, untyped strings in identifier fields will produce DMPs that fail v2 validation. Integrators should review their PID-handling logic before migrating.
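
    To make "validation runs against the type" concrete, here is a minimal sketch of type-aware identifier checks for three of the identifier types mentioned above. The function names and the lower-case type labels are hypothetical. The DOI and ROR patterns are simplified shape checks rather than the validators a v2 toolchain would necessarily use; the ORCID check uses the ISO 7064 MOD 11-2 check character that ORCID iDs carry.

```python
import re

# Simplified shape checks; production validators may be stricter.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ROR_RE = re.compile(r"^0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_checksum_ok(orcid):
    """Verify the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected


def validate_identifier(id_type, value):
    """Return True if `value` has the expected shape for `id_type`."""
    if id_type == "doi":
        return bool(DOI_RE.match(value))
    if id_type == "ror":
        return bool(ROR_RE.match(value))
    if id_type == "orcid":
        return bool(ORCID_RE.match(value)) and orcid_checksum_ok(value)
    raise ValueError(f"no validator sketched for type {id_type!r}")


print(validate_identifier("orcid", "0000-0002-1825-0097"))
```

    A v1 exporter that wrote a free-text value into an identifier field would fail a check of this kind, which is the failure mode integrators should look for before migrating.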

    The CRediT integration

    v2 carries CRediT roles natively in the contributor structure. Each contributor on a DMP can be assigned one or more CRediT roles with the degree-of-contribution qualifier. The expected pattern is that DMP contributors (the people who will manage the data) get roles like Data curation, Resources, Project administration, Supervision, with the actual research contributors getting their roles assigned later via the publication or dataset metadata. The DMP captures the data-management contributorship, not the research contributorship.
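
    A small sketch of what checking a contributor entry might involve. The fourteen role names are the CRediT taxonomy; the degree values (lead, equal, supporting) are the CRediT degree-of-contribution terms. The record layout (`roles`, `role`, `degree`) and the function name are assumptions made for illustration.

```python
# The fourteen CRediT roles.
CREDIT_ROLES = {
    "Conceptualization", "Data curation", "Formal analysis",
    "Funding acquisition", "Investigation", "Methodology",
    "Project administration", "Resources", "Software", "Supervision",
    "Validation", "Visualization", "Writing – original draft",
    "Writing – review & editing",
}
DEGREES = {"lead", "equal", "supporting"}


def contributor_problems(contributor):
    """Return a list of human-readable problems with a contributor entry."""
    problems = []
    roles = contributor.get("roles", [])
    if not roles:
        problems.append("no roles assigned")
    for entry in roles:
        if entry["role"] not in CREDIT_ROLES:
            problems.append(f"unknown CRediT role: {entry['role']}")
        degree = entry.get("degree")
        if degree is not None and degree not in DEGREES:
            problems.append(f"unknown degree: {degree}")
    return problems


# Hypothetical data steward with typical data-management roles.
steward = {
    "name": "A. Steward",
    "roles": [
        {"role": "Data curation", "degree": "lead"},
        {"role": "Supervision", "degree": "supporting"},
    ],
}
print(contributor_problems(steward))
```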

    FAIR alignment

    v2 includes a fair_assessment structure that aligns with the FAIR Implementation Profile framework. Each managed resource can carry an FIP reference, and the DMP-level fair_assessment can declare which FAIRsharing-registered standards, identifiers, and policies the resources conform to. This is the bridge between DMP-as-planning-document and DMP-as-FAIR-compliance-record.

    The cost structure

    v2 reworks the cost structure. v1 had a flat cost array with title, description, value, currency. v2 categorises costs by phase (acquisition, processing, storage-active, storage-preservation, dissemination, decommissioning) and by recurrence (one-off versus annual), which lets a DMP integrate with funder budget structures and institutional cost-recovery models. The cost structure is optional but recommended.
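
    The phase-and-recurrence split is what makes budget roll-ups mechanical. The sketch below totals costs per phase over a project duration, multiplying annual items by the number of years. The phase names are those listed above; the key names (`phase`, `recurrence`, `value`, `currency`), the recurrence labels, and the figures are assumptions, and the sketch deliberately refuses to mix currencies.

```python
from collections import defaultdict

PHASES = {
    "acquisition", "processing", "storage-active",
    "storage-preservation", "dissemination", "decommissioning",
}

# Hypothetical cost entries for a single DMP.
costs = [
    {"phase": "storage-active", "recurrence": "annual", "value": 1200, "currency": "EUR"},
    {"phase": "storage-preservation", "recurrence": "annual", "value": 300, "currency": "EUR"},
    {"phase": "acquisition", "recurrence": "one-off", "value": 5000, "currency": "EUR"},
]


def total_by_phase(costs, years):
    """Sum costs per phase; annual items are counted once per year."""
    currencies = {c["currency"] for c in costs}
    if len(currencies) > 1:
        raise ValueError(f"mixed currencies: {sorted(currencies)}")
    totals = defaultdict(int)
    for c in costs:
        if c["phase"] not in PHASES:
            raise ValueError(f"unknown phase: {c['phase']!r}")
        multiplier = years if c["recurrence"] == "annual" else 1
        totals[c["phase"]] += c["value"] * multiplier
    return dict(totals)


totals = total_by_phase(costs, years=3)
print(totals)
```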

    Lifecycle status

    v2 introduces an explicit lifecycle field on each managed resource, with controlled values covering planning, acquisition, processing, analysis, dissemination, preservation, decommissioning. A DMP can describe resources at different lifecycle stages, which lets the same DMP serve as both a project-launch plan and a project-completion record. This addresses a v1 friction where DMPs were either pre-project plans or post-project records and the same document could not serve both purposes cleanly.
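
    One way to see the "plan and record in one document" point is to ask how far apart the resources in a DMP sit in the lifecycle. The sketch below assumes the controlled values are ordered as listed above and that each resource carries a `lifecycle` key; both are assumptions for illustration.

```python
# Lifecycle values in the order the post lists them (ordering is assumed).
LIFECYCLE = [
    "planning", "acquisition", "processing", "analysis",
    "dissemination", "preservation", "decommissioning",
]


def lifecycle_span(resources):
    """Return the (earliest, latest) lifecycle stage present in a DMP.

    Raises ValueError if a resource uses a value outside the list.
    """
    indices = [LIFECYCLE.index(r["lifecycle"]) for r in resources]
    return LIFECYCLE[min(indices)], LIFECYCLE[max(indices)]


# Hypothetical DMP mixing a not-yet-collected dataset with an archived one.
resources = [
    {"title": "Wave 2 survey", "lifecycle": "planning"},
    {"title": "Wave 1 survey", "lifecycle": "preservation"},
    {"title": "Analysis scripts", "lifecycle": "analysis"},
]
span = lifecycle_span(resources)
print(span)
```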

    Migration path

    RDA’s recommendation is that v1 DMPs remain valid through 2027 and that v2 DMPs become the preferred format from 2026. Tools producing or consuming DMPs should support both formats during the transition window. The official RDA migration guide covers the field mappings; the CASRAI maDMP domain tracks tool-vendor support as it rolls out.

    Specific migration considerations: v1 datasets representing software should be re-categorised as v2 software resources; v1 datasets representing samples should be re-categorised as v2 sample resources; v1 contributors without ORCID iDs need ORCID iDs added before v2 validation will pass; v1 organisational affiliations need ROR IDs.
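
    The considerations above can be sketched as a small projection function. This is not the official RDA mapping: the v1 side assumes the familiar `dataset` and `contributor` arrays with a typed `contributor_id`, the v2 side uses the hypothetical `managed_resources` key, and re-categorisation relies on hints supplied by the caller, because the schema alone cannot tell which v1 "datasets" were really software or samples. A ROR check on affiliations would follow the same pattern as the ORCID check shown here.

```python
def migrate_v1_to_v2(v1, type_hints=None):
    """Project a v1-style DMP into a v2-style dict and list validation blockers.

    `type_hints` maps a dataset title to its intended v2 resource type;
    anything without a hint stays a dataset.
    """
    type_hints = type_hints or {}
    v1_dmp = v1["dmp"]

    resources = []
    for ds in v1_dmp.get("dataset", []):
        resource = dict(ds)  # shallow copy so the v1 input is left untouched
        resource["type"] = type_hints.get(ds["title"], "dataset")
        resources.append(resource)

    blockers = []
    for person in v1_dmp.get("contributor", []):
        pid = person.get("contributor_id", {})
        if pid.get("type") != "orcid" or not pid.get("identifier"):
            blockers.append(f"contributor {person.get('name', '?')} lacks an ORCID iD")

    # Carry every other top-level field across unchanged.
    v2_dmp = {k: v for k, v in v1_dmp.items() if k != "dataset"}
    v2_dmp["managed_resources"] = resources
    return {"dmp": v2_dmp}, blockers


# Hypothetical v1 input using the dataset-as-software workaround.
v1 = {
    "dmp": {
        "title": "Example",
        "dataset": [{"title": "Field measurements"}, {"title": "Model code"}],
        "contributor": [{"name": "B. Curator", "contributor_id": {}}],
    }
}
v2, blockers = migrate_v1_to_v2(v1, type_hints={"Model code": "software"})
print(blockers)
```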

    Tool support

    By the v2 release, the major DMP tools (DMPonline, DMPTool, easyDMP, ARGOS) had announced support timelines. DMPonline and DMPTool committed to v2 export support in mid-2026 with import support later in the year. easyDMP shipped v2 support at the v2 release. ARGOS has v2 support in beta. Institutional and funder DMP services built on these tools inherit their support timelines.

    The funder side of the integration is also moving. Several major funders had been ingesting v1 DMPs in machine-readable form for compliance tracking; the v2 transition gives them an opportunity to tighten ingestion validation and to use the richer FAIR-assessment structure. The CASRAI funder maDMP guide walks through the funder-side migration.

    What this enables

    The longer-term value of v2 is in the queries it enables across machine-readable DMPs at scale. A funder can ask what fraction of its funded projects starting in 2026 declared FAIR-compliant deposit plans for their datasets, software, and samples, and get a structured answer. An institution can ask which projects on its books have DMPs that declare preservation costs, and what its aggregate preservation-cost commitment is, and get a structured answer. v1 made these queries notionally possible; v2 makes them reliable.
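
    A sketch of the funder-style query, under loudly stated assumptions: each DMP is a flat dictionary with a `project_start` date string and a `managed_resources` list, and a resource counts as having a FAIR-compliant deposit plan when it carries a non-empty `fip_reference`. None of these key names is taken from the published schema.

```python
from collections import Counter


def fip_coverage(dmps, start_year):
    """Fraction of resources, per type, that declare a FIP reference,
    restricted to DMPs whose project starts in `start_year`."""
    declared, seen = Counter(), Counter()
    for dmp in dmps:
        if dmp["project_start"][:4] != str(start_year):
            continue
        for resource in dmp["managed_resources"]:
            seen[resource["type"]] += 1
            if resource.get("fip_reference"):
                declared[resource["type"]] += 1
    return {rtype: declared[rtype] / seen[rtype] for rtype in seen}


# Hypothetical portfolio of three DMPs.
portfolio = [
    {"project_start": "2026-03-01", "managed_resources": [
        {"type": "dataset", "fip_reference": "fip:example-1"},
        {"type": "software"},
    ]},
    {"project_start": "2026-09-15", "managed_resources": [
        {"type": "dataset"},
    ]},
    {"project_start": "2025-01-10", "managed_resources": [
        {"type": "dataset", "fip_reference": "fip:example-2"},
    ]},
]
coverage = fip_coverage(portfolio, 2026)
print(coverage)
```

    The query is only as reliable as the typing of the underlying resources, which is the practical argument for the stricter v2 validation.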

    For research administration, the practical posture in 2026 is to follow the funder migration. Where funders require v2, migrate; where they still accept v1, produce both. The migration is one-way (a v2 DMP cannot be cleanly downgraded to v1 if the new resource types are used) but the v1-to-v2 path is well-supported.

  • EOSC Federation governance: what changed in 2026

    The European Open Science Cloud (EOSC) entered its second formal phase at the start of 2026 with a substantially revised governance and funding model. The transition from the EOSC Association-led implementation phase (2021-2025) to the EOSC Federation operating model (2026 onwards) is the most significant infrastructure-governance change in EU open science since the FAIR principles were articulated. This post walks through what changed, who is affected, and how integrators should reposition.

    What EOSC was, and what it is becoming

    EOSC was conceived in 2016 as a federated cloud-based infrastructure providing researchers across the EU and associated countries with access to data, services, and computing for open science. The 2018 implementation phase, the 2021 launch of the EOSC Association, and the cascade of EU-funded projects (EOSC Future, EOSC-Pillar, FAIRsFAIR, EOSC Synergy, and many others) built out the technical layer: the EOSC Portal, the EOSC Marketplace, the AAI federation, the metadata aggregation via OpenAIRE, the persistent-identifier infrastructure connections.

    The challenge that has dogged EOSC since its inception is the gap between the technical layer and the operational sustainability layer. EU project funding built impressive infrastructure on time-limited grant terms; what happens after the grants run out has been the open question. The 2024-2025 EOSC strategic review concluded that the project-grant model was not sustainable and that EOSC needed a federated operating model funded by recurring contributions from member states and associated members.

    The federation model

    The 2026 EOSC Federation is structured as a tiered membership organisation. Core nodes are member-state-designated infrastructures providing strategic services (data repositories of national scale, compute infrastructure, identity providers, metadata aggregators). Core-node operation is co-funded by the home member state and by EOSC central funds; the criteria for core-node status include FAIR data implementation, sustainability commitments, and federation-API conformance.

    Federated nodes are participating infrastructures meeting a lighter set of conformance criteria; they integrate with the EOSC federation via documented APIs and contribute to the federated graph but are not core-funded. Most existing EOSC-listed services move into federated-node status by default.

    Affiliated services are third-party infrastructures (commercial cloud providers, international partners, non-EU regional clouds) that participate via specific agreements without being members of the federation governance.

    The governance structure has three tiers: a Strategic Council of member-state representatives setting policy; an Operational Board of core-node operators making day-to-day decisions; a Stakeholder Forum of federated nodes, researchers, and user-community representatives providing input.

    What changed for technical integrators

    Three concrete changes matter for technical integrators. First, the EOSC Interoperability Framework v2 ships with the federation launch. It refines the v1 framework with tightened metadata-quality requirements, mandatory ROR organisational IDs for service providers, and standardised provenance metadata for data-services interactions. The CASRAI EOSC IF entry has been updated.

    Second, the EOSC AAI consolidation moves the federation onto a unified authentication-and-authorisation infrastructure based on eduGAIN, MyAccessID, and ORCID. The previous patchwork of identity providers, with several parallel AAI implementations across projects, is being merged. Service providers integrating with EOSC need to support the consolidated AAI; legacy integrations have a 24-month transition window.

    Third, the persistent-identifier requirements have been tightened. Core-node services must issue or accept persistent identifiers (DOI, Handle, ORCID, ROR, RAiD as appropriate) for all federated artefacts; metadata records without PIDs are no longer ingested into the EOSC Graph. This brings EOSC into line with what OpenAIRE Graph has long been recommending; in practice many services already comply.
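
    For a repository checking its own readiness, the PID rule amounts to a gate in front of the export. The sketch below partitions metadata records into those that would pass such a gate and those that would not. The record layout (`pid` with `type` and `value`) and the function name are hypothetical; this does not call or model any actual EOSC ingestion API.

```python
# PID types named in the post; lower-case labels are an assumption.
ACCEPTED_PID_TYPES = {"doi", "handle", "orcid", "ror", "raid"}


def partition_for_ingest(records):
    """Split records into (accepted, rejected) by presence of a typed PID."""
    accepted, rejected = [], []
    for record in records:
        pid = record.get("pid") or {}
        if pid.get("type") in ACCEPTED_PID_TYPES and pid.get("value"):
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected


# Hypothetical export batch.
batch = [
    {"title": "Ocean temperature series", "pid": {"type": "doi", "value": "10.1234/example"}},
    {"title": "Legacy upload", "pid": None},
    {"title": "Internal link only", "pid": {"type": "url", "value": "https://example.org/x"}},
]
accepted, rejected = partition_for_ingest(batch)
print(len(accepted), len(rejected))
```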

    What changed for researchers

    The visible changes for researchers are modest in 2026 and will accumulate through 2027-2028. The EOSC Portal remains the primary discovery surface; the EOSC Marketplace continues to list services with the new conformance tiers visible. The single sign-on becomes more reliable as the AAI consolidation reduces the friction across services.

    The deeper change is in the funder-mandate side. Several Horizon Europe calls in 2026 require EOSC-conformant data deposit (deposit in a core-node or federated-node repository, with FAIR metadata, with PID assignment). This raises the floor: research data from these projects must land somewhere that is part of the EOSC federation, with all the metadata-quality implications. The CASRAI institutional EOSC guide walks through the deposit options by discipline.

    The sustainability question

    The federation’s financial model is its hardest unresolved problem. Member-state contributions, EU central funding, and a small fee component from affiliated services are the three pillars. The 2026 budget is set; the 2027 and beyond budgets depend on the Multiannual Financial Framework negotiations that will roll through 2026 and 2027.

    The risk to the federation is a return to project-grant precarity: if member-state contributions falter, core-node operations become dependent on rolling EU-project funding again, and the long-term sustainability gain is illusory. The EOSC Association has been clear that member-state buy-in is the critical lever; the institutions and researchers who use EOSC infrastructure should be engaging their national funder representatives on the federation contribution case.

    What integrators should do in 2026

    For repository managers, the immediate priorities are: verify your service’s status (core node, federated node, affiliated, none); align your metadata to EOSC IF v2; ensure PID assignment is complete for federated artefacts; migrate to the consolidated AAI within the 24-month window.

    For institutional CRIS systems and CRIS vendors, the priorities are: ingest EOSC-Portal service metadata into the CRIS service-catalogue layer; support EOSC-conformant deposit workflows from the CRIS to federated repositories; surface EOSC-mandate compliance in the CRIS reporting layer. The CASRAI CRIS integration guide has been updated.

    For publishers, the touchpoint is data-availability and code-availability metadata. A paper depositing supporting data in an EOSC federated repository should declare so in its metadata; the EOSC Graph then ingests the linkage. Publishers should be updating their submission systems to capture EOSC-repository deposit identifiers.

    The international dimension

    EOSC has always sat in relation to the broader international open-science infrastructure: the African Open Science Platform, the LA Referencia network in Latin America, OpenAIRE-Nexus, the Asia-Pacific GRDi work. The federation governance explicitly includes a framework for international affiliation, and several non-EU national infrastructures have begun affiliation discussions. The medium-term direction, five to ten years out, is a loosely federated international open-science cloud with EOSC as one of several regional implementations.

    For institutions outside the EU but with significant EU collaboration, the practical implication is that EOSC conformance is becoming a useful target even where it is not mandated. Aligning with EOSC IF v2 and the related FAIR-data infrastructure makes EU collaboration smoother and positions for the broader international federation as it emerges.

  • FAIR data assessment frameworks: a buyer’s guide for institutions

    The FAIR Principles (Findable, Accessible, Interoperable, Reusable) were published in 2016 and have become the dominant framework for talking about research-data quality. The harder problem – measuring whether a particular dataset, repository, or institutional output is actually FAIR – has produced a small ecosystem of assessment frameworks. In 2026 there are five we recommend institutions consider, and they answer slightly different questions. This post is the practical buyer’s guide.

    The five frameworks that matter

    The frameworks differ along two axes: what they assess (a single dataset, a repository, an institution’s overall position) and how they assess (automated against metadata, structured self-assessment, third-party audit). A well-equipped institution uses two or three of them for different purposes.

    RDA FAIR Data Maturity Model

    The RDA FAIR Data Maturity Model, finalised by the RDA working group in 2020, is the canonical indicator framework. It defines 41 indicators across the four FAIR pillars, each assigned one of three priority levels (essential, important, useful). It does not prescribe the assessment method; it provides the rubric.

    The Maturity Model is the most-cited framework in funder documents and is the de-facto common reference for FAIR assessment. Its strength is interoperability: a tool that calculates against the RDA indicators produces results comparable across institutions and disciplines. Its weakness is that the indicators are abstract; turning them into operational checks requires a tool.

    F-UJI

    F-UJI (FAIRsFAIR Research Data Object Assessment Service), developed by FAIRsFAIR and PANGAEA, is the most-used automated assessment tool. F-UJI takes a single dataset identifier (typically a DOI), retrieves its metadata, and runs a battery of automated checks against metrics derived from the RDA Maturity Model indicators. It produces a numeric FAIR score and a detailed report.

    F-UJI is well suited to dataset-level assessment because it fetches and tests the live metadata. It catches real failures (missing licence, missing schema declaration, dead landing-page links) that self-assessment tools miss. Its limits are also real: it can only check what is machine-discoverable, so a dataset can score well on F-UJI and still be unusable in practice if the documentation is poor. As of 2026 F-UJI is available as a hosted service and as a self-deployable container.

    FAIR-Aware

    FAIR-Aware, developed by DANS, is a structured self-assessment tool aimed at researchers who are about to deposit a dataset. It asks ten questions about the dataset’s intended preparation and produces guidance on which FAIR principles are being met and which need work. FAIR-Aware is pedagogical rather than evaluative: its purpose is to nudge depositors into thinking about FAIR before they deposit, not to score them afterwards.

    FAIR-Aware is the right tool when an institution is trying to improve deposit quality and researcher FAIR literacy. It is the wrong tool when the question is “how FAIR are our existing holdings.”

    CESSDA self-assessment

    The CESSDA FAIR self-assessment, oriented toward social-science data archives, is closer to a structured repository audit. It asks the repository to evidence its compliance against a published framework that maps to the RDA indicators. CESSDA is interesting because it is discipline-specific: it knows that social-science data has particular consent, sensitivity, and harmonisation patterns and asks questions sensitive to those.

    ARDC FAIR Framework

    The Australian Research Data Commons FAIR framework, developed for the Australian context with funder backing, includes a self-assessment, a checklist for repository operators, and a benchmark for institutional services. The ARDC framework’s strength is that it has been operationalised at scale across Australian universities; its principles translate well but its administrative artefacts are Australia-specific.

    The institutional decision tree

    A reasonable institutional approach in 2026 looks like this:

    1. For overall institutional position: use the RDA Maturity Model as the reference framework. Cite it in policy documents, training materials, and funder reports. It is the common language.
    2. For deposit-time researcher guidance: deploy FAIR-Aware (or your repository’s built-in equivalent) at the deposit interface. The goal is researcher behaviour change, not measurement.
    3. For periodic dataset-quality auditing: run F-UJI against a representative sample of your repository holdings on a quarterly cycle. Use the results to drive metadata quality improvements at the repository level.
    4. For repository certification: pursue CoreTrustSeal certification for any repository whose data underlies cited research outputs. CoreTrustSeal is more rigorous and more useful externally than the FAIR self-assessments.
    5. For discipline-specific work: layer the relevant disciplinary framework (CESSDA for social sciences, the FAIRsharing community standards for life sciences, etc.) on top of the generic frameworks.

    What the frameworks miss

    The frameworks all do well at the F (Findable) and A (Accessible) pillars because these map well to machine-checkable metadata. They do less well at I (Interoperable) and R (Reusable) because interoperability and reusability depend on context that automated tools cannot evaluate (is the data dictionary actually meaningful? do the controlled vocabularies match the relevant community standards? would another researcher in this field find the documentation sufficient?).

    The mitigation is to pair automated assessment with structured peer review of high-value datasets. F-UJI tells you the metadata is well-formed; a peer review tells you the data is actually useful. The institutional practice we have seen working is to run F-UJI quarterly and to commission peer-data-reviews of the institutionally-flagged “strategically important” datasets annually.

    FAIR assessment and CoreTrustSeal

    CoreTrustSeal certification is the closest thing to an external audit of a repository’s trustworthiness. It is more rigorous than FAIR self-assessment because it requires a substantive submission and an external review. It covers governance, sustainability, technical infrastructure, data quality, and discoverability. By 2026 most major institutional and disciplinary repositories are CoreTrustSeal-certified; the certification is increasingly required in funder data-management requirements.

    CoreTrustSeal and FAIR assessment are complementary. CoreTrustSeal asks “is this repository a trustworthy place to deposit data?”; FAIR assessment asks “are the datasets in this repository FAIR?” An institution should be able to answer yes to both.

    Integration with the institutional CRIS

    The 2024-2025 development that has changed institutional practice is the integration of FAIR assessment into the institutional CRIS. Pure, Elements, Converis, and DSpace-CRIS now ship modules that show FAIR scores for each dataset record, computed from the underlying repository deposit. The institutional dashboard can then aggregate (overall FAIR score by department, by year, by funder), spot drift, and flag low-scoring records for improvement.

    The pattern that works is: institutional repository (DSpace, Figshare, Dataverse) exposes dataset metadata; CRIS pulls the metadata daily; F-UJI runs against new and updated records; FAIR score is written back to the CRIS record; institutional dashboards consume the score. The total effort is moderate (a sprint of integration work) and the resulting visibility is genuinely useful.
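
    The write-back and dashboard steps of that pattern can be sketched as follows. The assessor is passed in as a plain callable so the sketch stays runnable without network access; in a real deployment it would wrap a call to an F-UJI instance, which is not shown here. The record fields (`doi`, `department`, `fair_score`, `updated`), the function names, the DOIs, and the scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def refresh_scores(cris_records, assess):
    """Re-score records that are new (no score yet) or flagged as updated.

    `assess` is any callable mapping an identifier to a numeric score.
    """
    for record in cris_records:
        if record.get("fair_score") is None or record.get("updated"):
            record["fair_score"] = assess(record["doi"])
            record["updated"] = False
    return cris_records


def score_by_department(cris_records):
    """Average FAIR score per department, for the dashboard layer."""
    buckets = defaultdict(list)
    for record in cris_records:
        buckets[record["department"]].append(record["fair_score"])
    return {dept: mean(scores) for dept, scores in buckets.items()}


# Stub assessor standing in for a real assessment service.
stub_scores = {"10.1234/a": 80, "10.1234/b": 70}

records = [
    {"doi": "10.1234/a", "department": "Physics", "fair_score": None, "updated": False},
    {"doi": "10.1234/b", "department": "Physics", "fair_score": 60, "updated": True},
    {"doi": "10.1234/c", "department": "History", "fair_score": 50, "updated": False},
]
refresh_scores(records, stub_scores.__getitem__)
dashboard = score_by_department(records)
print(dashboard)
```

    Keeping the assessor injectable also makes it straightforward to swap tools later without touching the CRIS-side aggregation.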

    The funder-mandate angle

    FAIR assessment is increasingly cited explicitly in funder mandates. Horizon Europe requires FAIR data management; NIH’s 2023 DMS policy uses FAIR language; UKRI references FAIR in its open-research statement. The mandates rarely specify how FAIR is to be assessed, which gives institutions latitude but also creates ambiguity at audit time.

    Our recommendation to institutions is to declare in your data policy which framework you use (the RDA Maturity Model is the safe choice as the common reference), which tooling you operate (F-UJI for automation, FAIR-Aware for researcher guidance, CoreTrustSeal for repository certification), and what your service-level commitment is to researchers depositing data. The reporting back to funders then has a documented basis.

    What to watch in 2026-2027

    The convergence work to watch is the FAIR Implementation Profiles (FIPs) approach, in which a community or institution declares its specific choices for each FAIR principle (which PID system, which metadata schema, which licence, which controlled vocabulary). FIPs are being piloted across EOSC and are likely to become the operational layer between the abstract FAIR principles and the per-dataset assessment. By 2027 we expect FAIR assessment tools to consume FIPs as configuration: “assess this dataset against the GO FAIR Life Sciences FIP” will be a meaningful operation.
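
    To illustrate what "FIPs as configuration" could mean in practice, the sketch below treats a profile as a mapping from a choice slot to the set of options a community has declared, and reports the slots where a dataset's metadata falls outside them. Real FIPs are richer than this; the slot names, the profile contents, and the function name are simplifying assumptions.

```python
# Hypothetical, heavily simplified profile: slot -> declared choices.
example_fip = {
    "pid_system": {"doi"},
    "metadata_schema": {"datacite"},
    "licence": {"CC-BY-4.0", "CC0-1.0"},
}


def assess_against_fip(dataset, fip):
    """Return the sorted profile slots where the dataset does not match
    any declared choice (a missing value counts as a mismatch)."""
    return sorted(
        slot for slot, allowed in fip.items()
        if dataset.get(slot) not in allowed
    )


# Hypothetical dataset metadata.
dataset = {
    "pid_system": "doi",
    "metadata_schema": "dublin-core",
    "licence": "CC-BY-4.0",
}
mismatches = assess_against_fip(dataset, example_fip)
print(mismatches)
```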

    References

    Wilkinson et al., The FAIR Guiding Principles for scientific data management and stewardship (Scientific Data, 2016).
    RDA FAIR Data Maturity Model Working Group, final report (2020).
    Devaraju and Huber, F-UJI: An automated FAIR data assessment tool (FAIRsFAIR / PANGAEA, 2021).
    CoreTrustSeal Board, Trustworthy Data Repositories Requirements (current version).
    DANS, FAIR-Aware tool documentation.