v2026.1714 entries · CC-BY 4.0

Implementation checklist · Track A

Implementing the Research outputs (expanded) vocabulary

Repository managers, journal editorial offices, CRIS administrators, and researcher-profile-platform leads working with the expanded modern outputs taxonomy.

When to apply: when the institution's output taxonomy needs to cover preprints, datasets, software, protocols, models, multimedia, and registered reports as first-class outputs, not as edge cases of the article record.

Before you start

Prerequisites

What needs to be in place before you operationalise Research outputs (expanded) terminology in your CRIS or repository.

  • A CRIS or repository capable of holding multiple Output sub-types with distinct metadata profiles
  • Familiarity with the COAR Controlled Vocabulary for Resource Types as a baseline alignment target
  • A DOI minting strategy that covers both Crossref (articles, preprints) and DataCite (datasets, software, other outputs)
  • A position on preprints — first-class outputs, with the linked published version as a relatedIdentifier (not a replacement)
  • A protocols.io or equivalent integration if the institution emphasises protocol publication

Deployment

Five steps to deploy

Each step is small enough to land in a single sprint or a single sitting with the relevant CRIS administrator. Follow in order.

  1. Adopt the COAR Controlled Vocabulary for Resource Types as the master output-type list

    COAR is the federation-friendly standard; map the local list to COAR codes even if your CRIS UI exposes simpler labels. This is what OpenAIRE, DataCite, and discovery layers expect.
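
    A minimal sketch of the local-to-COAR mapping described in this step. The local labels ("Article", "Data", "Code") are hypothetical, and the concept URIs are written from memory of the COAR Resource Types vocabulary, so verify each one against the published vocabulary before loading it into a CRIS.

```python
# Hypothetical local labels mapped to COAR resource-type concepts.
# ASSUMPTION: the URIs below should be checked against the published
# COAR Resource Types vocabulary before production use.
COAR_BASE = "http://purl.org/coar/resource_type/"

LOCAL_TO_COAR = {
    "article": ("journal article", COAR_BASE + "c_6501"),
    "preprint": ("preprint", COAR_BASE + "c_816b"),
    "data": ("dataset", COAR_BASE + "c_ddb1"),
    "code": ("software", COAR_BASE + "c_5ce6"),
    "video": ("video", COAR_BASE + "c_12ce"),
}
COAR_OTHER = ("other", COAR_BASE + "c_1843")


def _key(label: str) -> str:
    """Normalise a UI label so 'Preprint' and ' preprint ' match."""
    return label.strip().casefold()


def to_coar(local_label: str) -> tuple[str, str]:
    """Return (COAR label, COAR URI); unmapped labels fall back to 'other'."""
    return LOCAL_TO_COAR.get(_key(local_label), COAR_OTHER)


def unmapped(labels) -> list[str]:
    """List local labels that still need a COAR mapping decision."""
    return sorted({label for label in labels if _key(label) not in LOCAL_TO_COAR})
```

    Running `unmapped` over the existing local type list gives a worklist of labels that would otherwise federate as "other".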

  2. Define metadata sub-profiles per output type

    Article and preprint share most fields; dataset adds access classification and host; software adds repository URL, license, version, Software Heritage ID; protocol adds protocols.io DOI and step list; model adds the AI-ML-research-outputs sub-profile. Avoid forcing every sub-type into the article schema.
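
    One way to express per-type sub-profiles is a shared core plus type-specific required fields. The sketch below is illustrative only: the field names (`access_classification`, `swhid`, and so on) and the contents of the shared core are assumptions, not a published schema.

```python
# ASSUMPTION: field names and the shared core are illustrative, not a standard.
COMMON_FIELDS = {"title", "creators", "publication_date"}

PROFILE_EXTRAS = {
    "journal article": {"doi"},
    "preprint": {"doi"},
    "dataset": {"access_classification", "host"},
    "software": {"repository_url", "license", "version", "swhid"},
    "protocol": {"protocols_io_doi", "steps"},
}


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty for this record's type."""
    output_type = record.get("output_type")
    if output_type not in PROFILE_EXTRAS:
        raise ValueError(f"no metadata profile for output type {output_type!r}")
    required = COMMON_FIELDS | PROFILE_EXTRAS[output_type]
    return sorted(name for name in required if not record.get(name))


def accept_deposit(record: dict) -> bool:
    """Refuse (return False) rather than warn when required fields are missing."""
    return not missing_fields(record)
```

    Keeping the extras separate from the shared core means a new sub-type is added by extending `PROFILE_EXTRAS`, without touching the article profile.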

  3. Wire preprint-to-version-of-record linking

    When a preprint is later published as a journal article, the two records remain distinct. The article carries the preprint as a relatedIdentifier with relationType="IsVersionOf" or "IsNewVersionOf", and the preprint record is updated with a "now published" notice and the article DOI.
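
    The linking step can be sketched as a small function that updates both records without merging them. The relatedIdentifier keys follow the DataCite property names; the record shape, the `published_version_doi` and `notice` fields, and the example DOIs are hypothetical.

```python
def link_preprint_to_article(preprint: dict, article: dict) -> None:
    """Link two distinct records in place; neither record replaces the other."""
    link = {
        "relatedIdentifier": preprint["doi"],
        "relatedIdentifierType": "DOI",
        "relationType": "IsVersionOf",
    }
    related = article.setdefault("relatedIdentifiers", [])
    if link not in related:  # keep the operation idempotent on re-runs
        related.append(link)

    # Hypothetical local fields for the "now published" notice.
    preprint["published_version_doi"] = article["doi"]
    preprint["notice"] = f"Now published: https://doi.org/{article['doi']}"


# Usage with placeholder DOIs
preprint = {"doi": "10.1234/preprint.1", "output_type": "preprint"}
article = {"doi": "10.1234/journal.42", "output_type": "journal article"}
link_preprint_to_article(preprint, article)
```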

  4. Add structured citation-network capture

    For each output, capture Cites (outputs this work cites) and IsCitedBy (outputs that cite this work) as relatedIdentifier links where the data is available from Crossref Event Data or OpenAlex. Surface the citation network on the public record.
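
    As a rough illustration, citation pairs obtained from an external source can be turned into reciprocal relatedIdentifier entries, using the DataCite relationType spellings `Cites` and `IsCitedBy`. How the pairs are fetched is out of scope here; the input shape (a list of `(citing_doi, cited_doi)` tuples) is an assumption.

```python
from collections import defaultdict


def _rel(doi: str, relation_type: str) -> dict:
    """Build one relatedIdentifier entry using DataCite property names."""
    return {
        "relatedIdentifier": doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation_type,
    }


def citation_links(edges) -> dict:
    """Map each DOI to its Cites / IsCitedBy entries.

    edges: iterable of (citing_doi, cited_doi) pairs (assumed input shape).
    Duplicates and self-citations are dropped.
    """
    links = defaultdict(list)
    for citing, cited in sorted(set(edges)):
        if citing == cited:
            continue
        links[citing].append(_rel(cited, "Cites"))
        links[cited].append(_rel(citing, "IsCitedBy"))
    return dict(links)
```

    Writing both directions from a single edge list keeps the two sides of the citation network consistent on the public record.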

  5. Pilot the expanded taxonomy on a high-volume department

    For one computer-science or biomedical department, configure all the expanded sub-types and run a calendar-year ingest. Measure two things: the percentage of outputs correctly typed at first deposit, and depositor friction relative to the legacy "everything is an article" workflow.
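
    The two pilot measures could be computed along these lines. This is a sketch under assumptions: each deposit is taken to record the type chosen at deposit and the type after curator review, and "friction" is approximated as the ratio of median deposit times between the expanded and legacy workflows. Neither definition comes from the checklist itself.

```python
from statistics import median


def first_time_typed_rate(deposits) -> float:
    """Percentage of deposits whose type needed no correction after review."""
    if not deposits:
        return 0.0
    correct = sum(
        1 for d in deposits if d["type_at_deposit"] == d["type_after_review"]
    )
    return 100.0 * correct / len(deposits)


def friction_ratio(expanded_minutes, legacy_minutes) -> float:
    """Median deposit time in the expanded workflow relative to the legacy one.

    A value above 1.0 means the expanded workflow takes longer per deposit.
    """
    return median(expanded_minutes) / median(legacy_minutes)
```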

Worked example

Sample workflow

A realistic walk-through of a single record passing through the Research outputs (expanded) pipeline once the checklist is in production.

A computational-biology group deposits a year's output. The expanded taxonomy distinguishes six outputs:

  • A preprint at bioRxiv (Output sub-type=preprint, COAR resource type=preprint)
  • The same work later published in a journal (Output sub-type=journal article, COAR resource type=journal article, with relatedIdentifier IsVersionOf the preprint)
  • A Zenodo-hosted dataset (Output sub-type=dataset, COAR resource type=dataset, DataCite DOI)
  • The analysis pipeline as software (Output sub-type=software, COAR resource type=software, Software Heritage ID, Zenodo release DOI, MIT license)
  • A protocols.io protocol (Output sub-type=protocol, protocols.io DOI)
  • A 12-minute methods explainer video (Output sub-type=video, COAR resource type=video)

Each sub-type uses its appropriate metadata profile. On the public researcher-profile platform, the outputs surface in a single timeline. On the institution's OpenAIRE-harvested feed, each appears with the correct COAR resource type, so downstream filtering by output type works without further mapping.

Integration points

CRIS and repository systems

Vendor-specific notes on where this vocabulary fits in real research-information systems. Names appear here only where there is public field evidence — they are not vendor partnerships.

Pure (Elsevier)

Native multi-sub-type output handling; COAR mapping configurable. Pair with the institutional repository for file storage.

Symplectic Elements

Strong publication sub-type handling; the Elements Reporting Database can roll up across sub-types for institutional reporting.

DSpace 8.x

The Configurable Entities feature supports multiple output sub-types with custom metadata schemas; the community-and-collection structure can reflect the sub-type taxonomy directly.

Zenodo / InvenioRDM

Strong fit for datasets, software, multimedia, and preprints; supports DataCite DOI minting and the COAR resource-type vocabulary.

OpenAIRE Research Graph

Federation target — once outputs are typed via COAR and linked via relatedIdentifier, the OpenAIRE graph reconstructs the full research-output network.

What goes wrong in the field

Common pitfalls

The patterns that show up repeatedly when this checklist is skipped or misapplied. Address these before they become entrenched.

  • Forcing datasets and software into the article-shaped metadata schema and losing access-classification, license, and version-specific fields
  • Treating preprints as drafts to be replaced by the article record, rather than as first-class outputs in their own right
  • Skipping the COAR resource-type mapping and producing local-only output-type labels that do not federate
  • Failing to capture relatedIdentifier links between preprints, articles, datasets, and software, so the citation network cannot be reconstructed
  • Letting the software output type collapse into "code attached as supplementary" instead of a real Output with its own DOI, license, and SWHID

Frequently asked

Implementation FAQ

Who maintains this checklist?
The Research outputs (expanded) working group maintains the checklist alongside the dictionary terms in the same domain. It is reviewed each release cycle (March and September) and updated when a working-group consultation, a vendor product change, or a federation-partner schema update materially changes the operational guidance.
What if my CRIS or repository is not listed?
The integration points name the systems CASRAI has direct field experience with: Pure, Symplectic Elements, Worktribe, Converis, DSpace and DSpace-CRIS, EPrints, VIVO, Dataverse, and InvenioRDM. The CERIF mapping in the checklist is vendor-neutral and applies equally to other CRIS or repository products. If your system supports the underlying entities (Person, Project, Output, Funding, plus the domain-specific extensions), the steps transfer.
How do I validate my implementation?
Three validation surfaces. First, the deposit form should refuse a record missing required fields rather than warn and accept. Second, the resulting metadata should round-trip through the federation layer your institution uses (OpenAIRE Guidelines 4.0 for European federation, DataCite Commons for DOI-anchored discovery, Crossref for article-anchored discovery) without upstream errors. Third, walk a real-world record through the sample-workflow path on this page and confirm the structured fields capture what the prose describes.
Where do I report errors in the checklist?
Open a comment via the dictionary-feedback flow at /dictionary/contribute. Editorial corrections — wrong vendor module names, deprecated standards, broken integration paths — are queued into the next release cycle. Substantive disagreements on the operational guidance are routed to the working group for review and may motivate a checklist revision.
Is this checklist enough to certify my implementation?
No. The checklist gives you the operational baseline; certification against federation profiles (CoreTrustSeal, OpenAIRE-compliant, COAR-aligned) is a separate process with its own audit. Treat the checklist as the engineering scaffolding and the certification as the institutional sign-off that the scaffolding is being used.

Adopted by research universities worldwide

  • University of Cambridge
  • Columbia University
  • University of Edinburgh
  • Harvard University
  • Massachusetts Institute of Technology
  • University of Oxford
  • Princeton University
  • Stanford School of Medicine
  • University College London
