v2026.1714 entries · CC-BY 4.0

Datasets and software

DataCite contributorType and the CRediT cross-walk

DataCite's vocabulary for contributor roles on datasets and software overlaps with CRediT but is distinct. Practical guidance on which to use where, with a published cross-walk.

Two vocabularies, one ecosystem

DataCite is the DOI registry for research data and software, and its metadata schema has long carried a contributorType field that names the role a person or organisation played in creating a dataset. The vocabulary predates CRediT and reflects the data-stewardship lineage: values include DataCurator, Researcher, ProjectLeader, Sponsor, and Supervisor. The full list lives in the DataCite metadata schema reference.

CRediT, defined as ANSI/NISO Z39.104-2022, names 14 roles oriented to scholarly articles. The two vocabularies overlap in spirit — both describe granular contributor roles — but they are not directly interchangeable. A 2024 joint guide from Crossref and DataCite documents the relationship and the recommended mapping; see the DataCite blog post on the joint metadata guide.

Practical rule of thumb

For the rebuilt CASRAI site the recommendation is straightforward: use CRediT for journal articles and book chapters; use DataCite contributorType for datasets and software; cross-reference them where the same person plays both kinds of role on different outputs. Avoid translating CRediT roles into contributorType values silently — the loss of precision shows up downstream.

Where contributorType lives in a DataCite deposit

Inside the contributors element of a DataCite XML deposit, each contributor declares a contributorType attribute drawn from the fixed vocabulary. The contributor also carries name, name identifier (typically an ORCID iD), and affiliation in the usual way.

DataCite deposit, contributors block
xml
<contributors>
  <contributor contributorType="DataCurator">
    <contributorName nameType="Personal">Zhang, San</contributorName>
    <givenName>San</givenName>
    <familyName>Zhang</familyName>
    <nameIdentifier nameIdentifierScheme="ORCID"
                    schemeURI="https://orcid.org/">0000-0001-2345-6789</nameIdentifier>
    <affiliation affiliationIdentifier="https://ror.org/04abcd123"
                 affiliationIdentifierScheme="ROR">University of Example</affiliation>
  </contributor>

  <contributor contributorType="ProjectLeader">
    <contributorName nameType="Personal">Liu, Mei</contributorName>
    <givenName>Mei</givenName>
    <familyName>Liu</familyName>
    <nameIdentifier nameIdentifierScheme="ORCID"
                    schemeURI="https://orcid.org/">0000-0002-3456-7890</nameIdentifier>
  </contributor>
</contributors>
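
A deposit pipeline can check contributorType values before submission. The following is a minimal Python sketch, not DataCite tooling: it parses a namespace-free fragment like the one above with the standard library and reports any contributor whose type falls outside the controlled list. The CONTRIBUTOR_TYPES set is transcribed from DataCite schema 4.5 as an assumption and should be checked against the schema reference, since later versions may add values. The third contributor, "Doe, Jane", is a hypothetical record added to show a failure. Real deposits declare the DataCite kernel namespace, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Assumption: contributorType values as listed in DataCite schema 4.5.
# Verify against the current schema reference before relying on this set.
CONTRIBUTOR_TYPES = frozenset({
    "ContactPerson", "DataCollector", "DataCurator", "DataManager",
    "Distributor", "Editor", "HostingInstitution", "Producer",
    "ProjectLeader", "ProjectManager", "ProjectMember",
    "RegistrationAgency", "RegistrationAuthority", "RelatedPerson",
    "Researcher", "ResearchGroup", "RightsHolder", "Sponsor",
    "Supervisor", "WorkPackageLeader", "Other",
})

# Namespace-free fragment; "Doe, Jane" is a hypothetical contributor whose
# type is a CRediT role name, which is not a valid contributorType.
FRAGMENT = """
<contributors>
  <contributor contributorType="DataCurator">
    <contributorName nameType="Personal">Zhang, San</contributorName>
  </contributor>
  <contributor contributorType="ProjectLeader">
    <contributorName nameType="Personal">Liu, Mei</contributorName>
  </contributor>
  <contributor contributorType="Data curation">
    <contributorName nameType="Personal">Doe, Jane</contributorName>
  </contributor>
</contributors>
"""


def invalid_contributors(xml_text):
    """Return (name, contributorType) pairs whose type is not in the vocabulary."""
    root = ET.fromstring(xml_text.strip())
    problems = []
    for contributor in root.iter("contributor"):
        ctype = contributor.get("contributorType")
        name = contributor.findtext("contributorName", default="(unnamed)")
        if ctype not in CONTRIBUTOR_TYPES:
            problems.append((name, ctype))
    return problems


print(invalid_contributors(FRAGMENT))
```

The check only reports problems; deciding which contributorType to use instead remains an editorial decision.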

Indicative CRediT to contributorType cross-walk

The mapping below summarises the joint guide. It is indicative, not authoritative — deposit-time decisions should consult the upstream guide directly. Where no clean mapping exists, the recommendation is to choose the closest DataCite value and keep the CRediT URI on the linked article, not on the dataset record.

Indicative cross-walk
text
CRediT role                       DataCite contributorType
--------------------------------- ----------------------------
Data curation                     DataCurator
Investigation                     Researcher
Project administration            ProjectManager
Supervision                       Supervisor
Funding acquisition               Sponsor
Methodology                       Researcher (closest)
Software                          (no direct match; see DataCite resourceTypeGeneral="Software")
Resources                         Other (with role narrative)
Formal analysis                   Researcher (closest)
Writing - original draft          (no direct match; narrative-only role)
Writing - review & editing        Editor (closest, depending on context)
Visualization                     Producer (closest)
Validation                        Researcher (closest)
Conceptualization                 Researcher (closest)

The rows without a direct match matter: not every CRediT role has a sensible DataCite analogue. For software in particular, DataCite recommends setting resourceTypeGeneral="Software" on the resourceType element, alongside the relevant contributorType values, rather than forcing CRediT's vocabulary onto the deposit.
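
For systems that need the cross-walk in machine-readable form, the table can be held as a lookup that keeps the qualifier next to each value, so a "closest" mapping is never mistaken for an exact one. This is a minimal Python sketch under assumptions: the Mapping type and the to_contributor_type helper are hypothetical names, and the entries transcribe the indicative table above rather than the upstream guide.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    contributor_type: Optional[str]  # None when the table gives no DataCite value
    note: str                        # qualifier from the table; "" when unqualified


# Transcribed from the indicative cross-walk; not authoritative.
CROSSWALK = {
    "Data curation": Mapping("DataCurator", ""),
    "Investigation": Mapping("Researcher", ""),
    "Project administration": Mapping("ProjectManager", ""),
    "Supervision": Mapping("Supervisor", ""),
    "Funding acquisition": Mapping("Sponsor", ""),
    "Methodology": Mapping("Researcher", "closest"),
    "Software": Mapping(None, "no direct match; describe via the software resource type"),
    "Resources": Mapping("Other", "with role narrative"),
    "Formal analysis": Mapping("Researcher", "closest"),
    "Writing - original draft": Mapping(None, "no direct match; narrative-only role"),
    "Writing - review & editing": Mapping("Editor", "closest, depending on context"),
    "Visualization": Mapping("Producer", "closest"),
    "Validation": Mapping("Researcher", "closest"),
    "Conceptualization": Mapping("Researcher", "closest"),
}


def to_contributor_type(credit_role: str) -> Mapping:
    """Look up a CRediT role; raise rather than guess for unknown labels."""
    try:
        return CROSSWALK[credit_role]
    except KeyError:
        raise ValueError(f"not a CRediT role in this table: {credit_role!r}") from None


print(to_contributor_type("Visualization"))
print(to_contributor_type("Software"))
```

Returning the note together with the value gives callers what they need to surface the loss of precision instead of translating silently.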

Where this matters operationally

  • Repositories minting DOIs for datasets and code. Use contributorType at deposit; do not invent CRediT-flavoured custom fields.
  • Publishers issuing a paper plus an underlying dataset. Two deposits, two vocabularies, linked via relatedIdentifier in the DataCite record and the Crossref deposit's relation block.
  • Institutional CRIS systems. Read both vocabularies; reconcile to a single internal model that preserves the distinction rather than collapsing it.
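
The linking step for a paper plus dataset can be sketched as follows. This minimal Python example builds the relatedIdentifiers block for the dataset record using only the standard library. The function name, the placeholder DOI, and the choice of IsSupplementTo as the relationType are illustrative assumptions, and the Crossref side of the link is not shown.

```python
import xml.etree.ElementTree as ET


def related_identifiers_block(article_doi: str,
                              relation_type: str = "IsSupplementTo") -> str:
    """Serialise a relatedIdentifiers element pointing from a dataset to an article.

    Assumption: IsSupplementTo is a reasonable default for a dataset that
    underlies a paper; pick the relationType that fits the actual relationship.
    """
    block = ET.Element("relatedIdentifiers")
    item = ET.SubElement(
        block,
        "relatedIdentifier",
        {"relatedIdentifierType": "DOI", "relationType": relation_type},
    )
    item.text = article_doi
    return ET.tostring(block, encoding="unicode")


# Placeholder DOI for illustration only.
print(related_identifiers_block("10.1234/example-article"))
```

The contributors block on the dataset record stays in DataCite's vocabulary; the CRediT roles remain on the article that the relatedIdentifier points to.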
