v2026.1714 entries · CC-BY 4.0
CASRAI

Definition · Plain-language

CRediT role: Data curation

In the CASRAI-originated CRediT taxonomy (ANSI/NISO Z39.104-2022), Data curation covers management activities to annotate, scrub and maintain research data, including software code where it is needed to interpret the data, for initial use and later re-use.

What Data curation means in CRediT

The ANSI/NISO Z39.104-2022 definition is: "Management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later re-use." This encompasses four main activities: producing metadata (describing what data exist and how they are structured); scrubbing data (cleaning, de-duplicating, correcting errors, harmonising formats); maintaining datasets over time (archiving, ensuring persistent access, responding to re-use queries); and curating the software code required to interpret the data or reproduce results.
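As an illustration of the "scrubbing" activity, the sketch below trims stray whitespace and removes duplicate records from a small table. It is a minimal example under stated assumptions: the function name `scrub`, the field names and the sample records are all hypothetical, and real curation work would follow whatever rules the project's data management plan sets.

```python
def scrub(records, key_fields):
    """Return de-duplicated records with trimmed string values.

    records: list of dicts (hypothetical tabular data)
    key_fields: fields whose normalised values identify a duplicate
    """
    seen = set()
    cleaned = []
    for record in records:
        # Trim stray whitespace from every string value.
        tidy = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
        # Compare keys case-insensitively so "S01 " and "s01" count as one sample.
        key = tuple(str(tidy[f]).lower() for f in key_fields)
        if key in seen:
            continue  # duplicate: keep only the first occurrence
        seen.add(key)
        cleaned.append(tidy)
    return cleaned


# Hypothetical raw records with inconsistent spacing and casing.
raw = [
    {"sample_id": "S01 ", "site": "Leeds", "value": 4.2},
    {"sample_id": "s01", "site": "leeds", "value": 4.2},
    {"sample_id": "S02", "site": "York", "value": 3.9},
]
cleaned = scrub(raw, key_fields=["sample_id", "site"])
print(len(cleaned))
```

Keeping the first occurrence of each duplicate is a design choice made for this sketch; a curator might instead flag duplicates for manual review.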

Data curation and FAIR data principles

The FAIR data principles (Findable, Accessible, Interoperable, and Reusable) describe what well-curated data looks like. Data curation is the practical activity that makes data FAIR. Annotation with rich metadata (Findable); deposition in repositories that make the data retrievable by persistent identifier (Accessible); use of community-standard data formats and vocabularies (Interoperable); and clear licensing and provenance documentation (Reusable) are all curation activities. FAIR does not require data to be open, only that access conditions are clear. Researchers who invest substantial time in making datasets FAIR should receive Data curation credit, a form of recognition historically missing from authorship conventions.
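The mapping from curation activities to FAIR principles can be sketched as a simple metadata completeness check. This is only an illustration: the field names in `FAIR_FIELDS`, the function `missing_fair_fields` and the sample record are assumptions made for the example, not part of the FAIR principles or the CRediT standard.

```python
# Hypothetical mapping from each FAIR principle to metadata fields
# a curator might check, following the activities described above.
FAIR_FIELDS = {
    "Findable": ["title", "description", "keywords"],
    "Accessible": ["repository_url", "persistent_identifier"],
    "Interoperable": ["file_format", "vocabulary"],
    "Reusable": ["license", "provenance"],
}


def missing_fair_fields(metadata):
    """Return {principle: [field names]} for fields that are absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [f for f in fields if not metadata.get(f)]
        if missing:
            gaps[principle] = missing
    return gaps


# Hypothetical dataset record; the identifier and URL are placeholders.
record = {
    "title": "Multi-site trial outcomes",
    "description": "Harmonised outcome measures from three sites.",
    "keywords": ["clinical trial", "outcomes"],
    "repository_url": "https://repository.example.org/datasets/123",
    "persistent_identifier": "doi:10.0000/example",
    "file_format": "text/csv",
    "license": "CC-BY-4.0",
}
gaps = missing_fair_fields(record)
print(gaps)
```

A check like this only tests whether fields are filled in; judging whether the metadata are accurate and sufficient remains the curator's work.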

The recognition gap

Before CRediT, data managers, research data librarians, bioinformaticians who curated genomic datasets, and statisticians who harmonised multi-site trial data all performed essential work that disappeared into acknowledgements or was not credited at all. This recognition gap affected career progression: without publication credits, data curators could not demonstrate research contribution in grant applications or promotion cases. CRediT's Data curation role directly addresses this gap, making it possible for data professionals to accumulate a verifiable record of contribution to published research.

Key facts

At a glance

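  • Role definition: "Management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later re-use"
  • Standard: ANSI/NISO Z39.104-2022, role 2 of 14 in alphabetical order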
  • URI: casrai.org/credit/roles/data-curation
  • Includes: metadata production, data scrubbing, dataset maintenance, software curation
  • FAIR: data curation activities make datasets Findable, Accessible, Interoperable, Reusable
  • Recognition gap: previously invisible in author lists; CRediT makes it visible
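
One way to keep role assignments machine-readable is to store each contributor's roles alongside the role URI and generate the contribution statement from that record. The sketch below assumes a hypothetical layout: `ROLE_URIS`, `role_records`, `credit_statement` and the contributor names are illustrative, and only the Data curation URI is taken from the list above.

```python
# Role URI as listed in this document; other roles are left unmapped here.
ROLE_URIS = {"Data curation": "casrai.org/credit/roles/data-curation"}


def role_records(name, roles):
    """Return one dict per role, attaching the URI when it is known."""
    return [
        {"contributor": name, "role": role, "uri": ROLE_URIS.get(role)}
        for role in roles
    ]


def credit_statement(contributions):
    """Format {name: [roles]} as 'Name: Role, Role; Name: Role.'"""
    parts = [
        f"{name}: {', '.join(sorted(roles))}"
        for name, roles in contributions.items()
        if roles
    ]
    return ("; ".join(parts) + ".") if parts else ""


# Hypothetical contributors; one holds both Software and Data curation.
contributions = {
    "A. Researcher": ["Software", "Data curation"],
    "B. Librarian": ["Data curation"],
}
statement = credit_statement(contributions)
records = role_records("B. Librarian", contributions["B. Librarian"])
print(statement)
```

Journals differ in how they format contribution statements, so the output format here should be treated as a placeholder.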

Common misconceptions

What people often get wrong

Often heard: Data curation only applies to large datasets or bioinformatics.

Actually: Data curation applies to any research where annotating, cleaning or maintaining data is a distinct, substantive activity — from clinical trial data to qualitative interview transcripts to survey datasets.

Often heard: Data curation is the same as Data collection (Investigation).

Actually: Investigation covers collecting or gathering data (running experiments, recording observations). Data curation covers managing, annotating and maintaining data after collection — often a separate and substantial activity.

Often heard: Software developers always receive the Software role, not Data curation.

Actually: Where software code is required to interpret the data itself, maintaining that code is a data curation activity under ANSI/NISO Z39.104-2022. Researchers who maintain analysis pipelines for reproducibility may hold both Software and Data curation roles.

Common questions

FAQ

Should a research data librarian who curated a dataset receive Data curation credit?

If the librarian's contribution was substantive (producing metadata, scrubbing data, depositing to a repository) and meets the criteria for authorship at the target journal, then yes. If they do not meet authorship criteria, name them in the acknowledgements section with a description of their contribution.

How does Data curation relate to CASRAI's work on research data management?

CASRAI's broader work on research information standards (which sits alongside models such as CERIF, maintained by euroCRIS, and institutional data management frameworks) is complementary to the CRediT Data curation role. The role operationalises recognition of data management work at the article level; CASRAI's institutional standards address data management infrastructure at the organisational level.
