Definition · Plain-language

Research Data Management (RDM)

Research data management (RDM) is the active organisation, storage, documentation, sharing and long-term preservation of data generated or collected during research — covering the full data lifecycle from planning through re-use.

The research data lifecycle

RDM frameworks use a lifecycle model to structure responsibilities across the life of a research project. Common phases include: Plan (write a data management plan, identify formats and repositories); Collect (gather, generate or acquire data, apply quality checks); Process and analyse (clean, transform, analyse, version-control); Preserve (deposit in an appropriate repository with documentation and metadata); Share (make data available with a licence and persistent identifier); and Re-use (discover and cite others' data). The lifecycle is iterative rather than linear: archival practices feed back into planning for future projects, and published data enters re-use cycles. Institutions and funders map RDM responsibilities onto these phases to assign support and compliance obligations.

Metadata, documentation and attribution

RDM depends on rich documentation. Metadata (structured information describing the data) enables discovery, interoperability and re-use. General-purpose standards such as the DataCite Metadata Schema 4.x for dataset citation and Dublin Core for cross-domain discovery sit alongside disciplinary standards such as Darwin Core for biodiversity data and DICOM for medical imaging; all provide consistent, machine-readable descriptions. Alongside metadata, data documentation such as README files, codebooks and data dictionaries provides the human-readable context a second researcher needs to understand and re-use data without contacting the original team. ORCID identifiers connect researchers to the datasets they created, enabling attribution and recognition for data sharing, particularly as funders and publishers move towards crediting data publication alongside journal articles.
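
As a rough illustration of what "machine-readable" means in practice, the sketch below checks a dataset record for the six properties that DataCite 4.x treats as mandatory (identifier, creator, title, publisher, publication year, resource type) and validates the check character of any ORCID iD attached to a creator. The dictionary keys, the function names and the example record are assumptions made for this sketch, not DataCite's official serialisation; the DOI is a placeholder, and the ORCID iD is the sample iD used in ORCID's own documentation.

```python
import re

# Assumed field names, loosely mirroring DataCite's six mandatory properties.
# This is NOT the schema's official JSON or XML serialisation.
REQUIRED_FIELDS = (
    "identifier", "creators", "title",
    "publisher", "publication_year", "resource_type",
)

ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the ORCID iD layout and its ISO 7064 MOD 11-2 check character."""
    if not ORCID_PATTERN.match(orcid):
        return False
    chars = orcid.replace("-", "")
    total = 0
    for digit in chars[:-1]:          # first 15 digits feed the checksum
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[-1] == expected


def missing_fields(record: dict) -> list[str]:
    """Return required fields that are absent or empty, in a fixed order."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def invalid_orcids(record: dict) -> list[str]:
    """Return creator ORCID iDs that fail the layout or checksum test."""
    return [
        creator["orcid"]
        for creator in record.get("creators", [])
        if "orcid" in creator and not orcid_checksum_ok(creator["orcid"])
    ]


# Hypothetical record for illustration only.
record = {
    "identifier": "10.1234/example-dataset",   # placeholder DOI
    "creators": [{"name": "Carberry, Josiah", "orcid": "0000-0002-1825-0097"}],
    "title": "Interview transcripts, wave 1",
    "publisher": "Example University Repository",
    "publication_year": 2024,
    "resource_type": "Dataset",
}

print(missing_fields(record))   # fields still to be filled in
print(invalid_orcids(record))   # creator iDs that need correcting
```

A check like this could run before deposit, so that gaps are caught while the original team is still available to fill them.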

Institutional support and funder requirements

Universities and research institutions provide RDM support through library and research office services: training, consultancy, storage infrastructure, repository access and DMP review. National centres such as the UK's Digital Curation Centre (DCC) and DANS in the Netherlands, along with comparable organisations elsewhere, develop standards, tools and training resources. Funder requirements are a primary driver of RDM adoption: NIH's 2023 Data Management and Sharing (DMS) Policy, Wellcome Trust's data policies, UKRI requirements and Horizon Europe mandates collectively require or expect researchers to plan, document and share their data. Version control systems (Git, DVC) support data provenance during analysis; data repositories (Zenodo, Figshare, Dryad, Dataverse, institutional repositories) provide long-term archival, though formal certification varies by repository.
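
One lightweight practice that complements version control and repository deposit is a checksum manifest: a list of file hashes recorded at a known point, which can later show whether any file has changed. The following is a minimal sketch using only the Python standard library; the function names, file names and directory layout are hypothetical, and it is not a description of how Git, DVC or any particular repository works internally.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its SHA-256 digest."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_of(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that were added, removed or modified since the manifest."""
    current = build_manifest(data_dir)
    names = set(manifest) | set(current)
    return sorted(n for n in names if manifest.get(n) != current.get(n))


def demo() -> tuple[list[str], list[str]]:
    """Build a manifest for two hypothetical files, then edit one of them."""
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        (data_dir / "survey.csv").write_text("id,score\n1,4\n", encoding="utf-8")
        (data_dir / "README.txt").write_text("Survey wave 1\n", encoding="utf-8")
        manifest = build_manifest(data_dir)
        before = changed_files(data_dir, manifest)
        (data_dir / "survey.csv").write_text("id,score\n1,5\n", encoding="utf-8")
        after = changed_files(data_dir, manifest)
    return before, after


if __name__ == "__main__":
    print(demo())
```

Storing the manifest alongside the README gives a later re-user a simple way to confirm that the files they downloaded match what was deposited.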

Key facts

At a glance

Definition: the organisation, documentation, storage, sharing and preservation of research data across its lifecycle
Lifecycle phases: plan, collect, process and analyse, preserve, share, re-use
Institutional support: provided by libraries, research offices and dedicated RDM services
Funder mandate: NIH (2023 DMS Policy), UKRI, Wellcome, Horizon Europe all require or expect data management planning
Metadata standards: DataCite, Dublin Core (general-purpose); Darwin Core, DICOM (domain-specific)
Attribution: ORCID connects researchers to published datasets; data citation enables credit
Repositories: Zenodo, Figshare, Dryad, Dataverse, institutional repositories

Common misconceptions

What people often get wrong

Often heard: Research data management just means backing up files to a hard drive.

Actually: RDM covers the full lifecycle, including planning data collection, applying metadata, organising and documenting data, version control, long-term archival in trusted repositories, licensing for re-use and proper citation. Storage is one component, not the whole.

Often heard: RDM is only relevant for quantitative or "big data" research.

Actually: RDM applies to all research types, including qualitative, humanities and mixed-methods studies. Interview transcripts, field notes, audio recordings and survey instruments all require management, documentation, secure storage and appropriate sharing or confidentiality arrangements.

Often heard: Sharing data means giving away intellectual property and losing competitive advantage.

Actually: Data sharing under a clear licence preserves the researcher's or institution's rights. Embargo periods allow publications to appear before data is released. Funders generally require sharing "as open as possible, as closed as necessary", explicitly allowing restrictions for sensitivity, commercial value or ongoing analysis.

Common questions

FAQ

What is the difference between research data management and a data management plan?

A data management plan (DMP) is a document produced at the start of a project describing how RDM will be carried out. Research data management is the actual practice throughout the project. The DMP is the plan; RDM is the ongoing activity of following and updating that plan.

What counts as research data?

Research data includes any recorded information collected or generated during a research project and needed to validate its findings: numerical measurements, survey responses, interview transcripts, images, simulation outputs, code, laboratory notebooks, field notes and more. The precise definition varies by funder and discipline.

Going deeper