Category: Guides & Explainers

Practical how-to guides, templates, checklists, and career pathways for research administrators, authors, and institutional teams.

  • CRediT Contributor Roles Taxonomy Example: A 5-Author, Multi-Site Study Walkthrough

    A CRediT (Contributor Roles Taxonomy) example works best as a full worked matrix: all 14 CRediT roles mapped against every named contributor, so that overlapping statistical, clinical, and writing work on a multi-author study becomes explicit rather than assumed from author order. This article builds that matrix, role by role, for a hypothetical five-author, three-site trial.

    CRediT (the Contributor Roles Taxonomy) is a fourteen-role controlled vocabulary for describing the specific type of contribution each named contributor made to a research output, independent of author order or seniority. CASRAI originated CRediT in 2014; the taxonomy is now formally stewarded by NISO as ANSI/NISO Z39.104-2022, approved in 2022 and licensed CC-BY 4.0 for free reuse by any publisher, funder, or institution.

    What is the CRediT contributor roles taxonomy?

    The CRediT contributor roles taxonomy lists fourteen discrete role types: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, and Writing – review & editing. Any contributor can hold multiple roles, and any role can be shared by multiple contributors.

    Under the NISO standard, each shared role can optionally carry a degree-of-contribution qualifier:

    • Lead — this person did most of the work for that role
    • Equal — contribution was shared roughly evenly with named co-contributors
    • Supporting — a secondary, assisting contribution to that role

    These qualifiers are what make a worked example useful: a bare list of role names tells a reader little, but a role assigned “Lead” versus “Supporting” against a specific name tells them exactly how the work divided.
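
    The vocabulary described above is small enough to write down as a data structure. The following is a minimal sketch, assuming Python enums are an acceptable representation; the names `Role`, `Degree`, and `describe` are illustrative and are not part of the NISO standard.

```python
from enum import Enum


class Role(Enum):
    """The fourteen CRediT roles, in the order listed above."""
    CONCEPTUALIZATION = "Conceptualization"
    DATA_CURATION = "Data curation"
    FORMAL_ANALYSIS = "Formal analysis"
    FUNDING_ACQUISITION = "Funding acquisition"
    INVESTIGATION = "Investigation"
    METHODOLOGY = "Methodology"
    PROJECT_ADMINISTRATION = "Project administration"
    RESOURCES = "Resources"
    SOFTWARE = "Software"
    SUPERVISION = "Supervision"
    VALIDATION = "Validation"
    VISUALIZATION = "Visualization"
    WRITING_ORIGINAL_DRAFT = "Writing – original draft"
    WRITING_REVIEW_EDITING = "Writing – review & editing"


class Degree(Enum):
    """Optional degree-of-contribution qualifiers."""
    LEAD = "lead"
    EQUAL = "equal"
    SUPPORTING = "supporting"


def describe(name, role, degree=None):
    """Render one assignment; the degree qualifier is optional."""
    label = role.value if degree is None else f"{role.value} ({degree.value})"
    return f"{name}: {label}"


print(describe("Priya Nair", Role.FORMAL_ANALYSIS, Degree.LEAD))
print(describe("Amara Osei", Role.FUNDING_ACQUISITION))
```

    Keeping the role labels in one enumeration means a misspelled or invented role fails immediately instead of slipping into a published statement.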

    The hypothetical study: a five-author, three-site trial

    To make the taxonomy concrete, consider a hypothetical trial: “Effects of a community-based exercise programme on cardiometabolic risk markers,” run across three sites — a lead university, a partner university running local recruitment, and an NHS trust providing the clinical setting. Five people are named as contributors:

    • Dr Amara Osei — Chief Investigator, lead university
    • Dr Rhys Bevan — Co-investigator and site lead, partner university
    • Dr Priya Nair — Biostatistician, lead university
    • Fatima Choudhury — Research nurse and clinical trial coordinator, NHS trust site
    • Dr Tomasz Wolski — Postdoctoral researcher, lead university

    This spread is deliberately realistic: it mirrors the multi-site, mixed-role structure of a typical funded clinical or field trial, where no single person can plausibly claim every contribution, and where author contributions examples published in journals routinely span exactly this kind of team.

    Role-by-role: assigning all 14 CRediT roles

    Working through each role in turn, rather than starting from “who is first author,” keeps the exercise honest. Below is the completed matrix for this hypothetical team.

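    | CRediT role | Osei (CI) | Bevan (Co-I) | Nair (Statistician) | Choudhury (Nurse/Coordinator) | Wolski (Postdoc) |
    | --- | --- | --- | --- | --- | --- |
    | Conceptualization | Lead | | | | Supporting |
    | Data curation | | | Equal | Equal | |
    | Formal analysis | | | Lead | | Supporting |
    | Funding acquisition | Lead | | | | |
    | Investigation | | Equal | | Lead | |
    | Methodology | Supporting | Lead | | | |
    | Project administration | | Lead | | Supporting | |
    | Resources | | | | Lead | |
    | Software | | | Lead | | |
    | Supervision | Lead | | | | |
    | Validation | | | Lead | | |
    | Visualization | | | Lead | | Supporting |
    | Writing – original draft | | | | | Lead |
    | Writing – review & editing | Equal | Equal | | | Equal |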

    Reading the matrix

    Three things stand out that a bare author list would hide. First, Dr Nair, the biostatistician, holds five roles (Formal analysis, Software, Validation, Visualization, and a shared Data curation) despite not being first or corresponding author. Second, Fatima Choudhury, a research nurse rather than a doctoral-level academic, leads Investigation and Resources, reflecting that she ran the clinical site day-to-day. Third, no single person leads more than four roles; the workload is genuinely distributed across the three sites, which is precisely the pattern that CRediT role assignment is designed to surface.
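
    The counts in this reading can be checked mechanically. The sketch below transcribes the matrix as a plain dictionary (role, then contributor surname, then degree) and tallies roles held and roles led per person. The variable names are illustrative, and the data layout is an assumption, not a prescribed CRediT format.

```python
from collections import Counter

# Role -> {contributor: degree}, transcribed from the worked matrix.
MATRIX = {
    "Conceptualization": {"Osei": "lead", "Wolski": "supporting"},
    "Data curation": {"Nair": "equal", "Choudhury": "equal"},
    "Formal analysis": {"Nair": "lead", "Wolski": "supporting"},
    "Funding acquisition": {"Osei": "lead"},
    "Investigation": {"Bevan": "equal", "Choudhury": "lead"},
    "Methodology": {"Osei": "supporting", "Bevan": "lead"},
    "Project administration": {"Bevan": "lead", "Choudhury": "supporting"},
    "Resources": {"Choudhury": "lead"},
    "Software": {"Nair": "lead"},
    "Supervision": {"Osei": "lead"},
    "Validation": {"Nair": "lead"},
    "Visualization": {"Nair": "lead", "Wolski": "supporting"},
    "Writing – original draft": {"Wolski": "lead"},
    "Writing – review & editing": {"Osei": "equal", "Bevan": "equal", "Wolski": "equal"},
}

roles_held = Counter()  # how many roles each person appears under
roles_led = Counter()   # how many of those carry the "lead" degree

for role, holders in MATRIX.items():
    for person, degree in holders.items():
        roles_held[person] += 1
        if degree == "lead":
            roles_led[person] += 1

for person in roles_held:
    print(f"{person}: holds {roles_held[person]}, leads {roles_led[person]}")
```

    Running the tally before submission is a quick way to spot a contributor whose workload has been under-recorded or a role that nobody has been assigned.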

    Writing the published CRediT statement

    Once the matrix is agreed, it converts directly into the “Author Contributions” text that many journals from publishers such as Elsevier, Wiley, and Taylor & Francis require at submission:

    “Amara Osei: Conceptualization (lead), Funding acquisition, Supervision, Methodology (supporting), Writing – review & editing (equal). Rhys Bevan: Methodology (lead), Investigation (equal), Project administration (lead), Writing – review & editing (equal). Priya Nair: Formal analysis (lead), Software, Validation, Visualization (lead), Data curation (equal). Fatima Choudhury: Investigation (lead), Resources, Data curation (equal), Project administration (supporting). Tomasz Wolski: Writing – original draft, Conceptualization (supporting), Formal analysis (supporting), Visualization (supporting), Writing – review & editing (equal).”

    This is a complete contribution statement built directly from the matrix above. Nothing in it needs to be reverse-engineered from a vague sentence like “all authors contributed equally,” which conveys no verifiable information at all.
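
    The conversion from matrix to statement can be sketched in a few lines. This is one possible rendering rule, not a NISO requirement: the hypothetical function `credit_statement` shows a degree qualifier only when a role is shared by more than one contributor. A three-role subset of the matrix keeps the example short.

```python
def credit_statement(matrix, order):
    """Render an author-by-author statement from a role -> {name: degree} matrix."""
    parts = []
    for name in order:
        labels = []
        for role, holders in matrix.items():
            if name not in holders:
                continue
            # Design choice (an assumption, not a NISO rule): show the degree
            # only when more than one contributor shares the role.
            shared = len(holders) > 1
            labels.append(f"{role} ({holders[name]})" if shared else role)
        parts.append(f"{name}: {', '.join(labels)}.")
    return " ".join(parts)


# A subset of the worked matrix, for brevity.
subset = {
    "Conceptualization": {"Amara Osei": "lead", "Tomasz Wolski": "supporting"},
    "Funding acquisition": {"Amara Osei": "lead"},
    "Writing – original draft": {"Tomasz Wolski": "lead"},
}

print(credit_statement(subset, ["Amara Osei", "Tomasz Wolski"]))
```

    Roles come out in matrix order rather than grouped by importance; a team that prefers a different order within each author's entry would need to sort the labels before joining them.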

    Common questions about CRediT contributor roles

    What is the CRediT Contributor Roles Taxonomy?

    CRediT is a standardised, fourteen-role vocabulary for describing what each named contributor actually did on a research output, rather than relying on author position alone. It was originated by CASRAI in 2014 and is now formalised as ANSI/NISO Z39.104-2022, used across most major scholarly publishers at submission.

    What are the 14 CRediT contributor roles?

    The fourteen roles are Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, and Writing – review & editing. Multiple contributors can share any single role, each optionally marked lead, equal, or supporting.

    How do you write a contributorship statement?

    List every named contributor, then assign each of the fourteen CRediT roles they actually performed, using degree-of-contribution qualifiers where a role is shared. Agree the matrix among all co-authors before submission — the ICMJE and COPE both flag late, undiscussed contributorship claims as a common source of authorship disputes.

    In what order should authors be listed?

    Author order is a separate decision from CRediT roles and typically reflects relative overall contribution, with the corresponding author (often, but not always, first or last) taking responsibility for the submission. CRediT does not replace author order — it supplements it with role-level transparency that order alone cannot convey.

    Implications for multi-site studies — and what comes next

    Multi-site teams like the hypothetical trial above create a specific governance risk: contributions made at a partner site or NHS trust are structurally easy to under-credit if roles are assigned only by the lead institution after the fact. Building the matrix role-by-role, rather than writing a summary sentence, forces every site’s actual work — clinical coordination, statistical modelling, field recruitment — into the open before submission.

    For research offices and institutional repositories, a completed CRediT matrix can also serve as machine-readable output metadata: DataCite and Crossref schemas can carry contributor roles alongside ORCID iDs, feeding directly into research information systems without re-keying. As more funders request contributor-level reporting alongside authorship criteria, teams that build the habit of completing a full role matrix, not just a name list, will find compliance largely already done. Institutions building their own role-assignment workflows can start from the individual role definitions to check edge cases the matrix above does not cover.
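
    As a rough illustration of what "machine-readable" can mean here, the sketch below serialises one contributor's roles as JSON. The key names are hypothetical and do not reproduce the DataCite or Crossref schemas. The role URL pattern under credit.niso.org is an assumption that should be checked against the published role pages. The ORCID iD is left empty because the trial team is fictional.

```python
import json
import re


def role_slug(role):
    """Lower-case a role label and join its words with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", role.lower()).strip("-")


def contributor_record(name, orcid, roles):
    """Build a generic record; the key names are illustrative only."""
    return {
        "name": name,
        "orcid": orcid,  # None here: the example contributor is hypothetical
        "roles": [
            {
                "vocabulary": "CRediT",
                "term": role,
                # Assumed URL pattern; verify against credit.niso.org.
                "uri": f"https://credit.niso.org/contributor-roles/{role_slug(role)}/",
            }
            for role in roles
        ],
    }


record = contributor_record(
    "Priya Nair",
    None,
    ["Formal analysis", "Software", "Writing – review & editing"],
)
print(json.dumps(record, indent=2, ensure_ascii=False))
```

    A real deposit would map these fields onto whatever schema the receiving repository or registry actually documents.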

  • Author Contribution Statement for Case Reports

    An author contribution statement for a case report should list only the roles that genuinely apply to its one or two authors (typically conceptualisation, investigation, and writing) rather than force-fitting all fourteen CRediT categories built for large research teams. For a sole author, a single sentence confirming full responsibility across the applicable roles satisfies both journal policy and ICMJE authorship criteria.

    An author contribution statement is a short, published declaration — separate from the acknowledgements — that specifies which named author performed which part of the research and writing. Below is a practical, minimal-author template for case reports, built around the taxonomy’s actual scope rather than a mechanical checklist.

    What is an author contribution statement, and why do case reports struggle with it?

    An author contribution statement is a brief, structured account — usually one to three sentences per author — of who conceived, conducted, and wrote a published work. CASRAI originated the CRediT contributor role taxonomy in 2014, and the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022, defining fourteen discrete contributor roles: Conceptualization, Data Curation, Formal Analysis, Funding Acquisition, Investigation, Methodology, Project Administration, Resources, Software, Supervision, Validation, Visualization, Writing – Original Draft, and Writing – Review & Editing.

    The taxonomy was designed for multi-author, multi-institution collaborations where credit disputes and hidden labour are real risks. A single-author case report has no such dispute to resolve — one person, by definition, performed every applicable role. Forcing all fourteen categories onto one or two names produces a statement that reads as padding rather than disclosure, which is precisely the awkward fit this template addresses.

    How do you write a single-author case report contribution statement?

    For a sole-author case report, the statement should confirm that the author meets the ICMJE authorship criteria in full, without listing categories that plainly do not apply (Software, Funding Acquisition, and Project Administration are the ones most often irrelevant to a single clinical case). The International Committee of Medical Journal Editors requires that every listed author:

    • Made a substantial contribution to the conception, design, acquisition, analysis, or interpretation of the case;
    • Drafted the work or revised it critically for important intellectual content;
    • Approved the final version for publication; and
    • Agreed to be accountable for all aspects of the work’s accuracy and integrity.

    A minimal, publication-ready example: “The author conceived the case report, collected and interpreted the clinical data, drafted the manuscript, approved the final version for submission, and agrees to be accountable for the accuracy and integrity of the work.” A CRediT-tagged variant works equally well: “Author Name: Conceptualization, Investigation, Writing – Original Draft, Writing – Review & Editing.” Both versions satisfy journal policy; the second is preferable where the target journal explicitly asks for CRediT-labelled statements rather than free text.

    How do you split CRediT roles between two authors in a case report?

    With two authors — commonly a treating clinician and a co-author handling the literature review or write-up — the statement should separate clinical-care roles from writing roles rather than duplicating the full taxonomy for each name. This keeps the statement honest: a supervising consultant who reviewed but did not draft the manuscript should not appear under Writing – Original Draft.

    | CRediT role | Typical applicability to a case report | Notes |
    | --- | --- | --- |
    | Conceptualization | Applies | Identifying the case as reportable |
    | Investigation | Applies | Clinical assessment, data gathering |
    | Writing – Original Draft | Applies | Usually one named drafting author |
    | Writing – Review & Editing | Applies | Supervising or co-author input |
    | Supervision | Rarely applies | Only where a senior author directed the case work |
    | Validation | Rarely applies | Relevant only if data required independent checking |
    | Data Curation | Rarely applies | Usually not distinct from Investigation in a case report |
    | Software, Funding Acquisition, Project Administration, Resources, Formal Analysis, Visualization, Methodology | Usually N/A | Omit rather than force-fit for a single case |

    Example two-author statement: “Dr A managed the patient, conceived the report, and revised the manuscript critically. Dr B conducted the literature review and drafted the manuscript. Both authors approved the final version and agree to be accountable for its accuracy.” Where a journal mandates CRediT labels specifically, the equivalent tagged form is: “Dr A: Conceptualization, Investigation, Writing – Review & Editing. Dr B: Investigation, Writing – Original Draft.” Here Investigation covers Dr A’s clinical assessment and Dr B’s literature review.

    Which journals require this, and in what format?

    Requirements vary by publisher, and case reports are frequently held to the same policy as full research articles even though the taxonomy was not built with them in mind. Elsevier requires a CRediT author statement for all research articles, including case reports, under its published CRediT author statement policy. JMIR treats the Authors’ Contributions section as optional but recommended, per guidance updated by JMIR Publications on 2 February 2026, while Springer Nature journals commonly request a free-text statement such as “all authors contributed to the study conception and design,” without mandating the full fourteen-role CRediT format.

    | Publisher / body | Statement required? | Format |
    | --- | --- | --- |
    | Elsevier | Mandatory | CRediT-tagged roles, degree-of-contribution optional |
    | Springer Nature | Mandatory (most journals) | Free-text narrative statement |
    | JMIR | Optional but recommended | Free-text narrative statement |
    | ICMJE (cross-publisher baseline) | Recommended policy, not a form | Four-criteria authorship test |

    The American Astronomical Society’s journals took the free-text route deliberately: when AASTeX v7.0 introduced Author Contribution sections, the society specified a free-form field “rather than a formulaic set of checkboxes,” precisely because a rigid taxonomy poorly serves papers with unusual author configurations — a principle that extends directly to minimal-author case reports.

    Common questions on author contribution statements

    How do you write an author contribution statement for a case report?

    State each named author’s role using plain, active verbs — conceived, collected, drafted, revised, approved — rather than the full CRediT list. Confirm every author meets all four ICMJE criteria; anyone who does not should move to the acknowledgements instead of the byline.
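
    The byline-versus-acknowledgements decision is a simple all-four test. The sketch below is a hypothetical checklist, not an ICMJE tool: the class name `IcmjeCheck`, its field names, and the example people (including "Dr C", who is invented for illustration) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class IcmjeCheck:
    """One person's answers to the four ICMJE authorship criteria."""
    name: str
    substantial_contribution: bool
    drafted_or_revised: bool
    approved_final_version: bool
    accepts_accountability: bool

    def qualifies(self):
        # Authorship requires all four criteria, not any one of them.
        return all((
            self.substantial_contribution,
            self.drafted_or_revised,
            self.approved_final_version,
            self.accepts_accountability,
        ))


def split_byline(people):
    """Return (byline, acknowledgements) as two lists of names."""
    byline = [p.name for p in people if p.qualifies()]
    acknowledgements = [p.name for p in people if not p.qualifies()]
    return byline, acknowledgements


team = [
    IcmjeCheck("Dr A", True, True, True, True),
    IcmjeCheck("Dr B", True, True, True, True),
    # Hypothetical colleague who supplied imaging but did not draft or revise.
    IcmjeCheck("Dr C", True, False, False, False),
]

byline, acknowledgements = split_byline(team)
print("Byline:", byline)
print("Acknowledgements:", acknowledgements)
```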

    How do you write an author’s contribution statement?

    Identify what each author actually did across conception, data work, drafting, and approval, then write one sentence per author naming those tasks. Use either free text or CRediT-tagged roles depending on the target journal’s house style, and have every author confirm the wording before submission.

    What are examples of author contributions?

    Common contribution categories include conceiving the study, acquiring or analysing data, drafting the manuscript, critically revising it, and supervising the work. The CRediT taxonomy formalises fourteen such categories, but a case report typically draws on only three or four of them.

    What is a contribution statement example?

    A minimal example: “The author conceived the case, gathered clinical data, drafted the manuscript, approved the final version, and agrees to be accountable for its accuracy.” This single sentence addresses all four ICMJE authorship criteria and works for any single-author case report regardless of specialty.

    What this means for case report authors and editors

    Journals and editorial offices reviewing minimal-author submissions should stop asking authors to populate all fourteen CRediT fields by default. A short, honest, ICMJE-aligned narrative — or a CRediT statement limited to the roles that genuinely applied — better serves both transparency and author time than a taxonomy stretched past its design case. Editors adopting free-text options, as AAS Journals did for astrophysics collaborations of any size, give case report authors a route that neither omits required disclosure nor manufactures roles that were never performed.

    As more publishers formalise contribution statements as a submission requirement rather than an optional courtesy, case report authors gain most by keeping the statement proportional: name every applicable role, omit the rest, and confirm ICMJE accountability explicitly rather than by implication.

  • Author Contribution Statement Examples in Review Articles

    Not all 14 CRediT roles apply to a review article. When a manuscript synthesises existing literature rather than collecting primary data, roles built around experiments, materials and datasets (Resources, Data Curation and, on a strict reading, Investigation) rarely fit, while Conceptualization, Methodology, Formal Analysis, Visualization and both Writing roles almost always do. An author contribution statement for a review article should map contributions to the roles the review actually required, not force every author into a role designed for empirical research.

    The Contributor Roles Taxonomy (CRediT) is a fourteen-role classification system used to describe, in a standardised author contribution statement, exactly what each named author did on a published work. CASRAI originated CRediT in 2014 as a response to opaque, order-of-authorship-only bylines; the taxonomy is now stewarded by NISO as ANSI/NISO Z39.104-2022, with the current definitions maintained at credit.niso.org.

    Which CRediT roles actually apply to a review article?

    Seven to nine of the fourteen CRediT roles map cleanly onto review-article work. Conceptualization covers who framed the review question and scope — always relevant, since every review starts from a defined aim. Methodology covers the design of the search strategy, inclusion/exclusion criteria and, for systematic reviews, the registered protocol.

    Formal Analysis applies wherever authors synthesise findings — statistically in a meta-analysis, thematically in a narrative review. Visualization covers PRISMA flow diagrams, forest plots and summary tables, which most reviews include. Writing – Original Draft and Writing – Review & Editing apply to every author who meets ICMJE’s drafting-or-revising criterion. Supervision, Project Administration and Funding Acquisition apply exactly as they would on any funded, multi-author output.

    Which roles rarely apply when there’s no primary data collection?

    Resources and Data Curation were written for empirical studies: provision of reagents, patients, instrumentation, or management of a generated dataset. A review that only reads and synthesises published sources produces no such materials, so these roles should usually be omitted rather than stretched.

    Software only applies if authors built bespoke code — for example a custom R script for a meta-analysis — not for using standard reference-management tools. Validation, defined by NISO as verifying reproducibility of results or experiments, has no primary experiment to verify in most narrative reviews, though it can legitimately apply to a systematic review’s dual-reviewer screening check.

    Investigation is the most commonly misapplied role in review contribution statements. NISO’s definition ties it to “performing the experiments, or data/evidence collection” — some editors accept that a systematic literature search and screening process counts as evidence collection, while others reserve Investigation strictly for primary data gathering. Because guidance is inconsistent across publishers, review teams should state explicitly what “Investigation” covers in their statement rather than assume a shared reading.

    | CRediT role | Typical fit for a review article | Note |
    | --- | --- | --- |
    | Conceptualization | Applies | Framing the review question and aims |
    | Methodology | Applies | Search strategy, protocol, screening criteria |
    | Investigation | Contested | Literature search sometimes counted, sometimes not |
    | Formal Analysis | Applies | Statistical or thematic synthesis |
    | Data Curation | Rarely applies | No generated dataset in most reviews |
    | Resources | Rarely applies | No materials, patients or instrumentation |
    | Software | Rarely applies | Only if bespoke analysis code was built |
    | Validation | Rarely applies | Occasional fit for dual-reviewer screening checks |
    | Visualization | Applies | PRISMA diagrams, forest plots, summary tables |
    | Writing – Original Draft | Applies | Always, for drafting authors |
    | Writing – Review & Editing | Applies | Always, for revising authors |
    | Supervision | Applies | Senior-author oversight |
    | Project Administration | Applies | Coordinating multi-reviewer teams |
    | Funding Acquisition | Applies | If the review was funded |

    Does it differ between narrative and systematic reviews?

    Yes. A systematic review generates far more CRediT-relevant activity than a narrative review because it follows a documented protocol. Formal database searching, dual-reviewer screening, a PRISMA flow diagram and, often, a meta-analysis all create genuine Methodology, Formal Analysis and Visualization contributions.

    A narrative review, by contrast, typically compresses most of the work into Conceptualization and the two Writing roles, since there is no registered protocol or formal extraction process to document separately. Authors of narrative reviews should resist copying a systematic-review template wholesale — an author contribution statement that lists Investigation, Validation and Data Curation for a narrative review with no protocol will look inflated to an editor who knows the difference.
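
    The fit ratings discussed above can double as a pre-submission check. In this sketch the ratings are transcribed from this article's own table, and the hypothetical function `roles_to_justify` returns any claimed role that is not rated as plainly applicable, so the team can either explain it or drop it. The ratings are this guide's judgement, not a NISO or publisher rule.

```python
# Fit ratings for review articles, as given in this article's table.
REVIEW_FIT = {
    "Conceptualization": "applies",
    "Methodology": "applies",
    "Investigation": "contested",
    "Formal Analysis": "applies",
    "Data Curation": "rarely applies",
    "Resources": "rarely applies",
    "Software": "rarely applies",
    "Validation": "rarely applies",
    "Visualization": "applies",
    "Writing – Original Draft": "applies",
    "Writing – Review & Editing": "applies",
    "Supervision": "applies",
    "Project Administration": "applies",
    "Funding Acquisition": "applies",
}


def roles_to_justify(claimed):
    """Return claimed roles rated 'contested' or 'rarely applies'.

    An unknown label raises KeyError, which also catches invented roles.
    """
    return [role for role in claimed if REVIEW_FIT[role] != "applies"]


narrative_review_claim = [
    "Conceptualization",
    "Investigation",
    "Validation",
    "Data Curation",
    "Writing – Original Draft",
]
print(roles_to_justify(narrative_review_claim))
```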

    How do you write the statement itself?

    Springer Nature’s author instructions explicitly accommodate reviews: where “discrete statements are less applicable,” the statement should still identify who had the idea for the article and who performed the literature search, even without a full role-by-role breakdown. JMIR’s author guidance is more direct: “Some roles won’t apply – each research output is different; if specific CRediT roles are not relevant to a particular output, they do not need to be included.”

    A practical three-author example for a systematic review:

    • Conceptualization: A.B. (lead), C.D. (supporting)
    • Methodology: A.B., C.D.
    • Formal Analysis: E.F.
    • Visualization: E.F. (lead), A.B. (supporting)
    • Writing – Original Draft: A.B. (lead), C.D. (supporting)
    • Writing – Review & Editing: A.B., C.D., E.F.
    • Supervision: A.B.

    Note what is absent: no Data Curation, Resources, Software or Validation, because none occurred. Under ICMJE’s authorship criteria, every named author must still meet all four conditions — substantial contribution, drafting or revising, final approval, and accountability — regardless of which CRediT roles they are assigned.
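
    Journals that want the statement grouped by author rather than by role need the list above pivoted. The sketch below does that for the three-author example; degree qualifiers are dropped for brevity, and the function name `pivot` is illustrative.

```python
# The three-author example, keyed by role (degree qualifiers omitted).
by_role = {
    "Conceptualization": ["A.B.", "C.D."],
    "Methodology": ["A.B.", "C.D."],
    "Formal Analysis": ["E.F."],
    "Visualization": ["E.F.", "A.B."],
    "Writing – Original Draft": ["A.B.", "C.D."],
    "Writing – Review & Editing": ["A.B.", "C.D.", "E.F."],
    "Supervision": ["A.B."],
}


def pivot(role_map):
    """Turn role -> [authors] into author -> [roles], preserving order."""
    author_map = {}
    for role, authors in role_map.items():
        for author in authors:
            author_map.setdefault(author, []).append(role)
    return author_map


by_author = pivot(by_role)
statement = "; ".join(
    f"{author}: {', '.join(roles)}" for author, roles in by_author.items()
)
print(statement)
```

    Because the pivot only rearranges what was entered, any role missing from the role-keyed list stays missing from the author-keyed statement.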

    Common questions about author contribution statements

    What is a contribution statement example?

    A contribution statement lists each author’s initials against the specific CRediT roles they performed, such as “A.B.: Conceptualization, Writing – Original Draft; C.D.: Formal Analysis, Writing – Review & Editing.” It replaces vague author-order assumptions with an explicit, auditable record.

    What is the author contribution statement in Springer?

    Springer Nature requires a statement of responsibility in every manuscript, including review-type articles, specifying each author’s contribution. For reviews where a full role-by-role breakdown does not fit, Springer still expects the statement to name who conceived the article and who conducted the literature search.

    How do you write an author contribution statement?

    List every author’s initials, then attach the CRediT roles that genuinely apply to their work on that specific manuscript, omitting roles that do not apply rather than padding the list. Corresponding authors are responsible for confirming the statement with every co-author before submission.

    What should substantial contributions include to be credited as an author?

    Per ICMJE, authorship requires a substantial contribution to conception or design, or to the acquisition, analysis, or interpretation of data, combined with drafting or critically revising the work, final approval, and accountability for its accuracy. Meeting only one criterion, or performing a single task such as literature searching alone, does not by itself satisfy authorship requirements.

    What this means for review authors and editors

    Review teams that copy a data-heavy CRediT template wholesale risk two failure modes: omitting genuine synthesis work under vague “Writing” credit, or inflating the statement with roles like Investigation and Data Curation that a careful editor will question. The more defensible approach is to start from the fourteen roles, keep the seven to nine that genuinely occurred, and treat the rest as not applicable to this output, in line with JMIR’s guidance that irrelevant roles need not be included.

    As more publishers formalise CRediT for review-type manuscripts under ANSI/NISO Z39.104-2022, expect journal instructions to increasingly distinguish narrative from systematic reviews in their contribution-statement guidance, closing the ambiguity that currently surrounds roles like Investigation. Until then, the safest practice for review authors is explicit scoping: name what each role means in this specific manuscript, rather than relying on definitions written for laboratory-based research.

  • Author Contribution Statement Frontiers Guide: What Open Peer Review Changes

    An author contribution statement for Frontiers is a mandatory, standardised disclosure — built on the CRediT taxonomy — that names each author’s initials against specific research tasks, placed just before the references. Because Frontiers also operates a collaborative, open peer review model in which reviewer identities are published alongside the article, that statement sits inside a visibly transparent record rather than behind a closed editorial process, raising the stakes for accuracy and completeness compared with journals that keep review closed.

    The Contributor Roles Taxonomy (CRediT) is a structured set of 14 standardised labels — from Conceptualization to Writing – Review & Editing — used to describe what each named author actually did on a manuscript, replacing vague free-text authorship blurbs with a checkable, comparable record.

    What does Frontiers require in an author contribution statement?

    Frontiers’ author guidelines make the Author Contributions Statement mandatory for every submission across its journal portfolio, including titles operated under Frontiers Partnerships. The statement must represent all named authors, briefly describe individual tasks, and identify each person by initials rather than full names — with a middle initial added where two authors share the same first and last initials (for example, REW and RSW).
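
    The initials rule lends itself to a small helper. This is a sketch of the rule as described here, not Frontiers’ own implementation: use first and last initials, and add middle initials only where two authors would otherwise collide. The author names are invented, and single-word names, hyphenated names, and collisions between authors without middle names are not handled.

```python
from collections import Counter


def short_initials(name):
    """First and last initial only, e.g. 'Ana Brandt' -> 'AB'."""
    parts = name.split()
    return (parts[0][0] + parts[-1][0]).upper()


def full_initials(name):
    """One initial per name part, e.g. 'Rosa Elena Ward' -> 'REW'."""
    return "".join(part[0] for part in name.split()).upper()


def statement_initials(names):
    """Map each name to initials, adding middle initials only on a clash."""
    clashes = Counter(short_initials(name) for name in names)
    return {
        name: full_initials(name) if clashes[short_initials(name)] > 1
        else short_initials(name)
        for name in names
    }


# Hypothetical authors chosen to reproduce the REW / RSW case.
authors = ["Rosa Elena Ward", "Ravi Sanjay Webb", "Ana Brandt"]
print(statement_initials(authors))
```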

    Practically, the submitting author enters each co-author’s contributions during the online submission process, and the system compiles them into the final statement, which is placed at the end of the manuscript, immediately before the References section. This mirrors the broader shift documented by publishers such as Elsevier and Wiley toward structured, submission-system-driven contribution capture rather than a free-text paragraph drafted after the fact.

    Frontiers’ authorship threshold is explicitly anchored to the International Committee of Medical Journal Editors (ICMJE) criteria: substantial contribution to conception or design, data acquisition, analysis or interpretation; drafting or critically revising the work; final approval of the version to be published; and agreement to be accountable for all aspects of the work. A CRediT-tagged contribution statement does not replace this authorship test — it documents what qualifying authors did, once they already qualify.

    What is CRediT, and where did it come from?

    CASRAI originated the CRediT contributor role taxonomy in 2014, in collaboration with journal publishers and research funders seeking a shared vocabulary for describing authorship work. The standard is now stewarded by NISO as ANSI/NISO Z39.104-2022, which is the current authoritative specification of the 14 roles and their definitions.

    Frontiers announced its adoption of CRediT on 20 July 2023, stating that the system “replaces the conventional free-text authorship descriptions with a standardized and transparent system that ensures consistency and accuracy in recognizing individual contributions.” Frontiers’ chief executive editor, Dr Frederick Fenter, framed the move as part of a wider commitment to openness within scholarly publishing.

    The 14 CRediT roles are:

    • Conceptualization
    • Data Curation
    • Formal Analysis
    • Funding Acquisition
    • Investigation
    • Methodology
    • Project Administration
    • Resources
    • Software
    • Supervision
    • Validation
    • Visualization
    • Writing – Original Draft
    • Writing – Review & Editing

    Each role can be assigned to more than one author, and a single author can hold multiple roles — the taxonomy is designed to reflect real research teams, where contributions overlap rather than divide neatly by job title.

    How does Frontiers’ open peer review model change the stakes?

    Frontiers runs a collaborative review process in which reviewers interact directly with authors during revision and reviewer names are published on the final article. That design choice matters for contribution statements: in a closed-review journal, an inaccurate or vague CRediT statement is checked, at most, by an anonymous editor and reviewers whose identities never surface. At Frontiers, the same statement sits on a page where the reviewers who scrutinised the work are named too, creating a fuller, mutually visible accountability chain from idea to publication.

    This does not mean reviewers audit CRediT tags line by line — Frontiers’ policy places that responsibility on the corresponding author — but it does mean the entire provenance record (who contributed what, and who reviewed it) is public and durable rather than partially hidden. For research integrity investigations, that visibility is a practical asset: a named reviewer trail alongside a role-based authorship record narrows the anonymity gap that closed models leave open.

    | Feature | Traditional closed peer review | Frontiers’ collaborative open review |
    | --- | --- | --- |
    | Reviewer identity | Anonymous to readers (and often to authors) | Published with the article |
    | Author contribution statement | Visible to readers, but reviewed only by an anonymous editor | Visible to readers alongside named reviewers who assessed the work |
    | Post-publication scrutiny | Contribution disputes are harder to trace to a specific review stage | Named reviewer record supports faster provenance checks |
    | Incentive for precision | Lower: statement rarely cross-checked publicly | Higher: statement sits next to a public, named review record |

    For research administrators advising on authorship disputes, this distinction is worth flagging explicitly: a Frontiers submission carries more public accountability infrastructure around a contribution statement than an equivalent closed-review journal, even though the CRediT taxonomy itself is identical across both.

    What does a compliant example look like?

    A CRediT-based Frontiers statement is typically compact (a handful of short, list-like sentences rather than a narrative paragraph) and uses initials throughout. A representative, compliant format:

    “AB: Conceptualization, Methodology, Writing – Original Draft. CD: Investigation, Formal Analysis, Visualization. EF: Data Curation, Software. GH: Supervision, Funding Acquisition. All authors contributed to the article and approved the submitted version.”

    Three points distinguish a compliant statement from a weak one:

    • Every named author appears at least once; omitting a listed author from the statement is a common reason for rejection at the submission-checklist stage.
    • Roles are drawn from the 14 standard CRediT labels, not invented descriptions (“helped with the project” is not a CRediT role).
    • The closing sentence confirming collective approval is retained; it speaks to the ICMJE’s third authorship criterion (final approval of the version to be published).
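
    The three checks above can be sketched as a small pre-submission script. This is an illustrative sketch, not a Frontiers tool: the function name `check_statement`, the assumption that initials are two to four capital letters, and the case-insensitive role matching are all choices made for this example.

```python
import re

# The 14 CRediT role labels, lower-cased so matching ignores capitalisation
# (an assumption of this sketch; the sample statement uses title case).
CREDIT_ROLES = {
    "conceptualization", "data curation", "formal analysis",
    "funding acquisition", "investigation", "methodology",
    "project administration", "resources", "software", "supervision",
    "validation", "visualization", "writing – original draft",
    "writing – review & editing",
}

APPROVAL_SENTENCE = (
    "All authors contributed to the article and approved the submitted version."
)


def check_statement(statement: str, author_initials: list[str]) -> list[str]:
    """Return a list of problems; an empty list means all three checks passed."""
    problems = []
    seen = set()
    # Each entry is assumed to look like "AB: Role, Role." (hypothetical format rule).
    for initials, role_text in re.findall(r"\b([A-Z]{2,4}):\s*([^.]+)\.", statement):
        seen.add(initials)
        for role in role_text.split(","):
            if role.strip().lower() not in CREDIT_ROLES:
                problems.append(f"{initials}: '{role.strip()}' is not a CRediT role")
    # Check 1: every named author appears at least once.
    for initials in author_initials:
        if initials not in seen:
            problems.append(f"{initials} is listed as an author but has no roles")
    # Check 3: the closing approval sentence is retained.
    if APPROVAL_SENTENCE not in statement:
        problems.append("closing approval sentence is missing")
    return problems


sample = (
    "AB: Conceptualization, Methodology, Writing – Original Draft. "
    "CD: Investigation, Formal Analysis, Visualization. "
    "EF: Data Curation, Software. GH: Supervision, Funding Acquisition. "
    + APPROVAL_SENTENCE
)
print(check_statement(sample, ["AB", "CD", "EF", "GH"]))
```

    A script like this only catches formal gaps. Whether the roles match what each author actually did remains a judgement for the corresponding author.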

    Common questions

    What is a contribution statement example?

    A contribution statement example lists each author’s initials against specific CRediT roles, such as “AB: Conceptualization, Writing – Original Draft.” It is a short, structured disclosure — typically two to five sentences — not a narrative account, and it appears at the end of the manuscript before the references.

    How do I write an author contribution statement?

    Assign each named author one or more of the 14 CRediT roles based on what they actually did, list contributions by initials, and add a closing line confirming all authors approved the submitted version. Frontiers’ online submission system compiles these entries automatically once authors provide them.
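
    For teams drafting a statement before it reaches the submission system, the same steps can be sketched in a few lines. The function name `build_statement` and the dictionary layout are hypothetical; this is not how Frontiers’ system compiles entries, only a way to produce a consistent draft.

```python
APPROVAL_SENTENCE = (
    "All authors contributed to the article and approved the submitted version."
)


def build_statement(assignments: dict[str, list[str]]) -> str:
    """Join 'initials: roles.' entries and append the closing approval sentence."""
    entries = []
    for initials, roles in assignments.items():
        if not roles:
            # Every named author needs at least one role.
            raise ValueError(f"{initials} has no roles assigned")
        entries.append(f"{initials}: {', '.join(roles)}.")
    return " ".join(entries + [APPROVAL_SENTENCE])


draft = build_statement({
    "AB": ["Conceptualization", "Writing – original draft"],
    "CD": ["Investigation", "Formal analysis"],
})
print(draft)
```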

    Do you have to pay to publish in Frontiers?

    Yes. Frontiers is a gold open-access publisher and levies an article processing charge (APC), but only after acceptance; no fee applies to rejected or withdrawn submissions. That fee transparency is consistent with the openness behind Frontiers’ published reviewer names and public contribution statements.

    Implications for authors and institutions

    Research offices advising authors on Frontiers submissions should treat the contribution statement as a document with two audiences at once: the editorial system checking ICMJE compliance, and a permanent public record sitting next to named reviewers. According to Frontiers Media’s own reporting on the Norwegian Scientific Index (NSD), 96 of its journals were listed in that register as of 2022 — a scale of output where standardised, auditable contribution data materially reduces the administrative burden of resolving authorship disputes after publication.

    Institutions building CRediT literacy into researcher training should note that the taxonomy’s value compounds under open models: a precise, role-based statement becomes machine-readable metadata that can feed ORCID records, funder reporting, and institutional repositories, not just a line in a PDF.

    Where this is heading

    As more publishers combine structured contributorship data with visible review provenance, the author contribution statement stops being a compliance formality and becomes part of a public integrity record. Frontiers’ pairing of mandatory CRediT statements with named, published reviewers is one live example of that shift — and a template other open-review adopters are likely to follow as funders and institutions push for fuller contributorship transparency.

    For the full 14-role reference and role definitions, see the CRediT taxonomy overview and the individual CRediT role pages. For the underlying authorship criteria that a contribution statement documents, see CASRAI’s authorship guidance.

  • Research Data Management Training: 4 Routes

    Research data management training covers the courses, workshops and certifications that prepare staff to plan, organise, preserve and share research data throughout its lifecycle. For someone moving into a data-steward or research-data-manager role, the realistic routes fall into four groups: institutional workshops, free self-paced courses such as MANTRA, the international CODATA-RDA Schools of Research Data Science, and formal professional certifications.

    Research data management (RDM) is the set of practices that govern how research data is created, documented, stored, preserved and shared so that it remains accurate, findable and reusable. There is no single mandatory qualification for the role; instead, institutions and funders recognise a mix of short courses, free online training and professional certifications as evidence of competence.

    What is research data management training?

    Research data management training teaches the practical skills behind the data curation lifecycle: planning, documentation, storage, preservation and sharing. It is aimed at researchers, librarians, IT staff and administrators who take on data-support responsibilities, not only at specialists with “data steward” already in their job title.

    Training providers range from single-day institutional workshops to multi-week international schools and multi-year professional certification pathways. The right choice depends on career stage, whether the goal is a specific role, and whether an employer or funder requires a recognised credential.

    Institutional workshops and short courses

    University-run workshops are the most common entry point. The Digital Curation Centre (DCC) runs a full-day Principles of Research Data Management workshop covering the data curation lifecycle, licensing, data sharing and secure storage; DCC prices the in-person session at £200 including VAT and includes materials, lunch and coffee breaks. Comparable sessions run through library and research-support teams at institutions including Queen’s University Belfast, the University of Liverpool and the University of Southampton, typically as half-day or full-day introductory sessions for research staff and postgraduate researchers.

    • Short, structured, and usually free or low-cost for staff at the host institution
    • Best suited to researchers and support staff who need a working overview rather than a credential
    • Often the first step before pursuing a certification or specialist school

    MANTRA and free self-paced courses

    MANTRA, developed by the University of Edinburgh’s Data Library, is a free online course for researchers and anyone managing digital data as part of a research project. It works through data management planning, organising and documenting data, storage and security, ethics and copyright, and data sharing, using a self-paced format that fits around existing work commitments.

    The UK Data Service published new introductory research data management course materials in January 2025, developed from workshops aimed at improving skills in managing, documenting, curating and sharing longitudinal research data. Other free options include the University of Edinburgh’s Research Data Management and Sharing MOOC and FOSTER Open Science’s introductory modules — all suited to self-directed learners building foundational RDM literacy before taking on formal steward duties.

    CODATA-RDA Schools of Research Data Science

    For a deeper, internationally recognised grounding, the CODATA-RDA Schools of Research Data Science are an intensive route. Since 2016, CODATA and the Research Data Alliance (RDA) have jointly run these schools, most commonly hosted at the Abdus Salam International Centre for Theoretical Physics (ICTP) in Trieste, alongside regional editions elsewhere. The programme typically runs as a two-week residential course covering data management, statistics, programming and reproducibility skills for early-career researchers, with a strong focus on participants from low- and middle-income countries.

    This route suits candidates who want a rigorous, research-council-recognised training block rather than a single workshop, and who can commit the time to an intensive residential format.

    Professional certifications for data stewards

    Where a role specifically requires a credential, two established professional certifications dominate. DAMA International’s Certified Data Management Professional (CDMP) is offered across four tiers, from Associate to Fellow, and covers the full range of data management disciplines including governance, quality, architecture and metadata — it is widely used as a baseline qualification for data managers moving from adjacent fields such as library science or IT.

    The EDM Council’s Certified Data Steward (CDS) programme formalises data stewardship as a distinct professional designation, testing the ability to apply data quality, governance and metadata-management concepts in practice. The ICCP’s Data Governance and Stewardship Professional (DGSP) credential offers a similar path through foundation-to-executive levels, evaluated through a combination of education, experience and examination.

    Comparing the four training routes

    Route | Format | Typical duration | Cost | Best for
    Institutional workshop (e.g. DCC) | In-person, single session | Half-day to full-day | Free–£200 | First exposure to RDM concepts
    MANTRA / UK Data Service / MOOCs | Self-paced online | A few hours to a few weeks | Free | Self-directed foundational learning
    CODATA-RDA Schools | Residential, intensive | Around two weeks | Application-based, often funded | Early-career researchers seeking depth
    CDMP / CDS / DGSP | Examination-based certification | Weeks to months of study | Paid, tiered | Formalising a data-steward job title

    Common questions about RDM training

    What is research data management?

    Research data management is the set of practices covering the entire data lifecycle — planning, collecting, storing, documenting, analysing, archiving and sharing data — designed to keep research data accurate, secure and reusable. It applies across disciplines and is increasingly required by funders and institutional policy, not just recommended good practice.

    What is the best certification for data management?

    There is no single “best” option; the right certification depends on career stage. DAMA International’s CDMP is the most widely recognised general credential, while the EDM Council’s Certified Data Steward designation is more targeted at staff whose job title is specifically data steward. Both are examination-based and tiered by experience level.

    What are the 5 pillars of data management?

    Most data governance frameworks, including those underpinning the CDMP and CDS syllabuses, group data management around data quality, data stewardship, data protection and compliance, data architecture, and data governance itself. Training routes vary in how much weight they give each pillar, so checking a course’s syllabus against these five areas is a useful sense-check.

    Choosing a pathway

    For institutions building RDM capacity, a blended approach works best: use free courses such as MANTRA to build baseline literacy across research and library staff, reserve CODATA-RDA School places for early-career researchers who need depth, and require a formal certification such as the CDMP or CDS only where a post is explicitly titled data steward or research data manager. This mirrors how the role sits within the broader research administration function, alongside compliance, funder liaison and governance duties.
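
    The blended approach above reduces to a simple rule of thumb, sketched here for a training coordinator’s planning spreadsheet or script. The function and parameter names are hypothetical, and the rule encodes only the three conditions named in the paragraph above.

```python
def recommend_routes(early_career_needs_depth: bool, steward_titled_post: bool) -> list[str]:
    """Map a staff member's situation to the training routes suggested above."""
    # Baseline literacy applies to all research and library staff.
    routes = ["Free self-paced course (e.g. MANTRA)"]
    if early_career_needs_depth:
        routes.append("CODATA-RDA School of Research Data Science")
    if steward_titled_post:
        routes.append("Formal certification (e.g. CDMP or CDS)")
    return routes


print(recommend_routes(early_career_needs_depth=False, steward_titled_post=True))
```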

    As funder data-sharing requirements tighten, expect more institutions to treat a recognised RDM credential as a minimum bar for steward-titled posts rather than a discretionary extra. Staff who combine a foundational course with a certification, and who understand the wider vocabulary of the field via resources such as CASRAI’s research-data glossary, will be best placed for that shift.

  • Data Sharing Agreement vs Data Processing Agreement: What Research Offices Get Wrong

    A data sharing agreement governs an exchange of personal data between two or more independent data controllers, while a data processing agreement is the contract that Article 28 UK GDPR makes mandatory whenever a controller instructs a processor to handle data on its behalf. Research offices most often need the former for multi-institution collaborations and the latter for any third-party processor, such as a survey platform, transcription service, or cloud host.

    A data sharing agreement is a contract between two or more data controllers who each independently decide how they will use a shared dataset. A data processing agreement is the contract GDPR Article 28 requires whenever a controller engages a processor that acts only on documented instructions and has no independent decision-making power over the data.

    What Is the Difference Between a Data Sharing Agreement and a Data Processing Agreement?

    The confusion research offices run into is structural, not semantic. The choice between a data sharing agreement and a data processing agreement always comes down to one question: who controls the data, and who merely acts on it? A data sharing agreement documents a controller-to-controller relationship. A data processing agreement documents a controller-to-processor relationship. Everything else (which clauses are mandatory, what liability attaches, whether the ICO expects to see it) follows from that single distinction.

    Two universities pooling pseudonymised cohort data for a joint publication are both controllers; they need a data sharing agreement. A university engaging a transcription service to convert interview recordings is the controller, and the vendor is the processor; that relationship needs a data processing agreement. The two documents are not interchangeable.

    Feature | Data Sharing Agreement (DSA) | Data Processing Agreement (DPA)
    Relationship | Controller to controller | Controller to processor
    Decision-making | Each party decides its own purposes and means | Processor acts only on the controller’s documented instructions
    Legal mandate | Not mandatory in itself, but the ICO’s statutory code treats it as expected good practice | Mandatory under Article 28 UK GDPR whenever a processor is engaged
    Typical research use | Multi-institution consortia, joint publications, shared registries | Survey platforms, transcription services, cloud hosting, statistical consultancies
    Governing source | ICO Data Sharing Code of Practice (statutory, under s.121 Data Protection Act 2018) | Article 28, UK GDPR / EU GDPR

    When Is a Data Sharing Agreement Required for a Research Collaboration?

    A data sharing agreement becomes necessary the moment two or more organisations — for example, two universities, a university and an NHS trust, or a university and an industry partner — each intend to use a shared dataset for their own research purposes. Under the ICO’s Data Sharing Code of Practice, a statutory code issued under section 121 of the Data Protection Act 2018, a formal agreement is not an absolute legal requirement, but the ICO expects one wherever routine or systematic sharing occurs between controllers, treating it as evidence of accountability under the UK GDPR.

    In practice, most collaborative research grants involving identifiable participant data — clinical cohorts, survey respondents, student records — should have a data sharing agreement in place before data changes hands, regardless of whether the grant terms mention one.

    When Is a Data Processing Agreement Legally Mandatory?

    Unlike a data sharing agreement, a data processing agreement is not discretionary. Article 28 of the UK GDPR requires a written contract wherever a controller uses a processor, and that extends down the chain: if a processor sub-contracts further, another written agreement is needed there too. For a research office, this covers any external service handling personal data on the institution’s instructions without deciding why or how it is used — a data-collection tool, a statistical analysis contractor, or a transcription vendor.

    A data processing agreement must specify the subject matter, duration, and purpose of processing, the categories of data and data subjects involved, each party’s rights and obligations, and the security and breach-notification terms the processor must meet. Missing any of these terms is a compliance gap, not a drafting preference.
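
    A research office reviewing vendor contracts could track these terms as a checklist. The sketch below assumes a draft agreement has been summarised as a dictionary; the key names are hypothetical labels for the terms listed in the paragraph above, not a complete statement of Article 28.

```python
# Hypothetical keys, one per term named in the paragraph above.
REQUIRED_DPA_TERMS = (
    "subject_matter",
    "duration",
    "purpose",
    "data_categories",
    "data_subject_categories",
    "rights_and_obligations",
    "security_measures",
    "breach_notification",
)


def missing_dpa_terms(draft: dict[str, str]) -> list[str]:
    """Return the required terms that are absent or blank in a draft summary."""
    return [term for term in REQUIRED_DPA_TERMS if not draft.get(term, "").strip()]


draft = {
    "subject_matter": "Transcription of interview recordings",
    "duration": "12 months",
    "purpose": "",
}
print(missing_dpa_terms(draft))
```

    An empty result shows only that each heading has some text against it; whether the wording is adequate still needs legal review.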

    Where Do Joint Controller Arrangements Fit?

    The case research offices most commonly mishandle is not DSA-versus-DPA at all — it is where two institutions jointly determine the purposes and means of processing one dataset, rather than each independently using their own copy. That relationship is governed by Article 26 UK GDPR, which requires a joint controller arrangement setting out each party’s responsibilities, particularly around data subject rights, and requires that the “essence” of that arrangement be made available to data subjects.

    This distinction matters for consortium research funded through instruments such as Horizon Europe, where the Model Consortium Agreement typically sits alongside — not instead of — any joint-controller documentation for personal data. UKRI-funded projects carry a parallel obligation: an approved data management plan is a standard grant condition, but it is a research-governance document, not a substitute for the GDPR-compliant contract.

    A data sharing agreement and a joint controller agreement are frequently confused because both involve multiple controllers. The dividing line is independence of purpose: if each party uses the data for its own separate research question, a data sharing agreement applies; if the parties jointly decide the purpose and means of one processing activity, Article 26 applies instead.

    Frequently Asked Questions

    What Is the Difference Between a DPA and a DSA?

    A DPA (data processing agreement) governs a controller-to-processor relationship and is mandatory under Article 28 UK GDPR. A DSA (data sharing agreement) governs a controller-to-controller relationship and is not strictly mandatory, but is expected best practice under the ICO’s statutory code wherever personal data moves between independent organisations.

    Is a DPA the Same as an NDA?

    No. A data processing agreement specifically governs how personal data is processed under GDPR, including security measures and sub-processor rules. An NDA (non-disclosure agreement) protects confidential information generally — trade secrets, unpublished results, commercial terms — and carries no GDPR obligations of its own. Research collaborations frequently need both, for different purposes.

    Does the UK Use GDPR or DPA?

    Both, and the shared acronym is itself a source of confusion. The UK operates the UK GDPR alongside the Data Protection Act 2018 (DPA 2018), which supplements it domestically. Research offices should note that “DPA” means something different in each context: the Data Protection Act 2018 is UK legislation, while a data processing agreement is the specific contract that Article 28 of the UK GDPR requires between a controller and a processor.

    What Is the Difference Between a DPA and a Data Sharing Agreement?

    The same core distinction applies: a data processing agreement binds a processor acting on a controller’s instructions, while a data sharing agreement binds two or more controllers each pursuing their own purposes. Signing the wrong one leaves a research office either over-contracting a simple vendor relationship or under-documenting a genuine controller-to-controller data exchange.

    A Decision Checklist for Research Offices

    Before drafting either document, a research office should establish:

    • Is the other party deciding independently how to use the data, or only following our instructions? Independent use points to a data sharing agreement; instruction-only use points to a data processing agreement.
    • Are two or more institutions jointly deciding the purpose and means of a single processing activity? If so, Article 26 UK GDPR joint controller terms apply, not a standard data sharing agreement.
    • Does the collaboration involve a funder-mandated data management plan (for example, under UKRI or Horizon Europe terms)? A data management plan complements but does not replace the GDPR-compliant contract.
    • Is any processor sub-contracting further processing? Each link in that chain needs its own written data processing agreement under Article 28.
    • Does the exchange involve special category data — health records, genetic data, criminal offence data? These generally raise the bar for documented lawful basis and security terms in either agreement type.
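
    The first two checklist questions determine which document applies, and that triage can be sketched as a small decision function. The names below are hypothetical, and the sketch covers classification only; the questions on data management plans, sub-processors and special category data affect what the chosen document must contain, not which one it is.

```python
from enum import Enum


class Agreement(Enum):
    DATA_PROCESSING = "data processing agreement (Article 28 UK GDPR)"
    JOINT_CONTROLLER = "joint controller arrangement (Article 26 UK GDPR)"
    DATA_SHARING = "data sharing agreement"


def classify(acts_only_on_instructions: bool, jointly_decides_purpose_and_means: bool) -> Agreement:
    """Pick the document type from the two relationship questions in the checklist."""
    if acts_only_on_instructions and jointly_decides_purpose_and_means:
        # A party cannot be both a processor and a joint controller for one activity.
        raise ValueError("answers are contradictory; re-check the relationship")
    if acts_only_on_instructions:
        return Agreement.DATA_PROCESSING
    if jointly_decides_purpose_and_means:
        return Agreement.JOINT_CONTROLLER
    # Independent controllers, each pursuing their own purposes.
    return Agreement.DATA_SHARING


# A transcription vendor acting on the university's instructions.
print(classify(acts_only_on_instructions=True, jointly_decides_purpose_and_means=False).value)
```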

    The Bottom Line for Research Administration

    Research offices that treat data sharing agreements and data processing agreements as interchangeable paperwork expose their institutions to two distinct risks: an unenforced Article 28 obligation with a processor, or an undocumented controller-to-controller exchange the ICO’s statutory code expects to see evidenced. Getting the classification right — controller-to-controller, controller-to-processor, or joint controller — determines which contract is legally required, which is merely good practice, and what each must contain. As multi-institution, multi-funder consortia become the norm, that classification step belongs at the front of every research office’s data governance workflow, alongside the project’s data management plan.

  • Data Papers Explained: Making Datasets Citable

    A data paper is a peer-reviewed journal article whose sole purpose is to describe a dataset — its collection methods, structure, quality controls and reuse potential — so the dataset itself becomes a citable, discoverable research output. This is fundamentally different from a data availability statement (DAS), which is only a short paragraph inside a conventional research article pointing to where supporting data can be found. Understanding the distinction matters for anyone trying to get formal academic credit for data curation work, rather than a passing mention buried in someone else’s paper.

    A data paper is best defined this way: it is a searchable, citable metadata document, published as a standalone peer-reviewed article, whose primary content is the dataset’s provenance, structure and quality rather than a hypothesis or a set of conclusions.

    What is a data paper?

    A data paper is a description of a dataset published as an article in a peer-reviewed journal, rather than as an appendix to a conventional study. It concentrates on the “what, why and how” of the data itself (collection methodology, processing steps, structure and known limitations) rather than on testing a hypothesis.

    The format is also known as a data article, data report, data brief or data note, but the function is consistent: it converts curation effort into an indexed, citable scholarly output that gives dataset creators formal academic credit.

    How is a data paper different from a data availability statement?

    A data availability statement is a short paragraph, mandatory at many journals, within a conventional research article that tells readers where and how to access the data underpinning that paper’s findings. It exists to support transparency and reproducibility of one specific study; it is not a publication in its own right and it is not independently peer reviewed as a scholarly document.

    A data paper, by contrast, is a full standalone publication. It undergoes its own peer review, receives its own DOI, and is indexed and cited independently of any related research article. The table below sets out the practical differences.

    Feature | Data paper | Data availability statement
    Nature | Standalone, peer-reviewed journal article | A short section inside another article
    Peer review | Independently peer reviewed as a scholarly work | Not separately reviewed
    Citability | Has its own DOI and citation record | Not citable as a discrete work
    Purpose | Describe and credit a dataset in depth | Point readers to where data for one study lives
    Typical length | Several pages, structured like a journal article | One to three sentences

    Since 2018, the International Committee of Medical Journal Editors (ICMJE) has required a data sharing statement in reports of clinical trials, and many funders, including UKRI, expect a data access statement in any grant output. Neither requirement is a substitute for a data paper: a DAS satisfies a transparency mandate, while a data paper is the route to scholarly recognition and independent citation of the dataset itself.

    Which journals publish data papers?

    Dedicated data journals have grown substantially since the mid-2010s. According to the Global Biodiversity Information Facility (GBIF), which tracks outlets accepting data papers, article processing charges and impact metrics vary widely by publisher.

    • Scientific Data (Nature Portfolio) — an open-access, online-only journal dedicated to descriptions of scientifically valuable datasets, with a 2024 Journal Impact Factor of 6.9 and an article processing charge of approximately EUR 1,790, per GBIF’s June 2026 tracked figures.
    • Data in Brief (Elsevier) — a multidisciplinary, open-access journal publishing short data articles that describe and give context to datasets, with a 2024 Journal Impact Factor of 1.4 and an article processing charge of approximately USD 1,010.
    • GigaByte (BGI and Oxford University Press) — a CC BY open-access journal for “big data” descriptions across the life, biomedical and environmental sciences, with a 2024 Journal Impact Factor of 1.2, a Scopus CiteScore of 3.2, and an article processing charge of approximately USD 350 — the lowest of the three.

    Discipline-specific alternatives exist too: Earth System Science Data (Copernicus) carries a 2024 CiteScore of 20.6, and Biodiversity Data Journal (Pensoft) charges from around EUR 650. Choice of outlet should follow disciplinary norms, not price alone.

    How do you publish a data paper?

    Publishing a data paper follows a broadly consistent workflow across data journals:

    1. Deposit the dataset first. Upload the data to a recognised repository (for example Dryad, Zenodo or a domain-specific archive) so it receives a persistent identifier before the manuscript is submitted.
    2. Draft the manuscript around the metadata. Describe collection methods, instrumentation, processing pipelines, quality-control steps and known limitations — some tools, such as GBIF’s Integrated Publishing Toolkit, can auto-generate a manuscript draft directly from dataset metadata.
    3. Select a journal matched to the dataset’s discipline. Compare scope, licence terms, and article processing charge against outlets such as Scientific Data, Data in Brief or GigaByte.
    4. Submit for peer review. Reviewers assess the completeness and reusability of the description, not novel findings or conclusions.
    5. Publish and cross-link. On acceptance, the data paper’s DOI should be cross-referenced with the dataset’s own DOI in the repository record, so citation tools can connect the two.
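
    The cross-linking in step 5 is usually expressed as related-identifier metadata on each record. The sketch below builds the two entries using the `relatedIdentifier`, `relatedIdentifierType` and `relationType` properties of the DataCite Metadata Schema, with the `IsDescribedBy` and `Describes` relation types. The function name, the output layout and both DOIs are hypothetical placeholders, and in practice the repository and the journal usually enter these links through their own deposit forms.

```python
def cross_link(dataset_doi: str, paper_doi: str) -> dict[str, dict[str, str]]:
    """Build reciprocal related-identifier entries for a dataset and its data paper."""
    return {
        # Added to the dataset's repository record: points at the data paper.
        "dataset_record": {
            "relatedIdentifier": paper_doi,
            "relatedIdentifierType": "DOI",
            "relationType": "IsDescribedBy",
        },
        # Added to the data paper's record: points back at the dataset.
        "paper_record": {
            "relatedIdentifier": dataset_doi,
            "relatedIdentifierType": "DOI",
            "relationType": "Describes",
        },
    }


# Placeholder DOIs for illustration only.
links = cross_link("10.1234/example.dataset", "10.1234/example.datapaper")
print(links["dataset_record"])
```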

    Why do data papers matter for FAIR data and citation?

    The FAIR Guiding Principles — Findable, Accessible, Interoperable, Reusable — were formalised by Wilkinson and colleagues in a 2016 Scientific Data paper and now underpin funder and repository policy internationally. A data paper operationalises FAIR by attaching a structured, human- and machine-readable description to a dataset that would otherwise carry only minimal repository metadata.

    Dataset citation is governed by the Joint Declaration of Data Citation Principles, published by FORCE11 in 2014, which holds that data merits the same importance, persistence and formal citation treatment as literature. Registration agencies such as DataCite assign the DOIs that make this mechanically possible; a data paper gives readers the narrative context a bare DOI record cannot.

    Frequently asked questions

    What is a data paper?

    A data paper is a peer-reviewed journal article whose primary purpose is describing a dataset’s collection, structure and quality, rather than reporting findings. It gives dataset creators an indexed, independently citable scholarly output.

    How do you publish a data paper?

    Deposit the dataset in a recognised repository, draft a manuscript describing its methodology, choose a journal such as Scientific Data, Data in Brief or GigaByte, then submit for peer review that assesses completeness rather than novel conclusions.

    Do you have to pay to publish a data paper?

    Most data journals are open access and charge an article processing charge, ranging from roughly USD 350 at GigaByte to around EUR 1,790 at Scientific Data. Some outlets, including several Pensoft and Copernicus titles, waive or reduce this fee.

    Implications for institutions and funders

    For research administrators, the data paper format offers a concrete way to evidence data-curation effort in tenure, promotion and grant-reporting processes, where a bare data availability statement provides none. Recording named contributions to data creation, curation and description alongside the CRediT contributor role taxonomy gives institutions a fuller, auditable account of who did the data work, distinct from who wrote up the findings.

    Funders increasingly expect both: a data availability statement in the primary research article to satisfy transparency mandates, and — where a dataset has independent reuse value — a data paper to secure its long-term discoverability. Research administrators managing compliance across these overlapping requirements may find it useful to consult a dictionary of research administration terms when mapping funder policy language to practical author guidance.

    Conclusion

    A data paper and a data availability statement solve different problems: one creates a citable, peer-reviewed scholarly record of a dataset; the other simply discloses where supporting data for a specific study can be found. As funders tighten open-data expectations and repositories mature their DOI infrastructure, treating dataset description as a first-class, citable publication — not an afterthought bolted onto a results paper — will matter more, not less, for institutions seeking to demonstrate the full value of the research data they steward.

  • Research Data Manager Job Description, Skills and Career Path

    A research data manager plans, organises and safeguards the data a research project produces — from collection through documentation, storage, sharing and long-term archiving — and is distinct from a data steward (governance-focused) or a research administrator (grants and compliance-focused). The role sits at the intersection of research support, information management and IT, typically inside a university’s library, research office or a funded project team.

    This guide sets out the research data manager job description, the skills and qualifications employers ask for, how the role differs from adjacent titles, and the realistic career path from entry-level data support through to strategic data leadership.

    What is a research data manager?

    A research data manager is the named individual responsible for a project’s or department’s data management plan, metadata standards and repository deposits. The role exists because funders increasingly require a documented, reusable dataset alongside every publication, not just the paper itself.

    The task is not new: it maps closely to the Data curation contributor role in the CRediT taxonomy, defined as “Management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later re-use.” CASRAI originated the CRediT contributor role taxonomy in 2014; the standard is now stewarded by NISO as ANSI/NISO Z39.104-2022, and Data curation remains one of its 14 defined roles, evidence that the function research data managers perform has been formally recognised in scholarly attribution for over a decade.

    What does a research data manager do day to day?

    Day-to-day work centres on making a project’s data findable, well-documented and safely stored, and on keeping those workflows repeatable for the next study. Typical duties, drawn from published UK university and NHS job descriptions, include:

    • Drafting and reviewing data management plans (DMPs) for grant applications
    • Setting up and maintaining databases, spreadsheets and case report forms for a study
    • Applying metadata standards so datasets are discoverable in institutional or subject repositories
    • Coordinating deposit of datasets with DataCite-registered DOIs for citation and reuse
    • Running data quality checks, version control and access permissions across a research team
    • Training researchers and doctoral students in good data management practice
    • Advising on compliance with funder data policies and data protection legislation
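
    As an illustration of the data quality checks listed above, the following is a minimal sketch in Python rather than any institution's actual procedure. The function name, the field names and the sample records are all hypothetical; the sketch flags only two basic problems, blank required values and duplicated identifiers.

```python
from collections import Counter


def find_quality_issues(rows, id_field, required_fields):
    """Return (row_number, message) tuples for basic quality problems.

    rows: list of dicts, one per record (for example from csv.DictReader).
    Checks performed: blank required values and duplicated identifiers.
    """
    issues = []
    # Count how often each identifier appears so duplicates can be flagged.
    id_counts = Counter(row.get(id_field, "") for row in rows)
    for number, row in enumerate(rows, start=1):
        for field in required_fields:
            # Treat None, empty strings and whitespace-only strings as missing.
            if not str(row.get(field, "") or "").strip():
                issues.append((number, f"missing value for '{field}'"))
        record_id = row.get(id_field, "")
        if record_id and id_counts[record_id] > 1:
            issues.append((number, f"duplicate {id_field} '{record_id}'"))
    return issues


# Hypothetical records, for illustration only.
sample = [
    {"participant_id": "P001", "site": "A", "consent_date": "2024-01-10"},
    {"participant_id": "P002", "site": "", "consent_date": "2024-01-11"},
    {"participant_id": "P001", "site": "B", "consent_date": "2024-01-12"},
]

for row_number, message in find_quality_issues(
    sample, "participant_id", ["site", "consent_date"]
):
    print(row_number, message)
```

    In practice a check like this would sit alongside version control and access permissions rather than replace them; it only shows the kind of rule a research data manager might automate.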

    Research data manager vs data steward vs research administrator

    These three titles are frequently confused in job adverts because responsibilities overlap, but their primary focus and reporting line differ. The table below distinguishes the three roles as they typically appear in UK higher education and research institutions.

    Dimension | Research Data Manager | Data Steward | Research Administrator
    Primary focus | Lifecycle management of a specific project’s or department’s datasets | Institution-wide data governance, quality rules and ownership policy | Grant administration, compliance and researcher support
    Typical base | Research office, library or funded project team | IT services, information governance or central data office | Research office, faculty or funder-facing team
    Core output | Data management plans, metadata, repository deposits | Data policies, classification schemes, access controls | Grant applications, contracts, financial and ethics reporting
    Professional body | Often affiliated with library/data-curation networks | Information governance and data protection networks | ARMA (UK/Ireland), EARMA, INORMS, NCURA
    Typical entry route | Data science, library/information studies, life sciences degree | IT governance, information management background | Any discipline plus research administration training

    What skills, qualifications and training are required?

    Employers look for a combination of technical data skills and domain and communication skills, since the role requires translating funder and disciplinary requirements into practical workflows researchers will actually follow.

    • Data handling: spreadsheet and database competence; SQL, Python or R are increasingly listed as desirable
    • Standards knowledge: metadata schemas, DataCite, ORCID identifiers, and repository deposit workflows
    • Policy literacy: UK GDPR, funder data policies, and institutional research governance frameworks
    • Communication: training researchers, writing plain-English guidance, negotiating with study sponsors
    • Project management: running parallel studies to funder deadlines with limited resource

    Formal training routes include postgraduate qualifications in library and information science or data science, plus shorter dedicated courses. The Digital Curation Centre (DCC), funded by Jisc, has provided UK universities with research data management guidance and training resources since 2004 and remains the primary UK reference point for RDM practice. Institutional RDM obligations trace back to funder policy: EPSRC’s research data expectations, effective from 1 May 2015, require UK institutions receiving its funding to publish a research data management policy and a roadmap for compliance. The 2016 Concordat on Open Research Data — jointly published by Research Councils UK, Universities UK, Wellcome Trust and HEFCE — set out ten principles establishing that data management planning should be integral to research design, reinforcing why institutions now hire dedicated staff for this function rather than leaving it to individual researchers.

    What is the typical career path and salary range?

    Entry typically begins in a data assistant or data curator post supporting a research team’s day-to-day data handling, often on a fixed-term contract tied to a specific study. Real UK job postings illustrate the entry tier clearly: an NHS Research Data Manager post advertised in May 2025 by Midlands Partnership NHS Foundation Trust was graded at Agenda for Change Band 4, with a salary of £26,530 to £29,114 a year.

    Progression moves through Research Data Manager (owning DMPs and repository workflows for a department or portfolio of studies) to Senior/Lead Research Data Manager, where the postholder sets institutional RDM policy and may supervise a small team. The most senior tier — Director of Research Data Services or equivalent — sets strategic direction for an institution’s entire research data infrastructure and reports into the research office or library leadership. Unlike research administration, a PhD is not a standard requirement at any tier, though it is common among staff who progress from a research role into data management.

    Common questions about the role

    What are the responsibilities of a data manager?

    A data manager is responsible for the entire data lifecycle: collection, quality control, storage, security, documentation and eventual archiving or disposal. In a research context this extends to writing data management plans, applying metadata standards, and coordinating repository deposit so datasets remain reusable after a project ends.

    What does a research data manager do?

    A research data manager develops and implements the policies, workflows and documentation that keep a project’s or department’s datasets organised, secure and discoverable. Duties include drafting data management plans, training researchers, running quality checks, and depositing data with persistent identifiers such as DataCite DOIs for citation and reuse.

    What is the salary of a data manager?

    Salaries vary widely by sector and seniority. A UK NHS-graded entry-level research data manager post advertised in 2025 sat at Agenda for Change Band 4, paying £26,530–£29,114 a year; senior and director-level research data roles in universities and industry command substantially higher salaries, reflecting added strategic and line-management responsibility.

    What are the 4 types of research data?

    Research data is commonly grouped into four types: primary data (collected directly for the study), secondary data (reused from existing sources), quantitative data and qualitative data. The first pair describes where the data came from; the second pair describes its format. A research data manager must apply appropriate metadata, storage and sharing rules to each type, since funder and ethical requirements differ across them.

    What this means for institutions and job seekers

    For institutions, confusion between the job descriptions of research data manager, data steward and research administrator is itself a risk: unclear scoping leads to duplicated effort or gaps in funder compliance. Writing role descriptions that reference recognised frameworks (the CRediT Data Curation role, DCC guidance, and funder RDM policy) gives hiring managers a defensible, standards-aligned specification rather than an ad hoc list of duties.

    For job seekers, the clearest differentiator to lead with on an application is lifecycle ownership of data, not general IT or administrative competence. As funders continue tightening open-data mandates, demand for staff who can demonstrate metadata standards knowledge, repository deposit experience and DMP authorship is likely to keep outpacing supply, making this one of the more durable specialisms within the broader research administration and support ecosystem.

    For related roles and standards context, see CASRAI’s CRediT contributor roles hub, the research administration dictionary, and the research administration pillar.

  • ADR UK Explained: Administrative Data Access for Social Scientists

    ADR UK (Administrative Data Research UK) is a UK-wide partnership that gives accredited researchers secure access to de-identified, linked government administrative data — held not in a conventional downloadable repository, but inside supervised Trusted Research Environments (TREs). For social scientists, this matters because it is a distinct access route: the data never leaves government custody, and the researcher, not the dataset, is what gets vetted and admitted.

    ADR UK is a partnership of four national bodies — ADR England, ADR Scotland, ADR Wales and ADR Northern Ireland — together with the Office for National Statistics (ONS), coordinated by a UK-wide Strategic Hub and funded by the Economic and Social Research Council (ESRC), part of UK Research and Innovation (UKRI).

    What is ADR UK?

    ADR UK is the mechanism by which public sector administrative data — records originally collected for tax, benefits, education, health or justice administration, not for research — is linked, de-identified and made available for social science research in the public interest. It commissions flagship linked datasets, funds research using them, and maintains a public data catalogue describing what is available and to whom.

    The partnership operates under the Digital Economy Act 2017, which created the legal gateway allowing UK government bodies to share de-identified data with accredited researchers for statistical research purposes. This is the statutory basis that distinguishes ADR UK access from a voluntary data-sharing agreement between two universities.

    How does ADR UK access differ from conventional repository deposit?

    Most research data infrastructure — repositories, DataCite-indexed archives, institutional data stores — is built around deposit and download: a dataset is prepared, described with metadata, and released for reuse under a licence. ADR UK’s model inverts this. The data is never released to the researcher’s own machine; instead, the researcher is admitted into a controlled environment where the data already resides.

    This is best understood as “FAIR-adjacent” rather than FAIR-compliant in the open-repository sense: the data is findable (via the catalogue) and, under approval, accessible, but interoperability and reusability are deliberately constrained by design, because the underlying records are personal and sensitive at source. The table below maps the three routes UK researchers commonly encounter.

    Route | Access model | Typical data | Governing framework
    ADR UK | Supervised Trusted Research Environment (TRE); no download | Linked cross-government administrative data (education, benefits, justice, tax) | Digital Economy Act 2017; Five Safes
    NHS Secure Data Environments | Supervised SDE; “dissemination by exception” | NHS health and social care records | NHS England’s 2022 Secure Data Environment policy
    UK Data Service | Deposit/download under end-user licence | Social surveys, census, cross-national socioeconomic data | ESRC-funded repository terms
    The practical consequence for a social scientist: an application to ADR UK is an application for supervised admission to a workspace, not a request for a file transfer.

    What is the Five Safes model and what is a Trusted Research Environment?

    ADR UK access is governed by the Five Safes model, a risk-management framework originally developed by the ONS and now used across UK administrative data infrastructure, including NHS Secure Data Environments. It manages disclosure risk across five dimensions rather than relying on a single control.

    • Safe people — only accredited, trained researchers gain access.
    • Safe projects — proposals are approved for public benefit and ethical soundness.
    • Safe data — records are de-identified before linkage.
    • Safe settings — analysis happens only inside a Trusted Research Environment, a monitored, non-internet-connected computing environment.
    • Safe outputs — every result is disclosure-checked before it can leave the TRE.

    Each of the four UK nations operates its own TRE, accessed in person at a designated safe location or via a secure remote connection, using approved statistical software such as R, Python, SPSS or Stata.

    Who is eligible, and how does accreditation work?

    Eligibility runs through the researcher, not the institution. Under the Digital Economy Act 2017 accreditation process, an applicant must complete Safe Researcher Training and pass an assessment before an accreditation panel will approve them; this status is valid for five years. Accreditation alone does not grant data access — a specific research project must then be separately approved against public-benefit, feasibility and ethics criteria before a TRE account is issued.

    For institutions supporting early-career or interdisciplinary social scientists, this two-stage gate (accredit the person, then approve the project) is the single most common point of delay administrators should plan for, since neither step can be skipped or run in parallel with data linkage preparation.
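
    For administrators building a tracking sheet or planning tool, the two-stage gate described above can be sketched as a simple check. This is an illustrative sketch only: the class and function names are hypothetical, and the rule that accreditation lapses on the same calendar day five years later is an assumption, not a statement of how the accreditation panel calculates expiry.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

ACCREDITATION_YEARS = 5  # validity period stated in the text above


@dataclass
class Researcher:
    name: str
    accredited_on: Optional[date]  # None means not yet accredited


def accreditation_expiry(accredited_on: date) -> date:
    """Date on which accreditation lapses (assumed: same day five years on)."""
    target_year = accredited_on.year + ACCREDITATION_YEARS
    try:
        return accredited_on.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return accredited_on.replace(year=target_year, day=28)


def can_issue_tre_account(researcher: Researcher, project_approved: bool, today: date) -> bool:
    """Both gates must pass: a current accreditation AND an approved project."""
    if researcher.accredited_on is None:
        return False
    if today >= accreditation_expiry(researcher.accredited_on):
        return False
    return project_approved


# Hypothetical researcher, for illustration only.
applicant = Researcher("A. Researcher", date(2024, 3, 1))
print(can_issue_tre_account(applicant, project_approved=False, today=date(2025, 1, 15)))  # False
print(can_issue_tre_account(applicant, project_approved=True, today=date(2025, 1, 15)))   # True
```

    The first call returns False even though the researcher is accredited, which mirrors the point that accreditation alone does not grant data access.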

    How is ADR UK funded and governed?

    ADR UK began as an ESRC investment running from July 2018. In September 2020, UKRI, the Department for Business, Energy and Industrial Strategy and HM Treasury approved £15.3 million for the 2021/22 financial year — the first year of a planned five-year investment. In September 2021, the remaining £90.12 million of that investment was secured from UK government to extend the programme to March 2026. In July 2025, UKRI confirmed a further £168 million investment to continue the programme beyond 2026, securing its next phase.

    Governance sits with the UK-wide Strategic Hub, which coordinates the four national partnerships, engages with government departments to secure data access agreements, and administers the dedicated research grant fund — distinct from the accreditation function, which remains with the statutory panel under the Digital Economy Act 2017.

    Frequently asked questions

    Is ADR UK the same thing as “alternative dispute resolution”?

    No. ADR UK in a research-administration context refers exclusively to Administrative Data Research UK, the government-data access partnership described here. “ADR” also commonly abbreviates alternative dispute resolution in a legal context — an unrelated field covering mediation and arbitration — and searchers should check context before assuming which meaning applies.

    What kind of data does ADR UK provide access to?

    ADR UK provides access to linked, de-identified administrative data generated by government departments — including education records, benefits and employment data, and justice-system data — rather than data collected specifically for research, such as surveys. Its public data catalogue and flagship datasets list what is currently available to accredited researchers.

    Is ADR UK data FAIR or open access?

    ADR UK data is not open access and is only FAIR-adjacent: it is findable through the catalogue and accessible to accredited, approved researchers, but it cannot be freely downloaded, reused or redistributed, because the source records are personal and disclosive. Outputs, not raw data, are what eventually leave the Trusted Research Environment.

    How long does the ADR UK access process take?

    Timelines vary, but researchers should expect two sequential approval stages: Safe Researcher Training and accreditation first, then a separate project-specific approval before a Trusted Research Environment account is issued. Institutions should budget for both stages when planning grant timelines, since data linkage itself begins only after project approval.

    What this means for research administrators and institutions

    For institutions supporting quantitative social science, ADR UK access is a compliance and planning question as much as a technical one. Research offices should treat Safe Researcher Training and accreditation as a standing institutional capability, built into PhD and postdoctoral training pipelines, rather than a one-off hurdle discovered mid-grant. Because accreditation is personal, portable and valid for five years, institutions that pre-accredit staff gain a durable advantage in bidding for ADR UK-linked funding calls.

    The broader signal is that “FAIR-adjacent” access, governed by statute and a risk framework rather than a licence, is becoming a parallel track alongside conventional repository deposit — one that other data-holding sectors, including health, are converging on through NHS Secure Data Environments. Research administrators who understand both tracks are better placed to route projects to the correct infrastructure the first time.

  • UK Data Service vs ICPSR: Choosing an Archive

    The UK Data Service and ICPSR are the two largest social-science data archives in the English-speaking research world, and the right choice usually depends on jurisdiction and funder mandate rather than feature parity. The UK Data Service is the ESRC-funded national repository for UK social, economic and population data, while ICPSR is a US-based, membership-funded consortium archive at the University of Michigan. Researchers outside the biomedical repository ecosystem — where PubMed-linked mandates dominate — need to weigh deposit workflow, restricted-access tiers and citation practice before picking either as a home for a dataset.

    The UK Data Service is the largest digital repository for quantitative and qualitative social science and humanities research data in the United Kingdom, formed in October 2012 when the Economic and Social Research Council (ESRC) consolidated the UK Data Archive — established at the University of Essex in 1967 — with several university partners. ICPSR, by contrast, is a membership consortium of academic and research institutions that has archived social and behavioural science data since 1962. Both are listed in re3data.org, the global Registry of Research Data Repositories, and both hold CoreTrustSeal certification for trustworthy digital repositories.

    What Are the UK Data Service and ICPSR?

    The UK Data Service is a national data repository funded through UKRI’s Economic and Social Research Council (ESRC) and led by the UK Data Archive at the University of Essex, in partnership with the University of Manchester, Jisc, EDINA and University College London. It holds more than 6,000 datasets, including UK Census data, the Labour Force Survey, the Millennium Cohort Study and cross-national surveys such as the European Social Survey.

    ICPSR — the Inter-university Consortium for Political and Social Research — is a membership-funded archive based at the University of Michigan, serving several hundred member institutions worldwide alongside non-member depositors and users. Its holdings span large-scale US and international surveys, criminal justice, education and ageing data, and it runs openICPSR as a self-publishing companion repository for rapid dissemination.

    How Do Deposit Workflows Compare?

    Both archives run a curated deposit model rather than an unreviewed upload box: staff review documentation, check disclosure risk and enhance metadata before release. The UK Data Service’s ESRC funding creates a contractual hook (grant holders are required to offer their data for archiving as a condition of the ESRC Research Data Policy) which ICPSR’s membership model does not replicate for non-US funders.

    • UK Data Service: two routes — the main curated collection for large, complex or sensitive studies, and ReShare, a lighter self-deposit repository for smaller datasets, code and syntax files.
    • ICPSR: two routes — the standard curated deposit process, and openICPSR, a self-publishing repository for researchers who want faster turnaround with lighter-touch review.

    Depositors submitting to either service should expect a documentation checklist covering variable-level metadata, consent and ethics evidence, and a data management plan — the same categories UKRI and NSF grant terms typically require regardless of which archive receives the deposit.

    How Do Restricted-Access Tiers Differ?

    Access tiering is where the two services diverge most for researchers working with confidential or disclosive social-science data. The UK Data Service operates a published three-tier model; ICPSR uses a comparable but differently named structure built around its Virtual Data Enclave.

    Access dimension | UK Data Service | ICPSR
    Open tier | No registration; Open Government Licence data | Public-use files via free MyData account
    Standard tier | Safeguarded: registration plus End User Licence | Member-institution access under consortium terms
    Restricted tier | Controlled: SecureLab, requiring accredited-researcher training under the Five Safes Framework | Restricted-use data via secure Virtual Data Enclave or encrypted physical media, subject to a data security plan
    Governance standard | Accredited under the Digital Economy Act 2017 by the UK Statistics Authority (2020) | Institutional Review Board and data-use-agreement based review

    The UK Data Service’s Five Safes Framework — safe people, projects, settings, data and outputs — was developed with HMRC DataLab and the Office for National Statistics Secure Research Services, and now underpins the SafePod Network launched in 2021 for wider geographical access to sensitive data. ICPSR’s restricted-data pathway achieves an equivalent security outcome through its enclave model but does not use the Five Safes terminology, which matters for UK researchers writing data management plans against ESRC or UKRI templates that reference it explicitly.
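
    When writing a data management plan, it can help to treat the UK Data Service tiers from the table above as a lookup of prerequisites. The sketch below is one possible way to do that; the requirement labels are paraphrased from the table, and the names `Tier`, `REQUIREMENTS` and `outstanding_steps` are hypothetical, not part of any UK Data Service tool.

```python
from enum import Enum


class Tier(Enum):
    OPEN = "Open"
    SAFEGUARDED = "Safeguarded"
    CONTROLLED = "Controlled"


# Prerequisites per tier, paraphrased from the comparison table.
REQUIREMENTS = {
    Tier.OPEN: [],
    Tier.SAFEGUARDED: ["registration", "End User Licence"],
    Tier.CONTROLLED: ["accredited-researcher training", "SecureLab access"],
}


def outstanding_steps(tier, completed):
    """List the prerequisites for `tier` that are not yet in `completed`."""
    return [step for step in REQUIREMENTS[tier] if step not in completed]


print(outstanding_steps(Tier.OPEN, set()))                     # []
print(outstanding_steps(Tier.SAFEGUARDED, {"registration"}))   # ['End User Licence']
```

    A plan that names the tier and lists the outstanding steps gives reviewers a concrete access timeline rather than a general statement of intent.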

    How Do Citation Practices Compare?

    Both archives assign persistent identifiers and expect formal data citation, but their machinery differs. The UK Data Service works with DataCite and the British Library to issue DOIs and promotes an easy-to-use citation tool, framing its approach around the FAIR data principles — Findable, Accessible, Interoperable, Reusable — and its open-source QAMyData tool, which gives depositors a health check for numeric data before release.

    ICPSR similarly issues persistent identifiers for deposited studies and expects citation in publications that reuse its data, but its emphasis sits more on bibliography-style study citations tied to its own numbering system than on a dedicated public FAIR-compliance tool. For researchers publishing in journals that enforce data-availability statements — a growing requirement under funder open-science mandates — the practical difference is smaller than the access-tier gap: both produce a citable, resolvable record, but only the UK Data Service publishes a named QA tool for pre-citation data quality.
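
    To show what a citable, resolvable record looks like in practice, here is a minimal sketch that assembles a dataset citation from its parts. The element order (creators, year, title, publisher, DOI link) is an assumption based on common data citation practice, not the output of either archive's citation tool, and the study details and DOI are placeholders.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Build a single-line dataset citation ending in a DOI resolver link."""
    authors = "; ".join(creators)
    return f"{authors} ({year}). {title}. {publisher}. https://doi.org/{doi}"


# Hypothetical study details and a placeholder DOI, for illustration only.
citation = format_data_citation(
    creators=["Smith, J.", "Patel, R."],
    year=2024,
    title="Example Household Survey, 2022-2023",
    publisher="Example Data Archive",
    doi="10.1234/example-doi",
)
print(citation)
```

    Depositors should always copy the citation the archive itself supplies for a study; the sketch only illustrates which elements a data-availability statement typically needs.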

    Which Archive Should Researchers Outside Biomedicine Choose?

    For most projects the decision is jurisdictional rather than qualitative. Letting the funder mandate drive the repository choice removes ambiguity immediately: ESRC-funded UK researchers must offer data to the UK Data Service, while NSF- or NIH-adjacent US social-science grants more commonly point toward ICPSR or openICPSR.

    • Choose the UK Data Service if your funder is UKRI/ESRC, your data concerns UK administrative, census or longitudinal panel data, or you need SecureLab/Five Safes access to controlled government microdata.
    • Choose ICPSR if your institution is a consortium member, your data is US-focused or cross-national with US partners, or you want the faster openICPSR self-publishing route.
    • Consult both catalogues before depositing internationally comparable survey data (e.g. European Social Survey, Eurobarometer) — coverage overlaps, and the UK Data Service can facilitate UK-based access to ICPSR holdings.
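
    The bullets above can be read as a short decision rule. The sketch below encodes them in Python for illustration; the function name, the parameters and especially the order in which the rules are applied (funder mandate first) are assumptions, and a real decision should follow the grant terms rather than this helper.

```python
UK_FUNDERS = {"UKRI", "ESRC"}


def suggest_archive(funder, uk_data=False, needs_securelab=False,
                    icpsr_member=False, us_focused=False):
    """Return (suggestion, reason), applying the funder mandate first."""
    if funder.upper() in UK_FUNDERS:
        return ("UK Data Service", "funder mandate")
    if uk_data or needs_securelab:
        return ("UK Data Service", "UK data or SecureLab access")
    if icpsr_member or us_focused:
        return ("ICPSR", "consortium membership or US focus")
    return ("Consult both catalogues", "no rule applies")


print(suggest_archive("ESRC"))
print(suggest_archive("NSF", us_focused=True))
print(suggest_archive("Other"))
```

    Putting the mandate check first reflects the point that the decision is usually jurisdictional: the remaining rules only matter when no funder requirement settles the question.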

    Institutions building or reviewing a data management plan should treat repository choice as a compliance question first and a discoverability question second: a technically excellent dataset deposited in the wrong repository for its funder mandate creates avoidable rework at grant closeout.

    Answer-First Questions Researchers Ask

    What Is the UK Data Service?

    The UK Data Service is the ESRC-funded national repository for UK economic, population and social research data, led by the UK Data Archive at the University of Essex. It holds over 6,000 datasets, including census, survey and longitudinal study data, and operates under the OAIS digital-preservation reference model.

    How Do You Access Data on the UK Data Service?

    Access runs through three published tiers: Open data requiring no registration, Safeguarded data requiring registration and an End User Licence, and Controlled data requiring SecureLab accreditation under the Five Safes Framework. Most researchers start with the free data catalogue and register once they identify a specific study.

    Is the UK Data Service Free?

    Yes — the service is free to data owners depositing studies and free at the point of use for non-commercial research and teaching. Commercial users may incur administrative fees, and controlled-tier access requires accredited-researcher training rather than a monetary charge.

    Implications for Research Administrators

    Institutional research offices, ARMA- and INORMS-aligned research administrators, and funder compliance teams reviewing data management plans increasingly treat repository choice as an auditable field, not a footnote. A UK-funded study archived outside the UK Data Service without documented justification can trigger ESRC compliance queries at final reporting; a US consortium study left undeposited with ICPSR can weaken an institution’s case for renewed membership funding. Neither archive competes with domain-specific biomedical repositories governed by NISO, ICMJE or COPE norms; this comparison sits squarely in the national data repository space for social science, distinct from that ecosystem.

    As open-science mandates from UKRI, cOAlition S and equivalent US funders converge on FAIR-by-default expectations, the operational gap between the UK Data Service and ICPSR is narrowing to jurisdiction, access-tier terminology and citation tooling rather than underlying trustworthiness: both hold CoreTrustSeal certification and both are listed in re3data.org, the registry of recognised repositories that funders now check by default.