CASRAI Dictionary

Tag: CRediT

Mentorship as a CRediT role: pro and con
The CRediT Supervision role is broad. The role definition reads: Oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team. The definition bundles mentorship into Supervision, which raises the question: should mentorship be a CRediT role of its own? This post lays out the arguments on both sides and proposes a mid-path.

The case for

Three arguments for a dedicated Mentorship role.

First, visibility. Mentorship is intellectually substantial, time-consuming work that CRediT contributorship statements currently render largely invisible. A senior researcher who mentored an early-career colleague through the discovery, the writing, and the navigation of peer review has contributed significantly to the paper; the current taxonomy captures this only through the catch-all Supervision role, which also covers project oversight, an activity quite different in character.

Second, career-stage equity. Mentorship is most often delivered by mid-career and senior researchers to early-career ones, and under the current taxonomy it mostly goes unrecorded. Making it a CRediT role would help correct the under-recognition of mid-career mentorship work in promotion and tenure decisions. The mentorship and career stages domain at CASRAI tracks the assessment-side implications.

Third, distinction from supervision. Project supervision (the senior researcher with PI responsibility) and mentorship (the senior or peer researcher who guided a junior contributor’s development through the work) are different activities. Bundling them into one role loses the distinction. A paper where the PI did the supervision and a separate mid-career colleague did the mentorship has a contributorship structure that current CRediT cannot express cleanly.

The case against

Three arguments against.

First, taxonomic stability. CRediT has held to 14 roles deliberately. Each addition raises cognitive load and risks the taxonomy becoming unusable through over-specification. Liz Allen and the original CRediT designers have consistently argued that the taxonomy gains value from being small enough to use; adding a Mentorship role pushes against this.

Second, boundary problems. What distinguishes mentorship from supervision, from teaching, from co-authorship, from collaboration? The lines are real but fuzzy. A senior colleague who reviewed the draft and suggested major revisions is doing Writing – review & editing; the same colleague who guided the junior author through how to think about the discovery is doing mentorship; in practice the activities overlap. A role that requires reviewers to distinguish them may produce more noise than signal.

Third, recognition versus contribution. CRediT is a contributorship taxonomy, describing what people did on the paper. Mentorship is broader than per-paper contribution; it is a sustained relationship that spans many papers and many years. Capturing per-paper mentorship in CRediT may be the wrong instrument; a separate mentorship-recognition mechanism (in narrative CVs, in promotion dossiers, in institutional mentorship programmes) may fit better.

A proposed mid-path

We propose a mid-path that addresses the visibility and equity concerns without expanding the CRediT role count.

First, clarify the Supervision definition. The current definition bundles mentorship with project leadership. The two could be separated within the existing role through definitional refinement: the role description could be revised to recognise mentorship explicitly as a sub-activity within Supervision, with guidance on when each is being discharged. This is a low-cost intervention that does not require a new role.

Second, add a structured qualifier for Supervision. The existing degree-of-contribution qualifier already provides lead/equal/supporting. A sub-qualifier indicating whether the Supervision was project-oriented, mentorship-oriented, or both, would add the granularity without adding a role. This is a small schema change with substantial value.
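
As a rough illustration of how small the schema change could be, the sketch below models a role assignment that carries both the existing degree qualifier and the proposed sub-qualifier. The field name supervision_focus and its three values are hypothetical; nothing here is part of CRediT or of any published schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Degree(Enum):
    LEAD = "lead"
    EQUAL = "equal"
    SUPPORTING = "supporting"


class SupervisionFocus(Enum):
    # Hypothetical sub-qualifier values; not part of CRediT today.
    PROJECT = "project"
    MENTORSHIP = "mentorship"
    BOTH = "both"


@dataclass(frozen=True)
class RoleAssignment:
    contributor: str
    role: str
    degree: Optional[Degree] = None
    supervision_focus: Optional[SupervisionFocus] = None

    def __post_init__(self):
        # The sub-qualifier is only meaningful on the Supervision role.
        if self.supervision_focus is not None and self.role != "Supervision":
            raise ValueError("supervision_focus applies only to Supervision")


def describe(a: RoleAssignment) -> str:
    """Render one assignment as a line of a contributorship statement."""
    parts = [q.value for q in (a.degree, a.supervision_focus) if q is not None]
    suffix = f" ({', '.join(parts)})" if parts else ""
    return f"{a.contributor}: {a.role}{suffix}"


pi = RoleAssignment("PI", "Supervision", Degree.LEAD, SupervisionFocus.PROJECT)
mentor = RoleAssignment("Mentor", "Supervision", Degree.SUPPORTING, SupervisionFocus.MENTORSHIP)
print(describe(pi))
print(describe(mentor))
```

Under these assumptions, the PI and the mid-career mentor from the earlier example both hold Supervision, and the statement still tells them apart.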

Third, build the recognition layer outside CRediT. The narrative-CV format, mentorship-specific recognition programmes, and institutional career-development frameworks should carry mentorship recognition at a sustained-relationship granularity that CRediT cannot. The mentorship recognition that early-career researchers most value is not the per-paper notation; it is the cumulative recognition of mentorship across a career. The CASRAI institutional mentorship guide walks through the recognition options.

What CRediT v2026.3 should do

Our recommendation for the v2026.3 revision discussion: do not add a Mentorship role; do refine the Supervision definition to recognise mentorship explicitly; do add a sub-qualifier capturing the project/mentorship/both distinction; do coordinate with the narrative-CV and institutional-recognition communities to ensure that the cumulative mentorship recognition picture is captured outside CRediT.

This is the position we lean toward, with the explicit acknowledgment that reasonable people disagree. The discussion at the December 2025 CRediT stewardship meeting was substantive; the community consultation through 2026 will be the place to settle it. The CASRAI CRediT governance page tracks the consultation process and welcomes input from the broader community.

A broader observation

The mentorship question is one instance of a broader pattern. CRediT, as a per-paper contributorship taxonomy, captures certain things well and certain things less well. The work that spans papers (sustained mentorship, leadership of a community, contribution to standards, infrastructure stewardship) does not fit naturally into a per-paper taxonomy. The right response is not to expand CRediT to cover everything but to build complementary recognition mechanisms for what CRediT does not capture.

This is the argument running through the responsible-assessment community, the narrative-CV adoption push, and the CoARA reform agenda. CRediT is part of the picture, not the whole picture. A senior researcher’s contribution profile is captured by CRediT statements on their papers, by their narrative CV, by their teaching record, by their mentorship record, by their service to the community. The integrated picture is the goal; CRediT is one component.

Practical recommendations

Three for institutions. First, capture mentorship in your institutional records and recognition systems; do not wait for it to be a CRediT role. Second, train promotion-and-tenure committees to read mentorship contribution explicitly when reviewing dossiers. Third, support narrative-CV formats that surface mentorship.

Three for researchers. First, claim your mentorship contribution in narrative CVs and professional records; do not depend on per-paper CRediT to capture it. Second, in CRediT statements, use Supervision appropriately and consider noting the mentorship dimension in the prose contribution statement that accompanies the structured CRediT. Third, contribute to the CRediT consultation if you have a view on the question.

Three for the CRediT stewardship community. First, run the v2026.3 consultation transparently and document the outcomes. Second, coordinate with the responsible-assessment community on the broader recognition picture. Third, treat the question of taxonomic expansion as a serious one with substantive trade-offs, not as a routine update.

May 20, 2026
CRediT for AI-generated content: where the line is
The ICMJE 2023 position is settled: artificial-intelligence systems cannot be authors. The follow-on question that journals and authors continue to negotiate is how to represent, in a contributorship statement, the human work that goes into producing AI-assisted content. When a co-author prompts an LLM to draft a section, verifies the output, edits it, and stands behind it, which CRediT role describes their contribution? This post proposes a working line.

The shape of the question

Three scenarios make the question concrete.

Scenario one: an author uses an LLM to polish prose in a draft they wrote. The intellectual content is theirs; the language is partly the model’s. The CRediT role is straightforwardly Writing – original draft for the author.

Scenario two: an author uses an LLM to draft a first version of a section, which they then heavily revise. The first draft is the model’s; the final draft is the author’s, but the model substantively shaped what the final draft says. The CRediT role is still Writing – original draft for the author, but the contribution is meaningfully different from scenario one.

Scenario three: an author uses an LLM to propose a study design, which they then refine. The intellectual content of the methodology was partly the model’s. The CRediT role is Methodology for the author, but again the contribution is meaningfully different from the unaided version.

In all three, the human author is the role-holder; the model is not a co-author. What is different across the scenarios is the magnitude and the character of the human contribution. CRediT, as currently constituted, does not distinguish these.

The working line

Our proposed line is the verification-and-responsibility threshold. A human contributor who has substantively verified the AI-generated content, taken responsibility for it, and is prepared to defend it in correspondence or post-publication discussion is properly credited with the relevant CRediT role. The role describes what they contributed to the paper, which includes verification work even if the first-draft work was the model’s.

A contributor falls below the line where the human contribution is insubstantial: someone who pasted a prompt, accepted the output without verification, and added their name to the paper has not discharged the role and should not be credited. This is the same line that has always applied to non-AI cases (a co-author who did not contribute should not be credited; gift authorship is a well-recognised failure mode).

The line is therefore not about AI use per se; it is about whether the human contribution clears the substantive-contribution threshold. AI use does not displace the threshold; it changes what discharging the role looks like in practice.

Disclosure runs parallel

The disclosure of AI use is a separate question, addressed via publisher-mandated AI disclosure declarations. The disclosure says what tools were used and for what; the CRediT statement says who contributed what to the paper. The two run parallel and are both required by most major publishers in 2026. The CASRAI AI disclosure for authors guide walks through the publisher-by-publisher requirements.

Implications for specific roles

Writing – original draft

The most common case. A human author whose draft was AI-assisted is properly credited with Writing – original draft if they verified the content, took responsibility for it, and produced the version that is the paper. The disclosure declaration says the AI was used; the CRediT statement names the human as the writer-of-record.

Methodology and Formal analysis

More delicate. If an AI-assisted statistical-discovery tool proposed a method or an analytic approach, the human contributor’s role is partly verification (was the proposal sound?) and partly extension (refining the proposal into the actual method). The CRediT role is still Methodology and/or Formal analysis for the human, but the verification dimension is foregrounded. If the human did not verify — accepted the AI proposal without independent assessment — the contribution is weaker and may not clear the role threshold.

Investigation

A subtle case. AI-assisted data extraction (e.g., from imaging, from medical records, from text corpora) involves a human contribution that runs from setup through verification to interpretation. Investigation includes the data-gathering activity; an AI-assisted version still has a human Investigation lead, who is responsible for the setup, the verification of extracted data, and the handling of errors.

Validation

Perhaps the most directly affected. Where AI tools are used for cross-checking, sensitivity analyses, or reproduction of results, the human Validation contributor is responsible for setting up the validation, interpreting its results, and acting on discrepancies. The AI does the mechanics; the human does the judgement.

Visualization

AI-assisted figure generation is increasingly common. The human Visualization contributor is responsible for the figure-design decisions, for verifying that the AI-generated figure accurately represents the data, and for the final version that appears in the paper. Where the AI generated an image that the human did not substantively verify, the threshold may not be cleared.

The role-as-recognition trap

A failure mode to flag explicitly. The temptation, when AI did most of the actual production work, is to inflate the human contributor’s role assignment to compensate. “The AI wrote the draft, but I prompted it, so I should still be Lead on Writing – original draft.” This is a misreading. The CRediT role is a description of contribution; if the human contribution was “prompted and accepted”, that is a smaller contribution than “drafted, verified, revised, took responsibility.” Calling both “Lead” obscures the difference.

The remedy is the degree-of-contribution qualifier. A human contributor whose AI-assisted contribution was substantial may be Lead; one whose contribution was lighter may be Supporting. The qualifier discipline forces an honest assessment of magnitude.

Where this leaves the AI-assistance-role question

We have argued elsewhere that a 15th CRediT role explicitly for AI assistance is worth considering. The argument from this post is partly orthogonal: the existing 14 roles can accommodate AI-assisted work if the verification-and-responsibility threshold is honoured and the qualifier is used honestly. The case for a 15th role rests on whether the structured disclosure-of-AI-use is better placed inside the contributorship statement or outside it. Reasonable people disagree; we lean toward keeping AI disclosure parallel to CRediT rather than inside it, with attention to the verification-and-responsibility line.

Practical recommendations

Three for authors. First, treat AI assistance as a tool, not a substitute. Verify, edit, and take responsibility for what appears in the paper. Second, assign CRediT roles based on what you contributed including verification, not based on what the AI produced. Third, disclose AI use in the publisher-mandated declaration; the disclosure runs parallel to CRediT, not inside it.

Three for editors. First, treat the verification-and-responsibility threshold as the operating standard for AI-assisted contributorship. Second, require both the CRediT statement and the AI-use disclosure at submission. Third, where a contributorship statement looks like it may reflect AI-assistance role inflation, ask the standard editorial question: what did this contributor actually do?

Three for the broader system. First, harmonise AI-disclosure formats across publishers (work the NISO and COPE community has begun). Second, maintain the contributorship-versus-disclosure separation; do not collapse them. Third, evaluate the case for a 15th CRediT role on its merits, including the costs of taxonomic expansion.

May 18, 2026
How the Software role applies to code-only outputs
A growing fraction of research output is code: software libraries that implement a method, computational notebooks that demonstrate an analysis, simulation frameworks that enable a body of work, infrastructure tooling that supports a research community. When the output is primarily code, the CRediT Software role carries weight that the role’s brief definition does not fully prepare it for. This post is a practical guide to assigning Software in code-centric contexts.

The Software role, briefly

The CRediT Software role is defined as: Programming, software development; designing computer programs; implementation of the computer code and supporting algorithms; testing of existing code components. The definition is short and was written with software-as-tool-for-a-paper in mind, not software-as-the-paper.

For a conventional research paper where someone wrote analysis code that supported the science, Software is straightforward: the person who wrote the analysis code gets the role. For a paper whose primary scholarly contribution is the code itself — a JOSS paper, a software-methods paper, a tool announcement — Software is the dominant role and the brevity of the definition starts to bite.

What the Software role should cover in a code-only context

Our recommendation, distilled from the practice of JOSS, the Software Sustainability Institute, the Research Software Engineers community, and several years of CASRAI editorial work, is to read Software in code-only contexts as encompassing the following five sub-activities, all of which should be visible in the contributorship statement even if they share the role.

Implementation: writing the production code itself. This is the core of Software and is what people most naturally associate with the role.

Architecture and design: the higher-level decisions about how the code is structured, what its dependencies are, how its modules interact. In a code-only paper, architecture is part of the intellectual contribution and the architect should be a co-author with the Software role.

Testing: writing the test suite, including unit tests, integration tests, and regression tests. A code-only paper with a credible test suite has someone who built it.

Documentation: user-facing documentation, developer-facing documentation, README, examples, tutorials. For code intended for reuse, documentation is part of the deliverable; the documentation contributor gets the Software role.

Packaging and release: the engineering work of making the code installable, citable, and citation-resolvable. CI/CD configuration, dependency management, release-tagging, DOI registration. For long-lived code with multiple releases, this is sustained work; for a one-off code release accompanying a paper, it is still non-trivial.

Each of these is meaningful contribution that the Software role captures. A code-only paper’s CRediT statement should make the distribution of these activities across contributors visible, using the lead/equal/supporting qualifier to express relative magnitude.
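
One way to make that distribution visible is to record each Software contribution as a contributor, sub-activity, and degree, then render one line per contributor. The sketch below assumes short labels for the five sub-activities; those labels are this post's shorthand, not CRediT terms, and the output format is only illustrative.

```python
from collections import defaultdict

# Shorthand labels for the five sub-activities described above (not CRediT terms).
SUB_ACTIVITIES = ("implementation", "architecture", "testing", "documentation", "packaging")
DEGREES = ("lead", "equal", "supporting")


def software_statement(entries):
    """Render (contributor, sub_activity, degree) triples as one line per contributor."""
    by_person = defaultdict(list)
    for contributor, activity, degree in entries:
        if activity not in SUB_ACTIVITIES:
            raise ValueError(f"unknown sub-activity: {activity}")
        if degree not in DEGREES:
            raise ValueError(f"unknown degree: {degree}")
        by_person[contributor].append(f"{activity} ({degree})")
    # Dicts preserve insertion order, so contributors appear in first-seen order.
    return [
        f"{person}: Software [{'; '.join(items)}]"
        for person, items in by_person.items()
    ]


entries = [
    ("A. Author", "implementation", "lead"),
    ("A. Author", "architecture", "lead"),
    ("B. Author", "testing", "lead"),
    ("B. Author", "documentation", "supporting"),
]
for line in software_statement(entries):
    print(line)
```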

Where Software overlaps with other roles

Three overlaps deserve attention.

First, Software versus Methodology. If the code implements a novel method, the method itself is a Methodology contribution; the implementation is a Software contribution. The same person often discharges both, and the contributorship statement should assign both roles to them. The error to avoid is conflating the two: assigning Software while omitting Methodology under-represents the intellectual contribution.

Second, Software versus Validation. Writing tests is Software (per the definition); validating the code against reference implementations or independent data is Validation. The distinction is genuine: tests verify that the code does what the developer intended; validation verifies that the code does what is scientifically correct. Both belong in a code-only paper’s contributorship.

Third, Software versus Writing – original draft. The README, the developer documentation, the API reference — these are documentation, captured under Software. The paper itself, including its method description and its discussion of design choices, is captured under Writing – original draft. The boundary is the publication artefact: anything in the paper is Writing; anything in the code repository is Software.

Cross-referencing with CITATION.cff

The CITATION.cff convention, increasingly standard in scientific software repositories, provides a richer contributor model than CRediT alone. CFF supports author, contact, and contributor entries with type-of-contribution fields; integrators have extended it with CRediT-aligned vocabularies. The recommended pattern for a code-only paper is to maintain both: a CRediT statement in the paper (for the paper-level contributorship) and a CITATION.cff in the repository (for the per-version, per-component contributorship that CRediT cannot express).

The two should be consistent. A contributor named in the paper with the Software role should appear in the CITATION.cff with at least equivalent contribution; a contributor named in the CITATION.cff but not in the paper should be acknowledged in the paper’s acknowledgements section. The CASRAI CITATION.cff entry walks through the integration patterns.
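
A consistency check of this kind is easy to script. The sketch below assumes the CITATION.cff has already been parsed into a dictionary (for example with a YAML library) and that the paper's CRediT statement is available as a mapping from contributor name to role names. It reads only the standard CFF authors list with its given-names and family-names keys. Matching people by exact name string is a simplifying assumption; a persistent identifier such as ORCID would be more robust.

```python
def cff_author_names(cff: dict) -> set:
    """Collect 'Given Family' names from the CFF `authors` list."""
    names = set()
    for person in cff.get("authors", []):
        given = person.get("given-names", "")
        family = person.get("family-names", "")
        full = f"{given} {family}".strip()
        if full:
            names.add(full)
    return names


def compare_records(paper_roles: dict, cff: dict) -> dict:
    """paper_roles maps contributor name -> list of CRediT role names."""
    in_cff = cff_author_names(cff)
    software_in_paper = {n for n, roles in paper_roles.items() if "Software" in roles}
    return {
        # Holds Software in the paper but is absent from the repository record.
        "missing_from_cff": sorted(software_in_paper - in_cff),
        # Listed in the repository but not named in the paper at all.
        "acknowledge_in_paper": sorted(in_cff - set(paper_roles)),
    }


# Hypothetical example data.
paper_roles = {
    "Ana Example": ["Software", "Methodology"],
    "Ben Sample": ["Writing – original draft"],
}
cff = {
    "authors": [
        {"given-names": "Ben", "family-names": "Sample"},
        {"given-names": "Cy", "family-names": "Placeholder"},
    ]
}
print(compare_records(paper_roles, cff))
```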

The maintenance question

An unresolved aspect of Software in code-only contexts is how to credit maintenance over time. A research software package may have a paper at first release, with a CRediT statement reflecting the founding contributors. Five years and several major versions later, the package has new maintainers, new contributors, and a substantially different code base. The original paper’s CRediT statement is increasingly out of date.

The current pragmatic answer is: the paper’s CRediT statement freezes at publication; the CITATION.cff in the repository tracks current contributorship; downstream citation should reference both, with the paper as the publication-of-record and the CFF as the current-contributor record. This works but is imperfect. The Software Citation Working Group has been chewing on whether per-version CRediT statements, deposited to Crossref via the related-identifier mechanism, would be a cleaner answer; the proposal is technically viable but not yet a community consensus.

What journals should do

For journals publishing software papers, the recommended editorial practices are: require CRediT with qualifiers in the paper; require a CITATION.cff in the linked repository; verify that the two are consistent; for major software packages, accept and publish supplementary contributor records that go beyond the byline.

JOSS is the most mature reference point here, and most other software-paper venues are moving toward similar practices. The CASRAI CRediT for software papers guide is updated quarterly with current practice.

What authors should do

For authors of code-only papers, four practical steps. First, distribute the Software role across the five sub-activities visibly, using the qualifier. Second, assign Methodology when the code implements a novel method. Third, maintain the CITATION.cff in the repository in parallel with the paper’s CRediT statement. Fourth, plan for the maintenance-credit question: who will maintain the code, how their contribution will be recognised over time, where the credit will live.

The CRediT taxonomy can support code-only outputs well, with attention. The work is in using the Software role thoughtfully, in interlocking it with Methodology and Writing where appropriate, and in maintaining the parallel record in the repository.

May 15, 2026
Three CRediT misuses we see in submitted papers
CASRAI’s editorial network includes journal editors who handle CRediT statements daily, and we periodically aggregate the patterns of misuse they see. Three failures recur across disciplines, journal sizes, and submission systems. None are scandalous; all are correctable with attention. This post catalogues them with concrete examples and the editorial responses that work.

Failure one: role inflation

Role inflation is the most common CRediT failure by a wide margin. It is the practice of assigning every author every role, or near-every role, regardless of what they actually did. A typical inflated statement reads like a litany: Author A: Conceptualization, Methodology, Investigation, Formal analysis, Data curation, Writing – original draft, Writing – review & editing, Visualization, Supervision, Funding acquisition, Project administration. Author B: Conceptualization, Methodology, Investigation, Data curation, Writing – review & editing. Author C: Conceptualization, Methodology, Writing – review & editing. Every author is conceptualisation-positive; every author methodology-positive; every author writing-positive.

The pattern is recognisable and almost always wrong. In a typical team, not every author conceived the study. Not every author designed the method. Not every author wrote the original draft. Role inflation reflects a misunderstanding of what CRediT is for: it treats the role assignment as a credit allocation (the more roles you have, the more credit you get), when CRediT is a description of contribution. As Liz Allen and the original CRediT designers have made explicit, the taxonomy is meant to record what each contributor actually did, not to maximise their visible role count.

The editorial fix

Editors increasingly push back at submission. The Lancet’s convention of requiring each author to write a prose contribution statement in their own words is unusually effective; it forces a moment of reflection on what the author actually did. Several other journals have adopted variations. The CASRAI CRediT authors guide includes a role-assignment worksheet that asks each author to write a one-sentence justification per role before the statement is finalised; the discipline of writing the justification surfaces most cases of role inflation before submission.

Where inflation has already made it into a submission, the editorial response is to ask the corresponding author to revise. The framing that works is methodological: “We use CRediT to describe what each contributor actually did. Please review the role assignments and confirm that each role corresponds to a substantive contribution by that author.” This is rarely contentious; in our experience the corresponding author tightens the statement on review.

Failure two: byline order substituting for qualifiers

The degree-of-contribution qualifier was added to NISO Z39.104 specifically to resolve byline-order disputes. A paper with three co-first-authors should mark them all as Equal on the roles they share; a paper with a clear lead on one role and supporting contributors on others should use Lead and Supporting accordingly. The qualifier is structurally what byline order has long tried to encode implicitly.

The misuse we see is statements that ignore the qualifier and rely on byline order or footnotes to communicate contribution magnitude. A typical example: a paper with five authors and a footnote saying “authors 1 and 2 contributed equally” but a CRediT statement that assigns roles without qualifiers, leaving the reader to infer what “equally” means across the roles. Is author 1’s Investigation equal to author 2’s Investigation? Is author 1’s Formal analysis equal to author 2’s Formal analysis? The footnote does not say; the unqualified CRediT statement does not say.

The editorial fix

Adopt the qualifier explicitly. If two authors contributed equally to a role, mark both Equal on that role. If one author was the lead and others supported, mark Lead and Supporting. Footnotes about equal contribution become redundant; the structured statement carries the information.

For journals, the editorial implementation is to require the qualifier in the submission system. JATS (version 1.2 and later) supports the qualifier via the degree-contribution attribute on the role element; submission systems should expose this and require it. A few publishers have already moved here; we expect most to follow through 2026.

Failure three: missing writing roles

Every paper has someone who wrote the first draft. If a CRediT statement omits Writing – original draft, the editor will ask. This is the third recurring failure: statements that distribute Methodology, Investigation, Formal analysis, and Supervision but leave Writing – original draft unassigned.

The pattern usually reflects a real ambiguity. In a paper with three co-equal authors who jointly drafted, who gets Writing – original draft? The answer is all three, marked Equal. In a paper where a postdoc drafted under supervision and a senior author heavily revised, who gets which writing role? Almost always: postdoc gets Writing – original draft (lead); senior author gets Writing – review & editing (lead). In a paper where a paid medical writer drafted, the medical writer is typically not an author per ICMJE — they are acknowledged separately — and the authors who substantively shaped the draft get Writing – original draft as appropriate.

The editorial fix

Editors should treat “who wrote the first draft” as a required question at submission. The BMJ asks this explicitly. The CASRAI worksheet asks it. If the statement does not name a Writing – original draft contributor, the editor’s standard response is a one-line query: “Please indicate which author or authors discharged the Writing – original draft role; the role is currently absent from the CRediT statement.” In our editor network this query gets a fast, accurate response and the role is added before review proceeds.

Three lesser failures worth a paragraph each

Beyond the big three, three lesser failures are worth noting.

First, conflating Methodology and Formal analysis. The role definitions distinguish these (Methodology is the study design; Formal analysis is the statistical or analytical work on the resulting data), and assigning both to the same person without distinction loses information.

Second, assigning Software to anyone who touched a computer. Software is meaningful programming work, not opening Excel; a contributor who wrote no code, did not script the analysis, and did not configure REDCap probably did not discharge the Software role.

Third, missing Funding acquisition. Someone wrote the grant. If the paper is grant-funded and the CRediT statement does not name a Funding acquisition contributor, the role is missing.

What CASRAI recommends

Four practical recommendations. First, use the role-assignment worksheet at the drafting stage, not at submission; it catches most misuse early. Second, require the degree-of-contribution qualifier in your journal submission system. Third, treat missing Writing – original draft as a default editorial query. Fourth, when in doubt about role inflation, ask each author to write a one-sentence justification per role; the discipline reveals the over-assignment naturally.

For the broader system, the most useful intervention is journal submission system support. Adoption at the policy level is now widespread, but the per-submission UX varies enormously. A submission system that prompts for qualifiers, validates that every role has a contributor, and asks for per-author confirmation of role assignment catches most failures before they reach editorial review. We expect this UX to converge through 2026 as publishers update their Editorial Manager and ScholarOne configurations.
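
The submission-time checks described above can be sketched in a few lines. The example below is an illustration only, not how Editorial Manager or ScholarOne work: the input shape (author mapped to roles mapped to an optional degree) and the list of required roles are assumptions a journal would replace with its own.

```python
DEGREES = {"lead", "equal", "supporting"}
# Roles this sketch treats as mandatory; a journal would choose its own list.
REQUIRED_ROLES = ("Writing – original draft",)


def check_statement(statement: dict) -> list:
    """statement maps author -> {role: degree or None}. Returns editorial queries."""
    queries = []
    assigned = {role for roles in statement.values() for role in roles}
    # Flag required roles that nobody holds.
    for role in REQUIRED_ROLES:
        if role not in assigned:
            queries.append(f"No contributor is named for {role}.")
    # Flag role assignments without a recognised degree qualifier.
    for author, roles in statement.items():
        for role, degree in roles.items():
            if degree not in DEGREES:
                queries.append(f"{author}: {role} has no degree-of-contribution qualifier.")
    return queries


# Hypothetical submission with one missing role and one missing qualifier.
statement = {
    "Author A": {"Methodology": "lead", "Investigation": None},
    "Author B": {"Supervision": "lead"},
}
for query in check_statement(statement):
    print(query)
```

A statement that names a Writing – original draft contributor and qualifies every role returns an empty list, so nothing needs to go back to the corresponding author.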

January 15, 2026
Data papers, software papers, and the limits of CRediT
The 14 roles of CRediT were designed against the model of a conventional research article reporting empirical work: a study with a hypothesis, a method, data, analysis, and a written argument. Data papers and software papers fit this model awkwardly. A data paper describes a dataset; a software paper describes a piece of software. The intellectual contribution is the artefact itself, not the prose around it. The CRediT roles, applied to these papers, produce statements that are technically valid but substantively misleading. This post catalogues the friction and suggests where the taxonomy could be extended.

What a data paper actually is

A data paper, as the genre has developed in venues like Scientific Data, Earth System Science Data, GigaScience, and the data-paper streams of disciplinary journals, is a peer-reviewed description of a dataset: its provenance, its collection method, its quality, its access conditions, and its potential reuse. The dataset itself lives in a repository with its own DOI; the data paper provides the citable, peer-reviewed scholarly record that the dataset exists, that it was collected with rigour, and that it is fit for reuse.

The intellectual labour behind a data paper is mostly not in the paper. It is in the years of fieldwork or instrument operation that produced the data, the protocols that ensured comparability across collection events, the curation work that turned raw observations into a structured deposit, and the documentation that lets a stranger understand what the data mean. The paper is a summary record of that work.

Where CRediT falls short for data papers

Three friction points. First, Investigation and Data curation bear most of the load and they are not differentiated finely enough. A field ecologist who spent years collecting samples, a lab technician who processed them, a data manager who normalised the schema, and a metadata specialist who wrote the documentation are all plausibly Investigation or Data curation; the roles do not distinguish them. The result is that two papers with very different actual contributorship patterns can have identical-looking CRediT statements.

Second, Resources overlaps with Investigation in a confusing way. A data paper describing a long-term ecological observatory has a Resources contribution (the observatory itself) that is distinct from the per-sample Investigation. CRediT does not currently separate “provided the infrastructure that produced the data” cleanly from “provided the samples that went into the data.”

Third, Writing – original draft is often the smallest contribution, not the largest, and assigning it Lead can misrepresent the contribution structure. The person who wrote the paper is often a relatively junior team member, not the senior person whose intellectual contribution was the protocol and the multi-year campaign.

Software papers and the JOSS model

Software papers, exemplified by the Journal of Open Source Software (JOSS), face an analogous problem from a different direction. A JOSS paper is short — often under 1,000 words — and is paired with a peer-reviewed software repository. The intellectual contribution is the software: its design, its implementation, its tests, its documentation, its maintenance over time. The paper is a stub.

JOSS itself uses CRediT for its papers and has done so since 2020. The community has converged on a set of mappings:
- Conceptualization covers software design and architectural decisions.
- Software covers implementation. This is the central role for most JOSS contributors.
- Validation covers testing, both unit tests and validation against reference implementations.
- Methodology covers the algorithmic content, where the software implements a non-trivial method.
- Writing – original draft covers the paper itself. The README, the developer documentation, and the user docs are also writing work, but they are not the JOSS paper.
- Supervision covers project leadership; Project administration covers maintenance and coordination.

The friction in this mapping is that the Software role is overloaded. It conflates the initial implementation, ongoing maintenance, bug-fixing, refactoring, and tooling. A contributor who implemented the core algorithm and a contributor who maintains the CI/CD pipeline both get “Software” with no further distinction. For long-lived software with many contributors over years, the role assignment ends up giving everyone Software (lead/equal/supporting) and the differentiation lives in the GitHub commit history, not in CRediT.

The FAIR4RS angle

The FAIR4RS Principles for research software, finalised in 2022, set out what FAIR means for software: findable, accessible, interoperable, reusable. They explicitly acknowledge that software citation needs a richer model than data citation, because software has versions, dependencies, and ongoing development that data typically does not.

FAIR4RS implies, though does not directly require, a richer contributorship taxonomy for software. The Software Citation Implementation Working Group has been chewing on this for several years. Their working position is that CRediT remains the right vocabulary for software paper contributorship, but that the software repository itself should carry its own contributor metadata using a complementary scheme — typically CITATION.cff with extended fields — that captures the per-version, per-component contributorship that CRediT cannot.

The mapping problem

For data papers and software papers, the operational reality is that two parallel records exist: the paper’s CRediT statement and the dataset or software repository’s contributor metadata. They overlap but do not align cleanly. The dataset DOI and software DOI live in DataCite; the paper DOI lives in Crossref; the relations between them are declared in the metadata but not always reciprocally.

The CASRAI research outputs domain tracks the mapping conventions in current use. Our recommendation, for now, is that data papers and software papers should publish a CRediT statement covering the paper’s contributorship and should additionally publish a richer contributor metadata file with the dataset or software, using CRediT roles plus the discipline-specific extensions that have emerged.

Possible extensions

Three extensions would meaningfully improve the situation. First, sub-roles within Software: an extended taxonomy with implementation, testing, documentation, maintenance, and integration as sub-roles would give a software paper a more truthful contributorship statement. This work has been drafted by the FORCE11 software citation working group but not formally proposed as a CRediT extension.

Second, distinguished Investigation roles for data papers: collection, processing, curation, and documentation as sub-roles of Investigation and Data curation would let a data paper describe its contributorship more faithfully. The challenge here is keeping the taxonomy usable; an over-elaborate vocabulary loses adoption.

Third, artefact-level role assignments: the current CRediT statement applies at the paper level. For a paper that describes a dataset and a software package, it might be more useful to have role assignments at the artefact level (paper, dataset, software each get their own statement) with cross-references. This would require schema work in Crossref, DataCite, and ORCID.

What to do now

For authors of data papers, the practical advice is: use CRediT for the paper; deposit a complementary contributors.json with the dataset that captures finer-grained roles; cross-reference the two in the related-identifier blocks. For authors of software papers, use CRediT for the paper and CITATION.cff for the repository, with the CFF carrying the rich per-component contributor data. The CASRAI data and software papers guide has worked examples.
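As a rough illustration of the data-paper advice, the sketch below builds a contributors.json payload in Python. There is no published schema for this file that we can point to, so every key name here (`described_by_paper`, `credit_roles`, `fine_roles`) is hypothetical, the fine-grained vocabulary simply reuses the four sub-roles floated above, and the DOIs and the ORCID iD are placeholders.

```python
import json

# Hypothetical fine-grained vocabulary: the four sub-roles discussed above.
FINE_ROLES = {"collection", "processing", "curation", "documentation"}


def build_contributors_file(dataset_doi, paper_doi, people):
    """Serialise a contributors.json payload (illustrative layout, not a standard)."""
    for person in people:
        unknown = set(person["fine_roles"]) - FINE_ROLES
        if unknown:
            raise ValueError(f"unknown fine-grained role(s): {sorted(unknown)}")
    payload = {
        "dataset_doi": dataset_doi,
        # Cross-reference back to the data paper that carries the CRediT statement.
        "described_by_paper": paper_doi,
        "contributors": [
            {
                "name": p["name"],
                "orcid": p["orcid"],
                "credit_roles": p["credit_roles"],       # roles as used in the paper
                "fine_roles": sorted(p["fine_roles"]),   # finer detail for the dataset
            }
            for p in people
        ],
    }
    return json.dumps(payload, indent=2, ensure_ascii=False)


example = build_contributors_file(
    "10.1234/example-dataset",   # placeholder DOI
    "10.1234/example-paper",     # placeholder DOI
    [{
        "name": "Example Fieldworker",
        "orcid": "0000-0002-1825-0097",  # placeholder iD
        "credit_roles": ["Investigation"],
        "fine_roles": ["collection", "processing"],
    }],
)
```

Rejecting unknown fine-grained roles at write time keeps the file usable as a controlled vocabulary instead of a second free-text field.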

For the CRediT stewardship group, the recommendation is to prioritise the data-paper and software-paper mapping problem in the v2026.3 revision discussion. The friction is real, the workarounds are working but ugly, and the taxonomy will be strengthened by a thoughtful extension.

December 17, 2025
Why the next CRediT version should include ‘AI assistance’ as a role
The 14 roles of CRediT were designed in 2013-2014 with a model of contribution that did not include large language models or generative AI systems. A decade on, the taxonomy is robust and widely adopted, but the AI question is hard to ignore. This post makes the case — tentatively, and with attention to the counter-arguments — that the next CRediT revision should add a 15th role explicitly covering AI assistance. We are publishing it here to invite community pushback before any formal proposal goes to the CRediT stewardship group.

Why this question is not solved by disclosure alone

The current consensus around generative AI in scholarly authorship rests on two pillars: AI cannot be a co-author (the ICMJE 2023 position), and AI use must be disclosed in a structured declaration. CASRAI agrees with both. They do not, however, resolve the question of how AI assistance shows up in CRediT.

A worked example. Suppose a paper has four authors. Author A wrote the first draft with substantial assistance from a large language model, which she prompted, edited, fact-checked, and revised. Author B ran the formal analysis using an AI-assisted statistical-discovery tool that proposed model specifications. Author C generated several of the figures using a GenAI visualisation tool. Author D supervised. Each used AI; each used it differently; each took human responsibility for the output. How does the CRediT statement represent this?

Under current CRediT, AI use is invisible. Author A gets Writing – original draft (lead). Author B gets Formal analysis (lead). Author C gets Visualization (lead). Author D gets Supervision. The AI assistance shows up only in the publisher-mandated AI disclosure, which is a free-text field in the methods or acknowledgements. The structured contributorship record has no place for the granular fact that AI was a tool in discharging each of those roles.

The proposed 15th role

The draft scope we are testing is this:

AI assistance. The use of artificial-intelligence systems, including generative AI, machine-learning models, and automated analytical tools, in the production of the work. Includes prompt engineering, model selection, validation of AI output, and human verification of AI-generated content. Does not include use of AI as a routine tool (e.g., grammar checkers, citation-formatting tools) below a disclosure threshold defined by the publisher.

The role would carry the standard degree-of-contribution qualifier. A human author whose primary contribution was prompting and verifying an AI system would be marked Lead for AI assistance; a co-author who occasionally checked AI outputs would be Supporting. The role would not be a substitute for the existing roles — the human who used AI for the first draft still gets Writing – original draft — but it would add the structured fact that AI was involved.

The arguments for

First, structured disclosure is more useful than prose disclosure. A free-text AI declaration cannot be queried, cross-referenced, or aggregated. A CRediT-style structured role can. Integrity offices investigating a fabrication can query for papers with AI assistance roles; funders tracking AI use in grant outputs can roll up the data; bibliometric studies can analyse patterns. None of this is possible with the current free-text disclosure.

Second, granularity matters for accountability. Knowing that a paper used AI is less useful than knowing which contributor used AI for which task. The CRediT role assignment makes the accountability specific. If a fabricated reference appears in the introduction, the question of who is responsible for verifying it has a structured answer.

Third, the boundary is becoming a fiction. Modern statistical workflows include AI components (autoML, AI-assisted exploratory analysis); modern writing workflows include AI components (Copilot for prose, Claude for editing); modern visualisation workflows include AI components. The pretence that these are separable from the role they support is increasingly hard to maintain. If AI is being used to discharge a role, the role assignment should say so.

The arguments against

Three serious counter-arguments deserve engagement.

First, the scope-creep concern. CRediT has held to 14 roles deliberately. Each addition raises the cognitive load on authors filling out the statement, increases the integration burden on publishers, and risks the taxonomy becoming unusable through over-specification. The argument from Liz Allen and the original CRediT designers has been that the taxonomy gains its value from being small enough to use.

Second, the boundary problem. What counts as AI assistance? A grammar checker is plausibly AI; a citation formatter increasingly is; a search engine ranking results by relevance certainly is. If every modern research tool counts as AI, the role becomes meaningless. A workable scope requires a non-trivial threshold (the draft language above gestures at “below a disclosure threshold defined by the publisher”), and that threshold is hard to define without ending up with either everything or nothing.

Third, the disclosure-versus-contribution distinction. CRediT is a contributorship taxonomy. AI is not a contributor — that is the settled position. Adding an AI role to CRediT risks blurring this. The alternative is to keep AI in a separate disclosure form, structurally similar to a competing-interests declaration or a funding statement, rather than in the contributorship statement.

A possible middle path

The middle path is to keep CRediT at 14 roles and to define a parallel AI assistance declaration with comparable structure: a controlled vocabulary of AI-use types, a per-contributor breakdown linked to ORCID iDs, a model-and-version field, and a verification statement. This would sit alongside CRediT in publisher submission systems and JATS XML, rather than inside it.

This is closer to where the current publisher disclosure forms are heading, and it preserves the conceptual clarity that CRediT roles describe what humans did, while a separate declaration describes what AI tools were used. We are increasingly inclined to recommend this path, with the caveat that the disclosure must be structured to the same standard as CRediT: built on controlled vocabularies rather than free text, deposited to Crossref, and surfaced on ORCID.
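To make the parallel declaration less abstract, here is one possible shape for it, written as Python dataclasses. Nothing here is an agreed data model: the `AIUseType` vocabulary, the field names, the contributor identifiers, and the model names are all invented for illustration, using the four-author example from earlier in this post.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class AIUseType(Enum):
    """Hypothetical controlled vocabulary of AI-use types."""
    DRAFTING = "drafting"
    ANALYSIS = "analysis"
    VISUALIZATION = "visualization"


@dataclass(frozen=True)
class AIUse:
    """One per-contributor entry in the declaration (illustrative fields)."""
    contributor_orcid: str
    use_type: AIUseType
    model: str
    model_version: str
    human_verified: bool  # the verification statement, reduced to a flag


def contributors_using(declaration: List[AIUse], use_type: AIUseType) -> List[str]:
    """The kind of query a free-text disclosure cannot answer."""
    return [u.contributor_orcid for u in declaration if u.use_type is use_type]


# Authors A, B, and C from the worked example; identifiers and models are placeholders.
declaration = [
    AIUse("orcid-of-author-a", AIUseType.DRAFTING, "example-llm", "1.0", True),
    AIUse("orcid-of-author-b", AIUseType.ANALYSIS, "example-stats-tool", "2.3", True),
    AIUse("orcid-of-author-c", AIUseType.VISUALIZATION, "example-figure-tool", "0.9", True),
]
drafters = contributors_using(declaration, AIUseType.DRAFTING)
```

Author D does not appear in the declaration at all, which matches the example: supervision involved no AI tool, and CRediT continues to record what D did.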

What the CRediT stewardship group should do next

Three concrete steps. First, run a structured community consultation through 2026 on whether to add AI assistance as a 15th CRediT role, with the alternative being a parallel structured declaration. The CRediT governance page outlines the consultation process. Second, alongside the consultation, draft the data model for the parallel AI assistance declaration so that the comparison is concrete rather than abstract. Third, coordinate with NISO on whether either option requires a revision to Z39.104.

The decision is not urgent in the sense that the integrity system is failing today; the existing disclosure forms work, badly. It is urgent in the sense that every year of delay produces another year of unstructured AI-use data that cannot be aggregated or analysed, which makes the eventual transition harder.

December 3, 2025
Crossref’s grant-linking initiative and CRediT: a 2025 status
Crossref’s Grant Linking System (GLS), in development since 2019 and in steady production since 2022, has quietly become one of the most useful bits of plumbing in scholarly metadata. Its 2025 expansion — covering more funders, more grant metadata, and tighter integration with Crossref’s content-registration deposit schema — is worth a closer look, particularly for anyone integrating CRediT contributorship with funding attribution. This post walks through what GLS does, what changed in 2025, and how a CRediT integrator should consume the data.

What GLS is, briefly

A grant, as a thing in the world, has a funder (an organisation that paid), one or more awardees (people and institutions), a title, a project description, an amount, a duration, and a set of outputs (papers, datasets, software, other artefacts) that the grant produced. Pre-GLS, each piece sat somewhere different. The funder lived in the Funder Registry (a Crossref-maintained list of funder organisations). The grant number was a free-text string in publisher metadata. The outputs were registered with DOIs at Crossref but without structured links back to the grant. The result was that the graph from funder through grant to output existed only in fragments.

GLS provides the missing middle layer. A funder registers grants with Crossref via a dedicated deposit schema; each grant gets a DOI; the grant DOI carries structured metadata about the funder, the project, the awardees (with ORCID iDs where available), and the institutions (with ROR IDs). Publishers and other depositors then reference the grant DOI from the output’s metadata, closing the loop.

2025 expansion: more funders, more metadata

The major story of 2025 was funder participation. Through 2024 the GLS depositors were a small set of early adopters (Wellcome, the Templeton Foundation, ANR, a handful of others). 2025 added the major UK research councils (UKRI’s component councils now register grants via GLS), several EU H2020 and Horizon Europe streams (via OpenAIRE-mediated deposit), the Australian Research Council, and — significant for the US ecosystem — initial NSF and NIH pilots. NIH’s pilot is small (a few thousand R01 grants), but it signals direction.

The metadata expansion was equally important. GLS 2025 added structured fields for: project abstract (free text but indexed), discipline classification (using Crossref-curated CODE FOR codes and OECD FoS), expected outputs, ethics-board identifiers, and — the field most relevant to CRediT integrators — a participants structure that names each grant participant with an ORCID iD and a role from a controlled vocabulary (principal investigator, co-investigator, collaborator, named researcher, fellow, named staff). This is not CRediT, but it interlocks with CRediT cleanly.

The CRediT-GLS interlock

Here is the integration pattern that is now possible. A grant is registered with GLS, getting a DOI and a structured participant list. A paper acknowledges the grant by including the grant DOI in its Crossref deposit. The paper’s JATS carries a CRediT contributor statement, which is also deposited to Crossref via the relationships block. ORCID consumes both deposits via the public API and can now answer: this researcher contributed to this paper in these CRediT roles; the paper acknowledges this grant; this researcher is on the grant participant list in this grant role.

The query is structured and unambiguous. Before 2025, it required string-matching grant numbers and best-effort author-name disambiguation. Since 2025, the entire chain runs on PIDs. The CRediT adoption ledger at CASRAI tracks which publishers deposit CRediT to Crossref in the form that makes this work, and which still drop the qualifiers at the deposit step.
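The chain from researcher to paper to grant can be sketched as a join on identifiers. The dictionaries below are simplified in-memory stand-ins, not the Crossref or ORCID response formats; the key names (`grant_dois`, `credit`), the function `trace_researcher`, and all identifiers are assumptions made for the example.

```python
def trace_researcher(orcid, papers, grants):
    """List each paper the researcher contributed to, with CRediT and grant roles."""
    rows = []
    for paper in papers:
        roles = paper["credit"].get(orcid)
        if not roles:
            continue  # this researcher has no CRediT role on the paper
        for grant_doi in paper["grant_dois"]:
            participants = grants.get(grant_doi, {})  # grant may not be registered
            rows.append({
                "paper": paper["doi"],
                "credit_roles": roles,
                "grant": grant_doi,
                "grant_role": participants.get(orcid),  # None if not a participant
            })
    return rows


# Placeholder data: grant DOI -> {ORCID iD -> grant participation role}.
grants = {
    "10.1234/grant-1": {
        "orcid-a": "principal investigator",
        "orcid-b": "named researcher",
    },
}
papers = [{
    "doi": "10.1234/paper-1",
    "grant_dois": ["10.1234/grant-1"],
    "credit": {"orcid-a": ["Supervision"], "orcid-b": ["Software", "Validation"]},
}]
rows_for_b = trace_researcher("orcid-b", papers, grants)
```

Because every key is a persistent identifier, the join needs no name matching. A `grant_role` of `None` separates "contributed to a funded paper" from "is named on the grant".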

What integrators need to do

For publishers depositing content with Crossref, the 2025 GLS recommendations are: include grant DOIs in the funding section of every deposit where a grant is acknowledged; resolve and validate the grant DOI before deposit; carry CRediT roles with the degree-of-contribution qualifier in the contributor section. Crossref’s submission schema 5.4 supports all of this; older schema versions do not, and a number of publishers are still on 4.x.
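A pre-deposit check for grant DOIs might look like the sketch below. It is deliberately loose: the pattern is a syntactic sanity check and not Crossref's own validation, and resolution is passed in as a callable (`resolves`, a hypothetical name) because how a publisher confirms that a DOI resolves depends on its own tooling.

```python
import re

# Loose syntactic check only: "10.", a registrant code, a slash, a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def normalise_doi(value):
    """Strip common URL or scheme prefixes and lowercase (DOIs are case-insensitive)."""
    cleaned = value.strip()
    for prefix in _PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    return cleaned.lower()


def check_grant_dois(grant_dois, resolves):
    """Split acknowledged grant identifiers into depositable DOIs and rejects."""
    accepted, rejected = [], []
    for raw in grant_dois:
        doi = normalise_doi(raw)
        if DOI_PATTERN.match(doi) and resolves(doi):
            accepted.append(doi)
        else:
            rejected.append(raw)  # keep the original string for the author query
    return accepted, rejected


# Stand-in resolver for the example; a real one would query a DOI resolver.
known = {"10.1234/grant-1"}
accepted, rejected = check_grant_dois(
    ["https://doi.org/10.1234/GRANT-1", "grant 42", "10.1234/unknown"],
    lambda doi: doi in known,
)
```

Keeping the rejected strings as entered lets the editorial query quote exactly what the author typed.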

For institutional CRIS systems, the recommendation is to ingest grant DOIs into the funding record alongside the internal grant number, and to use the grant DOI as the join key when reconciling CRIS funding records against ORCID’s funding entries and Crossref’s content metadata. The CASRAI CRIS integration guide has been updated with the GLS ingestion patterns by major CRIS vendor.

For funders not yet depositing to GLS, the question is what to do about historical grants. Crossref’s recommendation is to deposit prospectively (new awards from a chosen start date) and backfill historical grants over time. The funders that have done well at GLS uptake budgeted a small data-engineering effort over 6-12 months to backfill 5-10 years of historical grants from internal records.

The RAiD-Crossref-GLS triangle

An open question for 2026 is the relationship between GLS and RAiD. Both can identify a research project; both can carry participant, institution, funding, and output metadata; both have ISO-standard or de-facto-standard status. The honest answer is that they overlap meaningfully but serve different communities and emphases.

GLS is funder-centric: a grant is the unit; the funder registers it; outputs reference it. RAiD is project-centric: a project can span multiple grants, multiple funders, multiple institutions, with the project itself the persistent unit. For a single-funder, single-project grant they are functionally identical. For a multi-funder collaboration (typical in clinical trials, large astronomy or particle physics projects, EU consortia), RAiD captures the project shape; the individual grants funding it would each be GLS-registered and the RAiD would reference them.

The Crossref-DataCite-ARDC working group has begun work on a formal crosswalk that lets a GLS grant DOI declare a RAiD that it contributes to, and vice versa. This will not collapse the two but will let consumers traverse the graph in either direction.

Consuming the data

The Crossref REST API exposes GLS grants under /works with a type filter of grant; the relationship from a paper to its grant is in the paper’s relation block with relationship type is-funded-by. For ORCID-aware consumers, the ORCID 4.0 funding resource now carries the grant DOI as a primary identifier, with the Funder Registry entry as the funder.
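A small sketch of both halves, under stated assumptions. The URL builder uses the public `https://api.crossref.org/works` endpoint with a `type:grant` filter. The extractor assumes that a work record's `relation` block maps each relationship type to a list of objects carrying `id-type` and `id`, and it takes the `is-funded-by` key from the description above without independent verification. The sample record is invented.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def grants_query_url(rows=20):
    """Build a /works query restricted to grant records."""
    return f"{CROSSREF_WORKS}?{urlencode({'filter': 'type:grant', 'rows': rows})}"


def funding_grant_dois(work):
    """Pull grant DOIs out of a work record's relation block (assumed shape)."""
    relations = work.get("relation", {}).get("is-funded-by", [])
    return [r["id"] for r in relations if r.get("id-type") == "doi"]


# Invented record for illustration; not a real API response.
sample_work = {
    "DOI": "10.1234/paper-1",
    "relation": {
        "is-funded-by": [
            {"id-type": "doi", "id": "10.1234/grant-1"},
            {"id-type": "other", "id": "GRANT-42"},
        ]
    },
}
url = grants_query_url()
linked = funding_grant_dois(sample_work)
```

Filtering on `id-type` keeps free-text grant numbers out of the PID chain; those still need the older string-matching treatment.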

OpenAIRE consumes GLS deposits and exposes the resulting graph in its OpenAIRE Graph, which is the easiest single endpoint to query for the full funder-grant-output structure. For institutional consumers without the bandwidth to consume Crossref directly, OpenAIRE Graph is the recommended starting point.

What’s still rough

Two known limitations. First, the GLS participants structure does not yet carry CRediT roles directly; it carries grant participation roles, which are a coarser categorisation. This is by design — grant participation is not authorship — but it means that the question “which CRediT role did this person play on this grant” can only be answered indirectly, by intersecting grant participation with CRediT roles on the grant’s outputs. We expect this to be cleaned up in a future GLS schema revision.

Second, historical-coverage gaps remain. Pre-2020 grants are almost entirely absent from GLS; 2020-2023 coverage is partial; 2024 onward is increasingly complete. Tools building on the GLS graph need to handle the missing-grant case gracefully.
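One way to handle the missing-grant case is to return an explicit status instead of raising or silently dropping the record, and to attach the coverage expectation for the award year. The function names, the in-memory `index`, and the status strings below are assumptions made for the sketch; the year bands are the ones stated above.

```python
def coverage_expectation(award_year):
    """Map an award year to the coverage bands described above."""
    if award_year < 2020:
        return "almost entirely absent"
    if award_year <= 2023:
        return "partial"
    return "increasingly complete"


def lookup_grant(grant_doi, award_year, index):
    """Look up a grant; report a miss with context instead of failing."""
    record = index.get(grant_doi)
    if record is not None:
        return {"status": "found", "grant": record}
    return {
        "status": "missing",
        "grant": None,
        "expected_coverage": coverage_expectation(award_year),
    }


# Placeholder index: grant DOI -> grant metadata.
index = {"10.1234/grant-1": {"title": "Example grant"}}
hit = lookup_grant("10.1234/grant-1", 2024, index)
miss = lookup_grant("10.1234/grant-old", 2017, index)
```

A miss on a 2017 award is expected and probably not worth an alert; a miss on a 2025 award is more likely a deposit problem worth following up.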

November 19, 2025