Blog

  • Open peer review: signals, identifiers, attribution

    Open peer review has moved from radical experiment to mainstream option. Some form of transparency — published reviewer identity, published review content, post-publication peer review — is now offered by a majority of major journals as either default or opt-in. The CASRAI open peer review entry tracks the policy landscape; this post focuses on the integration layer that makes the practice work.

    What open peer review actually is

    The phrase covers several distinct practices that share a transparency commitment but differ in what is made transparent. Open identities: reviewer names are disclosed to authors or published with the paper. Open reports: review content is published alongside the paper. Open participation: peer review is conducted in public, with the broader community able to contribute. Open final-version commenting: post-publication commenting is supported as a continuation of review. Different journals combine these differently.

    The current state in 2026: open reports are widely offered (eLife, EMBO Journal, PLOS, Nature Communications, BMJ, F1000Research, Royal Society Open Science, and many others), open identities are less common as default but offered as an opt-in by many, open participation remains the most experimental, and open final-version commenting is supported on most major platforms but lightly used.

    The signal infrastructure

    For open peer review to integrate with the research-information ecosystem, three signals need to be carried through the metadata.

    First, the review-as-output signal. A peer review is itself a scholarly output. Crossref’s review-type DOIs, introduced in 2017 and now widely used, give each review a citable identifier. The review DOI is linked to the article DOI via the reviewed-relationship metadata. A reviewer can be credited for the review as a structured output, separately from any co-authorship.

    Second, the reviewer-identifier signal. ORCID’s peer-review activity record carries reviews by ORCID iD. A reviewer whose name is disclosed and who has consented to ORCID-record deposit gets the review entered into their ORCID profile, with the journal as the source and the verification provided by the publisher’s deposit. The CASRAI ORCID implementation guide walks through the deposit patterns.

    Third, the review-credit signal. The 2024 work on a structured taxonomy for peer-review contribution — distinguishing the actions of a reviewer (read, queried, recommended changes, validated computation, validated data) — has produced a working vocabulary that several journals now apply at review-submission time. The vocabulary is in the CASRAI research integrity domain.
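
    As a rough illustration, the three signals can be pictured as fields on a single review record. The sketch below is not Crossref’s or ORCID’s schema: the class name `ReviewRecord`, its field names, and the action labels are hypothetical, and the action list simply mirrors the reviewer actions named above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical labels for the reviewer actions listed in the text.
REVIEW_ACTIONS = frozenset({
    "read", "queried", "recommended-changes",
    "validated-computation", "validated-data",
})


@dataclass
class ReviewRecord:
    review_doi: str                       # signal 1: the review's own DOI
    reviewed_doi: str                     # signal 1: DOI of the reviewed article
    reviewer_orcid: Optional[str] = None  # signal 2: None if identity is undisclosed
    orcid_consent: bool = False           # signal 2: reviewer agreed to ORCID deposit
    actions: Set[str] = field(default_factory=set)  # signal 3: what the reviewer did

    def __post_init__(self) -> None:
        # Reject actions that are not in the (assumed) vocabulary.
        unknown = set(self.actions) - REVIEW_ACTIONS
        if unknown:
            raise ValueError(f"unknown review actions: {sorted(unknown)}")

    def can_deposit_to_orcid(self) -> bool:
        # Deposit needs both a disclosed iD and explicit consent.
        return self.reviewer_orcid is not None and self.orcid_consent
```

    The design choice worth noting is that consent is stored separately from the identifier: a reviewer may be named on the published report without having agreed to an ORCID deposit.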

    The attribution layer

    The attribution layer is where open peer review interlocks with broader research recognition. Before the Publons acquisition, journal-published reviews counted as a recognised scholarly output, but cross-journal aggregation was patchy. Publons consolidated some of this; the post-acquisition state (Clarivate-owned, now part of Web of Science) is functional but not as integrated as the community would prefer.

    The current best practice is: review with open identity disclosed; review content published under a CC BY licence with a review DOI; review deposited to ORCID via the publisher’s member API; review surfaced in the reviewer’s narrative CV with appropriate context. The result is a reviewer recognition trail that supports promotion, tenure, and career-development assessments.

    For institutional research-administration offices, the implication is to capture peer-review contribution in CRIS systems and in researcher reporting. Several institutions have built peer-review dashboards from ORCID-deposit data; the practice is becoming standard at research-intensive universities.

    The CRediT interlock

    The CRediT taxonomy as currently constituted does not include a Peer Review role; peer review is treated as separate from authorship-related contributorship. There is a structural reason for this: peer review earns per-paper recognition but does not confer co-authorship, whereas CRediT describes what the authors and contributors did on the paper itself. Conflating them would muddy both.

    The clean separation is: CRediT for paper contributorship (author roles); review DOIs and ORCID peer-review records for reviewer recognition (separately). The two structures are complementary; a researcher’s CV should surface both. The CASRAI peer-review credit guide walks through the integration.

    The gaps still open

    Three gaps deserve attention in 2026.

    First, the cross-journal aggregation gap. Reviews live with the journals that solicited them; ORCID provides the per-reviewer view; but the cross-journal picture (what fraction of reviews in field X are openly published, what review-to-acceptance lag distribution exists, who is reviewing for whom) is harder to assemble. The OpenAIRE Graph has begun ingesting review-DOI data; the picture is improving but not complete.

    Second, the quality-signal gap. Open review content is variable in quality; the integration ecosystem treats all reviews as equivalent. A short, perfunctory open review and a substantial methodological critique both get a review DOI and an ORCID entry. The community has not yet developed quality signals for review content; doing so without producing perverse incentives is genuinely difficult.

    Third, the uneven adoption gap. The major open-publishing platforms have committed to transparency; many traditional journals offer open review as opt-in but with low uptake. A reviewer’s open-review track record is incomplete if many of their reviews are at journals that do not support open review. The trajectory is positive but uneven.

    What CASRAI recommends

    Four recommendations. First, journals should default to open reports with reviewer-identity opt-in; the distinction between default and opt-in matters for uptake. Second, publishers should deposit reviews to Crossref and ORCID consistently, with the review-credit metadata. Third, institutions should capture peer-review contribution in their CRIS systems and surface it in researcher recognition. Fourth, the responsible-assessment community should treat substantial peer-review work as a legitimate and recognised contribution in narrative CVs and promotion dossiers.

    For reviewers, the practical advice is to opt for open identity where journal policy allows, to take the time to write reviews that are substantive enough to count as contributions in their own right, and to maintain their ORCID peer-review record. For authors, the practical advice is to engage seriously with open reviews when received — the public-facing nature is a feature, not a threat.

    The longer arc

    Open peer review’s mainstreaming is happening alongside, and partly in tension with, the broader concerns about reviewer burden and the sustainability of peer review as an unpaid scholarly contribution. The integration improvements — review DOIs, ORCID deposit, structured credit signals — make peer review more visible, but visibility alone does not solve the volume problem. The responsible-assessment community’s recognition of peer review as legitimate contribution is necessary; it is not sufficient. The next phase of the conversation will likely centre on reviewer compensation, reviewer-load capping, and the integration of peer review into institutional workload models.

  • Project IDs in 2026: RAiD adoption update

    The Research Activity Identifier (RAiD) crossed several adoption thresholds in 2024-2025. ISO 23527:2022 standardisation completed; the Australian Research Data Commons reached operational scale; UKRI integrated RAiD into its funding workflow; the EU’s HORIZON identifier work began aligning with RAiD; the ARDC-led international RAiD Steering Group brought together national service providers from Australia, New Zealand, UK, Canada, and several EU member states. The May 2026 picture is meaningfully different from the May 2025 picture. This post is an adoption update.

    Where RAiD is now

    RAiD is operational at substantial scale in Australia, where the ARDC operates the national RAiD service and integration with the Australian Research Council and the National Health and Medical Research Council has matured. Every ARC and NHMRC grant from 2024 onward has a RAiD; the integration is a routine compliance item.

    RAiD is operational in New Zealand via a national service implementation aligned with the ARDC model.

    RAiD is operational in the UK via a UKRI-operated service, with integration into the Je-S successor (the UKRI Funding Service that launched in 2023). UKRI grants from 2024 onward have RAiDs; backfilling of historical grants is in progress.

    RAiD is in pilot in Canada via a CRDCN-led initiative, with the Tri-Agencies (CIHR, NSERC, SSHRC) participating in design.

    RAiD has affiliated national service providers in the Netherlands, Germany, and Finland; full EU integration is in development through the EOSC Federation work.

    RAiD is not yet operational at scale in the US. NIH’s evolving project-identifier work overlaps with but is not identical to RAiD; harmonisation discussions are ongoing.

    What RAiD actually carries

    A RAiD record carries: the project’s name and description; its participants with ORCID iDs; its institutional affiliations with ROR IDs; its funding sources with Funder Registry or ROR identifiers and grant identifiers; its outputs with DOIs and other PIDs; its temporal span; its status. The record is mutable: as the project evolves, the RAiD record is updated to reflect new participants, new outputs, new affiliations.

    The mutability is the design choice that distinguishes RAiD from a per-event identifier like a DOI. A project is a living entity for years; its identifier needs to grow with it. The RAiD service architecture supports this via versioning: each update produces a new version of the RAiD record, with the old versions preserved as historical states.
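
    The versioning idea can be sketched in a few lines. This is a minimal model under stated assumptions, not the RAiD service’s implementation: `VersionedRecord`, its method names, the field names, and the example identifier are all hypothetical.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class VersionedRecord:
    """A mutable record whose earlier states are preserved as history."""
    identifier: str
    versions: List[Dict[str, Any]] = field(default_factory=list)

    def update(self, **changes: Any) -> int:
        """Layer changes on the latest state; return the new version number."""
        state = copy.deepcopy(self.versions[-1]) if self.versions else {}
        state.update(changes)
        self.versions.append(state)
        return len(self.versions)

    def at_version(self, number: int) -> Dict[str, Any]:
        """Return a copy of the record as it stood at a 1-based version number."""
        if not 1 <= number <= len(self.versions):
            raise IndexError(f"no version {number}")
        return copy.deepcopy(self.versions[number - 1])

    def current(self) -> Dict[str, Any]:
        return self.at_version(len(self.versions))


# Hypothetical project: a participant joins after the record is first issued.
raid = VersionedRecord("raid:example-project")
raid.update(title="Example project", participants=["0000-0002-1825-0097"])
raid.update(participants=["0000-0002-1825-0097", "0000-0001-5109-3700"])
```

    Each call to `update` appends a full snapshot rather than overwriting in place, so `raid.at_version(1)` still shows the single original participant while `raid.current()` shows both.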

    The interlock with other PIDs

    RAiD’s value is largely in the interlock layer. A RAiD record references the ORCID iDs of its participants; ORCID 4.0 carries the RAiDs of each researcher’s projects. A RAiD references the DOIs of its outputs; Crossref and DataCite metadata reference the RAiD via the relationship blocks. A RAiD references the Funder Registry IDs of its funders and the grant DOIs (where they exist) of its grants; grants registered in the Crossref Grant Linking System reference the RAiD via the contributes-to relationship.

    The result is a structured graph: from a RAiD, an integrator can traverse to participants (ORCID), institutions (ROR), funders (Funder Registry/ROR), grants (grant DOIs), and outputs (DOIs). The graph is queryable. The OpenAIRE Graph already operationalises this for European projects; the CASRAI persistent identifiers domain tracks the broader integration.
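
    To make the traversal concrete, the graph can be modelled as a list of labelled edges. The sketch below is illustrative only: the identifiers are made up, and the relation labels and function names (`build_index`, `neighbours`, `projects_for`) are assumptions rather than any service’s query API.

```python
from collections import defaultdict

# (subject, relation, object) edges for one hypothetical project.
EDGES = [
    ("raid:project-1", "participant", "orcid:0000-0002-1825-0097"),
    ("raid:project-1", "institution", "ror:example-institution"),
    ("raid:project-1", "funder", "funder:example-funder"),
    ("raid:project-1", "grant", "doi:10.5555/grant.1"),
    ("raid:project-1", "output", "doi:10.5555/article.1"),
    ("raid:project-1", "output", "doi:10.5555/dataset.1"),
]


def build_index(edges):
    """Group edge targets by subject and relation for direct lookup."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in edges:
        index[subject][relation].append(obj)
    return index


def neighbours(index, node, relation):
    """Follow one relation outward from a node; empty list if none."""
    return list(index.get(node, {}).get(relation, []))


def projects_for(edges, output):
    """Reverse traversal: which projects list this output?"""
    return sorted({s for s, rel, o in edges if rel == "output" and o == output})


index = build_index(EDGES)
outputs = neighbours(index, "raid:project-1", "output")
```

    The same edge list answers both directions of the question: from a project to its outputs, and from an output back to the project that produced it.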

    What’s working well

    Three operational patterns deserve flagging.

    First, funder-issued RAiDs. The pattern of the funder issuing the RAiD on award and the awardee inheriting it has worked well. The funder holds the structured grant data and populates the record at issuance; the awardee holds the operational knowledge of the project and updates the record as the project evolves. This minimises the burden on researchers and ensures RAiD coverage is complete for funded work.

    Second, institutional-CRIS integration. CRIS systems that ingest RAiDs from their researchers’ projects and propagate them to outputs as a metadata field have closed the project-to-output linkage that previously required string-matching grant numbers. The integration is straightforward; the value compounds over time as the historical record accumulates.
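
    A minimal sketch of the propagation step, assuming the CRIS already links each output to an internal project record. The dictionary keys (`project_id`, `raid`) and the function name are hypothetical and do not reflect any particular CRIS product.

```python
def propagate_raid(projects, outputs):
    """Return copies of the outputs, each stamped with its project's RAiD."""
    raid_by_project = {
        p["project_id"]: p["raid"] for p in projects if p.get("raid")
    }
    stamped = []
    for output in outputs:
        enriched = dict(output)  # leave the caller's records untouched
        raid = raid_by_project.get(output.get("project_id"))
        if raid:
            enriched["raid"] = raid
        stamped.append(enriched)
    return stamped


projects = [{"project_id": "P1", "raid": "raid:project-1"},
            {"project_id": "P2", "raid": None}]  # P2 has no RAiD yet
outputs = [{"doi": "10.5555/article.1", "project_id": "P1"},
           {"doi": "10.5555/article.2", "project_id": "P2"}]
stamped = propagate_raid(projects, outputs)
```

    The lookup is an exact match on an identifier, which is the point: no grant-number strings are compared, and outputs of projects that have no RAiD yet simply pass through unchanged.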

    Third, cross-funder collaboration. A project with multiple funders (typical in large clinical trials and EU consortia) can have a single RAiD referencing all the funders’ grants. This addresses a longstanding accounting friction where multi-funder projects appeared as multiple disconnected projects in funder reporting systems.

    What’s not working yet

    Three issues remain open.

    First, retroactive RAiDs for historical projects. RAiD coverage is forward-looking from each jurisdiction’s start date. Historical projects (pre-2022 or so) do not have RAiDs; building the historical record is a substantial data-engineering effort that no jurisdiction has fully completed.

    Second, international coordination. Different jurisdictions have different RAiD service providers, different operational arrangements, and slightly different metadata profiles. The RAiD Steering Group is working on harmonisation but the work is incomplete. A project that crosses jurisdictions may have RAiDs from multiple providers, with the integration between them not yet seamless.

    Third, the unfunded-project case. RAiD was designed around funded projects, with the funder as the natural issuer. Unfunded research activity (self-funded, doctoral student projects without grants, community-research projects without traditional funders) does not have a clear RAiD-issuance path. The RAiD service architecture supports researcher-issued RAiDs; the institutional and funder workflows have not fully accommodated this case.

    What integrators should do

    For institutions running a CRIS, the priorities are: ingest RAiDs into the project record; propagate to outputs as metadata; reconcile with ORCID’s funding and contribution data; surface in research-administration reporting.

    For publishers, the priority is to accept RAiDs in submission systems as a funding-reference option alongside Funder Registry entries and grant DOIs, and to deposit RAiDs to Crossref via the relationships block. Several publishers have done this; broader adoption through 2026 would be welcome.

    For funders that have not yet issued RAiDs, the priority is to evaluate the operational integration. ARDC’s documentation and the UKRI implementation are useful reference points. The integration is non-trivial but not large; institutions that have done it report it pays back within 18 months in reduced cross-system reconciliation effort.

    The broader pattern

    RAiD adoption is the latest instance of the persistent-identifier pattern: a structured identifier for a class of research entities, with an operational service to mint and resolve them, with metadata that interlocks with other PIDs, with adoption that takes years to reach scale but compounds in value as it does. ORCID took a decade to reach saturation; ROR took five years to reach the equivalent in its space; RAiD is plausibly on a five-to-seven-year trajectory to comparable coverage.

    For the CASRAI community, the practical posture in 2026 is to incorporate RAiD into integration designs from the outset, to track adoption by jurisdiction, and to advocate for adoption where the operational case is strong. The PID quartet of ORCID-ROR-RAiD-DOI is increasingly the foundation on which research-information integration is built; the more complete that foundation, the more useful the integration layer becomes.

  • Mentorship as a CRediT role: pro and con

    The CRediT Supervision role is broad. The role definition reads: Oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team. The role bundles mentorship into Supervision, which leaves the question: should mentorship be a CRediT role of its own? This post lays out the arguments on both sides and proposes a mid-path.

    The case for

    Three arguments for a dedicated Mentorship role.

    First, visibility. Mentorship is a substantial intellectual and time-consuming activity that current CRediT-style contributorship statements largely render invisible. A senior researcher who mentored an early-career colleague through the discovery, the writing, and the navigation of peer review has contributed significantly to the paper; the current taxonomy captures this only through the catch-all Supervision role, which is also used for project oversight that is quite different in character.

    Second, career-stage equity. Mentorship is most often delivered by mid-career and senior researchers to early-career ones, and it is this work that contributorship statements most often render invisible. Making it a CRediT role would help correct the under-recognition of mid-career mentorship work in promotion and tenure decisions. The mentorship and career stages domain at CASRAI tracks the assessment-side implications.

    Third, distinction from supervision. Project supervision (the senior researcher with PI responsibility) and mentorship (the senior or peer researcher who guided a junior contributor’s development through the work) are different activities. Bundling them into one role loses the distinction. A paper where the PI did the supervision and a separate mid-career colleague did the mentorship has a contributorship structure that current CRediT cannot express cleanly.

    The case against

    Three arguments against.

    First, taxonomic stability. CRediT has held to 14 roles deliberately. Each addition raises cognitive load and risks the taxonomy becoming unusable through over-specification. Liz Allen and the original CRediT designers have consistently argued that the taxonomy gains value from being small enough to use; adding a Mentorship role pushes against this.

    Second, boundary problems. What distinguishes mentorship from supervision, from teaching, from co-authorship, from collaboration? The lines are real but fuzzy. A senior colleague who reviewed the draft and suggested major revisions is doing Writing – review & editing; the same colleague who guided the junior author through how to think about the discovery is doing mentorship; in practice the activities overlap. A role that requires authors to distinguish them may produce more noise than signal.

    Third, recognition versus contribution. CRediT is a contributorship taxonomy, describing what people did on the paper. Mentorship is broader than per-paper contribution; it is a sustained relationship that spans many papers and many years. Capturing per-paper mentorship in CRediT may be the wrong instrument; a separate mentorship-recognition mechanism (in narrative CVs, in promotion dossiers, in institutional mentorship programmes) may fit better.

    A proposed mid-path

    We propose a mid-path that addresses the visibility and equity concerns without expanding the CRediT role count.

    First, clarify the Supervision definition. The current definition bundles mentorship with project leadership. The two could be separated within the existing role through definitional refinement: the role description could be revised to explicitly recognise mentorship as a sub-activity within Supervision, with guidance on when each is being discharged. This is a low-cost intervention that does not require a new role.

    Second, add a structured qualifier for Supervision. The existing degree-of-contribution qualifier already provides lead/equal/supporting. A sub-qualifier indicating whether the Supervision was project-oriented, mentorship-oriented, or both, would add the granularity without adding a role. This is a small schema change with substantial value.
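
    One way to picture the proposed sub-qualifier is as an optional field that is only valid on Supervision. The sketch below is a hypothetical model, not part of any CRediT schema: the names `SupervisionFocus`, `RoleAssertion`, and `focus` are our assumptions, while the lead/equal/supporting values come from the existing degree-of-contribution qualifier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Degree(Enum):
    LEAD = "lead"
    EQUAL = "equal"
    SUPPORTING = "supporting"


class SupervisionFocus(Enum):  # hypothetical sub-qualifier values
    PROJECT = "project"
    MENTORSHIP = "mentorship"
    BOTH = "both"


@dataclass(frozen=True)
class RoleAssertion:
    contributor: str
    role: str
    degree: Optional[Degree] = None
    focus: Optional[SupervisionFocus] = None

    def __post_init__(self) -> None:
        # The sub-qualifier is meaningful only on the Supervision role.
        if self.focus is not None and self.role != "Supervision":
            raise ValueError("focus applies only to the Supervision role")


pi = RoleAssertion("PI", "Supervision", Degree.LEAD, SupervisionFocus.PROJECT)
mentor = RoleAssertion("Mentor", "Supervision", Degree.SUPPORTING,
                       SupervisionFocus.MENTORSHIP)
```

    With this shape, the paper where the PI supervised and a separate colleague mentored is expressible as two Supervision assertions that differ only in `focus`, and the role count stays at 14.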

    Third, build the recognition layer outside CRediT. The narrative-CV format, mentorship-specific recognition programmes, and institutional career-development frameworks should carry mentorship recognition at a sustained-relationship granularity that CRediT cannot. The mentorship recognition that early-career researchers most value is not the per-paper notation; it is the cumulative recognition of mentorship across a career. The CASRAI institutional mentorship guide walks through the recognition options.

    What CRediT v2026.3 should do

    Our recommendation for the v2026.3 revision discussion: do not add a Mentorship role; do refine the Supervision definition to recognise mentorship explicitly; do add a sub-qualifier capturing the project/mentorship/both distinction; do coordinate with the narrative-CV and institutional-recognition communities to ensure that the cumulative mentorship recognition picture is captured outside CRediT.

    This is the position we lean toward, with the explicit acknowledgment that reasonable people disagree. The discussion at the December 2025 CRediT stewardship meeting was substantive; the community consultation through 2026 will be the place to settle it. The CASRAI CRediT governance page tracks the consultation process and welcomes input from the broader community.

    A broader observation

    The mentorship question is one instance of a broader pattern. CRediT, as a per-paper contributorship taxonomy, captures certain things well and certain things less well. The work that spans papers (sustained mentorship, leadership of a community, contribution to standards, infrastructure stewardship) does not fit naturally into a per-paper taxonomy. The right response is not to expand CRediT to cover everything but to build complementary recognition mechanisms for what CRediT does not capture.

    This is the argument running through the responsible-assessment community, the narrative-CV adoption push, and the CoARA reform agenda. CRediT is part of the picture, not the whole picture. A senior researcher’s contribution profile is captured by CRediT statements on their papers, by their narrative CV, by their teaching record, by their mentorship record, by their service to the community. The integrated picture is the goal; CRediT is one component.

    Practical recommendations

    Three for institutions. First, capture mentorship in your institutional records and recognition systems; do not wait for it to be a CRediT role. Second, train promotion-and-tenure committees to read mentorship contribution explicitly when reviewing dossiers. Third, support narrative-CV formats that surface mentorship.

    Three for researchers. First, claim your mentorship contribution in narrative CVs and professional records; do not depend on per-paper CRediT to capture it. Second, in CRediT statements, use Supervision appropriately and consider noting the mentorship dimension in the prose contribution statement that accompanies the structured CRediT. Third, contribute to the CRediT consultation if you have a view on the question.

    Three for the CRediT stewardship community. First, run the v2026.3 consultation transparently and document the outcomes. Second, coordinate with the responsible-assessment community on the broader recognition picture. Third, treat the question of taxonomic expansion as a serious one with substantive trade-offs, not as a routine update.

  • Cross-institutional CRIS interoperability: the CERIF-Pure-VIVO triangle

    Current Research Information Systems (CRIS) have been a critical institutional infrastructure layer for two decades, capturing researchers, their outputs, their funding, their collaborations, their projects. The three dominant data models in 2026 are CERIF (the European standard model, maintained by euroCRIS), Pure (the Elsevier-operated CRIS, with the largest market share in research-intensive universities), and VIVO (the open-source community-maintained CRIS with strong North American adoption). The three are convergent in intent and divergent in detail. This post is a practical guide to interoperating across them.

    What each model is

    CERIF is the Common European Research Information Format, maintained by euroCRIS since 2002. CERIF is a data model, not a system; it specifies the entities and relationships a CRIS should track and how they should be expressed in XML or RDF. CERIF-CRIS systems exist in many forms (the original CERIF reference implementations, Elsevier’s Pure with CERIF compliance, various national-system implementations) and CERIF compatibility is the lingua-franca claim in the European CRIS market.

    Pure is the dominant commercial CRIS product, used by hundreds of research-intensive universities globally. Pure has its own data model, which is broadly CERIF-compatible but with vendor-specific extensions and refinements. Pure’s market position means its data model functions as a de facto standard regardless of its formal status.

    VIVO is the open-source community-maintained CRIS originally developed at Cornell and now maintained by an international community under the DuraSpace umbrella. VIVO is built on a semantic-web foundation with an RDF/OWL ontology and explicit federation-friendly design. VIVO has strong adoption in US research universities and a growing international community.

    Where the models align

    The three converge on the core entities. All three model researchers (with ORCID iDs), organisations (with ROR IDs), publications (with DOIs), projects (with funding metadata), and the relationships between them. The CRediT roles can be expressed in all three. The funder-grant-output structure is representable in all three. For the 80% of routine queries against a CRIS, all three produce comparable answers.

    The convergence has been substantially driven by external standards. ORCID, ROR, DOIs, Crossref Funder Registry, CRediT, RDA DMP Common Standard — these external persistent-identifier and metadata standards have pulled the CRIS models toward common representations even where the internal models differ. The CASRAI research information systems domain tracks the convergence.

    Where the models diverge

    Three areas of substantial divergence.

    First, granularity of activities. CERIF models a wide range of research activities at different granularities (projects, work packages, deliverables, milestones); Pure focuses on the publication-centric workflow with project as a supporting entity; VIVO’s ontology accommodates both but is community-extended in ways that vary by deployment. An institution moving CRIS from one platform to another typically loses or transforms activity-level data in ways that require careful migration planning.

    Second, contribution and contributorship. CERIF’s contributor structure has evolved to carry CRediT roles natively. Pure carries CRediT but with vendor-specific extensions. VIVO’s ontology can express CRediT but the per-deployment representation varies. A research output with structured contributorship in one CRIS may lose detail when exported to another.

    Third, extension and customisation. Pure customers heavily customise their deployments with institution-specific fields and workflows; VIVO sites likewise extend the ontology. The customisations are valuable locally and problematic for cross-institutional interoperability. A federated query that works at one institution may return different fields at another, even where both claim CERIF compliance.
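
    The federated-query problem can be illustrated with plain set arithmetic. This is a toy sketch: the field names and the two deployments are invented, and `portable_fields` and `local_extensions` are hypothetical helpers, not functions of any CRIS.

```python
def portable_fields(*deployments):
    """Fields that every deployment exposes, and so are safe to query across all."""
    field_sets = [set(fields) for fields in deployments]
    if not field_sets:
        return set()
    return set.intersection(*field_sets)


def local_extensions(deployment, *others):
    """Fields this deployment exposes that are not shared by all the others."""
    return set(deployment) - portable_fields(deployment, *others)


# Two hypothetical deployments that share a core and add local fields.
site_a = {"orcid", "ror", "doi", "credit_role", "lab_code"}
site_b = {"orcid", "ror", "doi", "credit_role", "assessment_unit"}

shared = portable_fields(site_a, site_b)
only_a = local_extensions(site_a, site_b)
```

    A federated query restricted to `shared` behaves the same at both sites; anything in `only_a` is exactly the kind of locally valuable field that goes missing or changes meaning across institutions.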

    The interoperability layer

    The practical interoperability layer in 2026 runs through three exchange mechanisms.

    OpenAIRE-CRIS is the European interoperability profile for CERIF, defining a subset of CERIF that all participating CRIS systems can emit and consume. OpenAIRE consumes CERIF-CRIS feeds via OpenAIRE-CRIS and incorporates them into the OpenAIRE Graph. Most European institutional CRIS systems can produce OpenAIRE-CRIS-compliant feeds with modest configuration.

    ORCID-CRIS integration is the per-researcher exchange channel. A CRIS depositing publication and affiliation data to ORCID, and consuming corrections back from ORCID, becomes a node in the ORCID-anchored researcher record. All three major CRIS models support ORCID integration, though the depth varies.

    Crossref event data and citation feeds provide the publication-level exchange. A CRIS that ingests Crossref event data picks up post-publication corrections, citations, and relationship updates that the local CRIS would otherwise miss.

    The three exchange mechanisms together cover most of what cross-institutional interoperability requires. They do not cover the activity-level data that diverges across CRIS models; that data remains harder to interoperate.

    What institutions should do

    For institutions selecting or migrating a CRIS, the practical recommendations are: prioritise CERIF compliance regardless of vendor; require ORCID integration; require Crossref event-data ingestion; verify OpenAIRE-CRIS compliance for institutions with European funder reporting obligations; insist on data-export capability that includes the full activity-level data, not just the publication-centric subset.

    For institutions operating an established CRIS, the priorities are to keep the integration layers current (ORCID 4.0 transition, OpenAIRE-CRIS profile updates, Crossref REST API consumption), to invest in metadata-quality QA, and to participate in the CERIF, Pure, or VIVO community work to influence the data-model evolution.

    For CRIS vendors, the priorities are to honour the convergent standards (ORCID, ROR, CRediT, OpenAIRE-CRIS) without burying them under vendor-specific extensions, and to make data export and import paths reliable across customer transitions. The market would benefit from less lock-in friction; the standards work supports that direction.

    The euroCRIS-DuraSpace-Elsevier triangle

    Beneath the technical layer is an organisational layer. Three organisations together substantially set the direction of CRIS evolution: euroCRIS as a standards body, DuraSpace as the VIVO open-source community steward, and Elsevier as the Pure operator. The 2024-2025 coordination work (visible in the joint CERIF-VIVO ontology alignment, the Pure CERIF-compliance certification process, the OpenAIRE-CRIS profile refinement) has been more productive than that of the prior decade.

    The convergence is incomplete and uneven, but the direction is clear. By 2028, cross-CRIS interoperability for the standard entities (researchers, outputs, projects, funding) should be a routine technical exercise, not a multi-year integration project. The activity-level interoperability will follow more slowly.

  • Welcome to CASRAI on its new home

    The CASRAI site has migrated to a modern headless architecture: WordPress manages the editorial layer at /wp/; Next.js / Faust.js renders the public site with full Schema.org markup, machine-readable bulk-data endpoints, and a tightly federated link structure to our partner standards bodies. Everything is openly licensed under CC BY 4.0.

    If you previously bookmarked the old casrai.org, every legacy URL still resolves — and there is far more on the new site to explore.

  • CASRAI Dictionary v2026.1 — first release

    After two decades of standards stewardship, CASRAI ships its first modernised dictionary release: v2026.1 contains 714 entries across 20 thematic domains organised into five tracks (contribution, identifiers, data & methods, compliance, assessment). Every entry has a stable URI, Schema.org DefinedTerm markup, and JATS / JSON-LD encoding ready for any CRIS, repository, or publisher pipeline.

    Read the State of CRediT 2026 annual report for the broader picture, or browse the dictionary directly at /dictionary.

  • CRediT for AI-generated content: where the line is

    The ICMJE 2023 position is settled: artificial-intelligence systems cannot be authors. The follow-on question that journals and authors continue to negotiate is how to represent, in a contributorship statement, the human work that goes into producing AI-assisted content. When a co-author prompts an LLM to draft a section, verifies the output, edits it, and stands behind it, which CRediT role describes their contribution? This post proposes a working line.

    The shape of the question

    Three scenarios make the question concrete.

    Scenario one: an author uses an LLM to polish prose in a draft they wrote. The intellectual content is theirs; the language is partly the model’s. The CRediT role is straightforwardly Writing – original draft for the author.

    Scenario two: an author uses an LLM to draft a first version of a section, which they then heavily revise. The first draft is the model’s; the final draft is the author’s, but the model substantively shaped what the final draft says. The CRediT role is still Writing – original draft for the author, but the contribution is meaningfully different from scenario one.

    Scenario three: an author uses an LLM to propose a study design, which they then refine. The intellectual content of the methodology was partly the model’s. The CRediT role is Methodology for the author, but again the contribution is meaningfully different from the unaided version.

    In all three, the human author is the role-holder; the model is not a co-author. What is different across the scenarios is the magnitude and the character of the human contribution. CRediT, as currently constituted, does not distinguish these.

    The working line

    Our proposed line is the verification-and-responsibility threshold. A human contributor who has substantively verified the AI-generated content, taken responsibility for it, and is prepared to defend it in correspondence or post-publication discussion is properly credited with the relevant CRediT role. The role describes what they contributed to the paper, which includes verification work even if the first-draft work was the model’s.

    The threshold is not met where the human contribution is insubstantial: a contributor who pasted a prompt, accepted the output without verification, and added their name to the paper has not discharged the role and should not be credited. This is the same line that has always applied to non-AI cases (a co-author who did not contribute should not be credited; gift authorship is a well-recognised failure mode).

    The line is therefore not about AI use per se; it is about whether the human contribution clears the substantive-contribution threshold. AI use does not displace the threshold; it changes what discharging the role looks like in practice.

    Disclosure runs parallel

    The disclosure of AI use is a separate question, addressed via publisher-mandated AI disclosure declarations. The disclosure says what tools were used and for what; the CRediT statement says who contributed what to the paper. The two run parallel and are both required by most major publishers in 2026. The CASRAI AI disclosure for authors guide walks through the publisher-by-publisher requirements.

    Implications for specific roles

    Writing – original draft

    The most common case. A human author whose draft was AI-assisted is properly credited with Writing – original draft if they verified the content, took responsibility for it, and produced the version that is the paper. The disclosure declaration says the AI was used; the CRediT statement names the human as the writer-of-record.

    Methodology and Formal analysis

    More delicate. If an AI-assisted statistical-discovery tool proposed a method or an analytic approach, the human contributor’s role is partly verification (was the proposal sound?) and partly extension (refining the proposal into the actual method). The CRediT role is still Methodology and/or Formal analysis for the human, but the verification dimension is foregrounded. If the human did not verify — accepted the AI proposal without independent assessment — the contribution is weaker and may not clear the role threshold.

    Investigation

    A subtle case. AI-assisted data extraction (e.g., from imaging, from medical records, from text corpora) involves a human contribution that runs from setup through verification to interpretation. Investigation includes the data-gathering activity; an AI-assisted version still has a human Investigation lead, who is responsible for the setup, the verification of extracted data, and the handling of errors.

    Validation

    Perhaps the most directly affected. Where AI tools are used for cross-checking, sensitivity analyses, or reproduction of results, the human Validation contributor is responsible for setting up the validation, interpreting its results, and acting on discrepancies. The AI does the mechanics; the human does the judgement.

    Visualization

    AI-assisted figure generation is increasingly common. The human Visualization contributor is responsible for the figure-design decisions, for verifying that the AI-generated figure accurately represents the data, and for the final version that appears in the paper. Where the AI generated an image that the human did not substantively verify, the threshold may not be cleared.

    The role-as-recognition trap

    A failure mode to flag explicitly. The temptation, when AI did most of the actual production work, is to inflate the human contributor’s role assignment to compensate. “The AI wrote the draft, but I prompted it, so I should still be Lead on Writing – original draft.” This is a misreading. The CRediT role is a description of contribution; if the human contribution was “prompted and accepted”, that is a smaller contribution than “drafted, verified, revised, took responsibility.” Calling both “Lead” obscures the difference.

    The remedy is the degree-of-contribution qualifier. A human contributor whose AI-assisted contribution was substantial may be Lead; one whose contribution was lighter may be Supporting. The qualifier discipline forces an honest assessment of magnitude.
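
    One way to keep the qualifier discipline visible is to record each contribution as a structured triple of contributor, role, and degree. The sketch below is a minimal illustration, assuming a lead/equal/supporting vocabulary for the degree; the `Contribution` class and `format_contribution` helper are hypothetical names, not part of any CRediT tooling.

```python
from dataclasses import dataclass

# The 14 roles of the CRediT taxonomy.
CREDIT_ROLES = frozenset({
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing – original draft",
    "Writing – review & editing",
})

# Degree-of-contribution qualifiers assumed for this sketch.
DEGREES = ("lead", "equal", "supporting")


@dataclass(frozen=True)
class Contribution:
    contributor: str
    role: str
    degree: str

    def __post_init__(self):
        # Reject anything outside the taxonomy or the qualifier vocabulary.
        if self.role not in CREDIT_ROLES:
            raise ValueError(f"not a CRediT role: {self.role!r}")
        if self.degree not in DEGREES:
            raise ValueError(f"not a recognised degree: {self.degree!r}")


def format_contribution(c):
    """Render one line of a contributorship statement."""
    return f"{c.contributor}: {c.role} ({c.degree})"


# Hypothetical contributors: one drafted, verified and revised; one assisted.
statement = [
    Contribution("A. Author", "Writing – original draft", "lead"),
    Contribution("B. Author", "Writing – original draft", "supporting"),
]
for item in statement:
    print(format_contribution(item))
```

    The structure does not decide who deserves "lead"; it only makes the claimed magnitude explicit so that an editor can ask about it.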

    Where this leaves the AI-assistance-role question

    We have argued elsewhere that a 15th CRediT role explicitly for AI assistance is worth considering. The argument from this post is partly orthogonal: the existing 14 roles can accommodate AI-assisted work if the verification-and-responsibility threshold is honoured and the qualifier is used honestly. The case for a 15th role rests on whether the structured disclosure-of-AI-use is better placed inside the contributorship statement or outside it. Reasonable people disagree; we lean toward keeping AI disclosure parallel to CRediT rather than inside it, with attention to the verification-and-responsibility line.

    Practical recommendations

    Three for authors. First, treat AI assistance as a tool, not a substitute. Verify, edit, and take responsibility for what appears in the paper. Second, assign CRediT roles based on what you contributed including verification, not based on what the AI produced. Third, disclose AI use in the publisher-mandated declaration; the disclosure runs parallel to CRediT, not inside it.

    Three for editors. First, treat the verification-and-responsibility threshold as the operating standard for AI-assisted contributorship. Second, require both the CRediT statement and the AI-use disclosure at submission. Third, where a contributorship statement looks like it may reflect AI-assistance role inflation, ask the standard editorial question: what did this contributor actually do?

    Three for the broader system. First, harmonise AI-disclosure formats across publishers (work the NISO and COPE community has begun). Second, maintain the contributorship-versus-disclosure separation; do not collapse them. Third, evaluate the case for a 15th CRediT role on its merits, including the costs of taxonomic expansion.

  • How the Software role applies to code-only outputs

    A growing fraction of research output is code: software libraries that implement a method, computational notebooks that demonstrate an analysis, simulation frameworks that enable a body of work, infrastructure tooling that supports a research community. When the output is primarily code, the CRediT Software role carries weight that the role’s brief definition does not fully prepare it for. This post is a practical guide to assigning Software in code-centric contexts.

    The Software role, briefly

    The CRediT Software role is defined as: Programming, software development; designing computer programs; implementation of the computer code and supporting algorithms; testing of existing code components. The definition is short and was written with software-as-tool-for-a-paper in mind, not software-as-the-paper.

    For a conventional research paper where someone wrote analysis code that supported the science, Software is straightforward: the person who wrote the analysis code gets the role. For a paper whose primary scholarly contribution is the code itself — a JOSS paper, a software-methods paper, a tool announcement — Software is the dominant role and the brevity of the definition starts to bite.

    What the Software role should cover in a code-only context

    Our recommendation, distilled from the practice of JOSS, the Software Sustainability Institute, the Research Software Engineers community, and several years of CASRAI editorial work, is to read Software in code-only contexts as encompassing the following five sub-activities, all of which should be visible in the contributorship statement even if they share the role.

    Implementation: writing the production code itself. This is the core of Software and is what people most naturally associate with the role.

    Architecture and design: the higher-level decisions about how the code is structured, what its dependencies are, how its modules interact. In a code-only paper, architecture is part of the intellectual contribution and the architect should be a co-author with Software role.

    Testing: writing the test suite, including unit tests, integration tests, and regression tests. A code-only paper with a credible test suite has someone who built it.

    Documentation: user-facing documentation, developer-facing documentation, README, examples, tutorials. For code intended for reuse, documentation is part of the deliverable; the documentation contributor gets the Software role.

    Packaging and release: the engineering work of making the code installable, citable, and resolvable from a citation. CI/CD configuration, dependency management, release-tagging, DOI registration. For long-lived code with multiple releases, this is sustained work; for a one-off code release accompanying a paper, it is still non-trivial.

    Each of these is a meaningful contribution that the Software role captures. A code-only paper’s CRediT statement should make the distribution of these activities across contributors visible, using the lead/equal/supporting qualifier to express relative magnitude.

    Where Software overlaps with other roles

    Three overlaps deserve attention.

    First, Software versus Methodology. If the code implements a novel method, the method itself is a Methodology contribution; the implementation is a Software contribution. The same person often discharges both, and the contributorship statement should assign both roles to them. The error to avoid is conflating the two: assigning Software while omitting Methodology under-represents the intellectual contribution.

    Second, Software versus Validation. Writing tests is Software (per the definition); validating the code against reference implementations or independent data is Validation. The distinction is genuine: tests verify that the code does what the developer intended; validation verifies that the code does what is scientifically correct. Both belong in a code-only paper’s contributorship.

    Third, Software versus Writing – original draft. The README, the developer documentation, the API reference — these are documentation, captured under Software. The paper itself, including its method description and its discussion of design choices, is captured under Writing – original draft. The boundary is the publication artefact: anything in the paper is Writing; anything in the code repository is Software.

    Cross-referencing with CITATION.cff

    The CITATION.cff convention, increasingly standard in scientific software repositories, gives the repository its own contributor record alongside the paper’s CRediT statement. The core CFF schema (version 1.2.0) defines authors and contact entries; it has no built-in type-of-contribution field, so integrators have extended it with CRediT-aligned vocabularies. The recommended pattern for a code-only paper is to maintain both: a CRediT statement in the paper (for the paper-level contributorship) and a CITATION.cff in the repository (for the per-version, per-component contributorship that CRediT cannot express).

    The two should be consistent. A contributor named in the paper with Software role should appear in the CITATION.cff with at least equivalent contribution; a contributor named in the CITATION.cff but not in the paper should be acknowledged in the paper’s acknowledgements section. The CASRAI CITATION.cff entry walks through the integration patterns.
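
    The consistency rule can be checked mechanically. The sketch below is one possible approach, not a CASRAI tool: it assumes the CITATION.cff has already been parsed into a dict (for example with a YAML parser) and that the paper's contributors are available as dicts with a `roles` list, which is a shape invented for this example. The `given-names`, `family-names`, `name`, and `orcid` keys follow the CFF person and entity fields; the function names and the people are hypothetical.

```python
def person_key(person):
    """Prefer the ORCID iD as identity key; fall back to a normalised name."""
    orcid = person.get("orcid")
    if orcid:
        # CFF stores ORCID as a URL; keep only the trailing iD.
        return orcid.rstrip("/").rsplit("/", 1)[-1]
    name = person.get("name") or (
        f"{person.get('given-names', '')} {person.get('family-names', '')}"
    )
    return " ".join(name.lower().split())


def check_consistency(paper_contributors, cff):
    """Compare a paper's CRediT statement with a parsed CITATION.cff.

    Returns (missing_from_cff, to_acknowledge):
      missing_from_cff: paper contributors holding the Software role
          who do not appear among the CFF authors.
      to_acknowledge: CFF authors who are not named in the paper.
    Matching is by ORCID iD or by name only, so a person listed with an
    ORCID in one record and without it in the other will not match.
    """
    cff_authors = cff.get("authors", [])
    cff_keys = {person_key(a) for a in cff_authors}
    paper_keys = {person_key(p) for p in paper_contributors}

    missing_from_cff = [
        p for p in paper_contributors
        if "Software" in p.get("roles", []) and person_key(p) not in cff_keys
    ]
    to_acknowledge = [a for a in cff_authors if person_key(a) not in paper_keys]
    return missing_from_cff, to_acknowledge


# Hypothetical records for illustration.
paper = [
    {"given-names": "Ada", "family-names": "Example",
     "roles": ["Software", "Methodology"]},
    {"given-names": "Ben", "family-names": "Sample", "roles": ["Software"]},
]
cff = {
    "cff-version": "1.2.0",
    "authors": [
        {"given-names": "Ada", "family-names": "Example"},
        {"given-names": "Cy", "family-names": "Maintainer"},
    ],
}
missing, acknowledge = check_consistency(paper, cff)
print("Add to CITATION.cff:", [p["family-names"] for p in missing])
print("Acknowledge in paper:", [a["family-names"] for a in acknowledge])
```

    A check like this only establishes presence. Whether the repository record reflects an "at least equivalent" contribution still needs a human read.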

    The maintenance question

    An unresolved aspect of Software in code-only contexts is how to credit maintenance over time. A research software package may have a paper at first release, with a CRediT statement reflecting the founding contributors. Five years and several major versions later, the package has new maintainers, new contributors, and a substantially different code base. The original paper’s CRediT statement is increasingly out of date.

    The current pragmatic answer is: the paper’s CRediT statement freezes at publication; the CITATION.cff in the repository tracks current contributorship; downstream citation should reference both, with the paper as the publication-of-record and the CFF as the current-contributor record. This works but is imperfect. The Software Citation Working Group has been chewing on whether per-version CRediT statements, deposited to Crossref via the related-identifier mechanism, would be a cleaner answer; the proposal is technically viable but not yet a community consensus.

    What journals should do

    For journals publishing software papers, the recommended editorial practices are: require CRediT with qualifiers in the paper; require a CITATION.cff in the linked repository; verify that the two are consistent; for major software packages, accept and publish supplementary contributor records that go beyond the byline.

    JOSS is the maturity reference here and most other software-paper venues are moving toward similar practices. The CASRAI CRediT for software papers guide is updated quarterly with current practice.

    What authors should do

    For authors of code-only papers, four practical steps. First, distribute the Software role across the five sub-activities visibly, using the qualifier. Second, assign Methodology when the code implements a novel method. Third, maintain the CITATION.cff in the repository in parallel with the paper’s CRediT statement. Fourth, plan for the maintenance-credit question: who will maintain the code, how their contribution will be recognised over time, where the credit will live.

    The CRediT taxonomy can support code-only outputs well, with attention. The work is in using the Software role thoughtfully, in interlocking it with Methodology and Writing where appropriate, and in maintaining the parallel record in the repository.

  • Carbon-aware computing for academic HPC clusters

    Academic high-performance computing has a material climate footprint. A modern HPC cluster running at scale draws power in the megawatt range; the embodied carbon of the hardware, the operational carbon of the grid electricity, and the cooling overhead together produce annual emissions comparable to a mid-sized industrial facility. The sustainable-research community has been working on this since the late 2010s; 2026 is the year that carbon-aware computing moved from research interest to operational practice at academic clusters. This post walks through what’s happening and what cluster operators should be doing.

    What carbon-aware computing means

    Carbon-aware computing is a family of techniques for reducing the carbon footprint of computational work without reducing the work itself. The techniques include: temporal shifting, running non-urgent jobs during periods of low-carbon-intensity grid electricity; geographic shifting, running jobs at facilities with cleaner local grids; load-following, scaling cluster capacity with grid carbon intensity; efficiency improvements, doing more work per kilowatt-hour through hardware and software optimisations; demand reduction, eliminating redundant or wasteful computation.

    The CASRAI carbon-aware computing entry tracks the terminology and the academic community’s evolving vocabulary.

    What’s changed in 2025-2026

    Three things converged in 2025-2026 to move carbon-aware computing into practical academic deployment.

    First, real-time grid carbon-intensity data became reliable. The Electricity Maps API, Tomorrow’s national emissions data, and several regional grid operators’ direct data feeds now provide sub-hourly carbon-intensity data for most major grids. Scheduling decisions can be made on near-real-time information, not on average historical data.

    Second, scheduler integrations matured. Slurm, PBS Pro, and other major HPC schedulers now have plugin or integration paths for carbon-aware scheduling. The plugins consume carbon-intensity feeds and influence job dispatch according to configurable policies. The integrations are not yet universal but are no longer bespoke.

    Third, institutional commitments matured. The major UK research councils’ joint commitment to net-zero research by 2040, the EU’s broader sustainability-in-research push under the European Green Deal, several US universities’ institutional net-zero commitments — these created the policy mandate that aligns with the technical capability.

    What clusters are doing

    A non-exhaustive tour of the patterns we see at academic clusters in 2026.

    Temporal scheduling for batch jobs. Most clusters have substantial batch workloads where the deadline is days or weeks out. Carbon-aware schedulers shift these jobs to grid-low-carbon windows. The University of Edinburgh’s ARCHER2, the Stuttgart HLRS cluster, and the Berkeley Lab NERSC system have all reported carbon savings in the 15-25% range from temporal shifting without measurable impact on time-to-result for affected jobs.
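
    The core of temporal shifting can be sketched in a few lines: given an hourly carbon-intensity forecast and a deadline, pick the start hour whose run window has the lowest total intensity. This is a simplified illustration, not how any particular scheduler plugin works; the function name and the forecast values are made up, and a real policy would also weigh queue state, fairness, and forecast uncertainty.

```python
def best_start_hour(forecast, duration_hours, deadline_hour):
    """Return (start_hour, mean_intensity) for the lowest-carbon window.

    forecast: hourly grid carbon intensity in gCO2e/kWh, index 0 = now.
    duration_hours: job run time in whole hours.
    deadline_hour: hour offset by which the job must have finished.
    """
    latest_start = min(deadline_hour, len(forecast)) - duration_hours
    if duration_hours <= 0 or latest_start < 0:
        raise ValueError("job does not fit before the deadline")

    # Sliding-window sum over every feasible start hour.
    window_sum = sum(forecast[:duration_hours])
    best_start, best_sum = 0, window_sum
    for start in range(1, latest_start + 1):
        window_sum += forecast[start + duration_hours - 1] - forecast[start - 1]
        if window_sum < best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / duration_hours


# Made-up forecast for the next eight hours (gCO2e/kWh).
forecast = [420, 390, 310, 180, 150, 160, 280, 400]
start, mean_intensity = best_start_hour(forecast, duration_hours=3, deadline_hour=8)
run_now = sum(forecast[:3]) / 3
print(f"start in {start} h, mean {mean_intensity:.0f} vs {run_now:.0f} gCO2e/kWh if run now")
```

    The tighter the deadline, the fewer windows are feasible, which is why the savings depend on jobs being marked as flexible only when they genuinely are.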

    Geographic shifting for cloud-burst capacity. Clusters with cloud-burst arrangements for peak loads are increasingly directing burst capacity to cloud regions with cleaner grids. The carbon savings here are large per job but only apply to the burst fraction.

    Idle reduction. The least glamorous and most impactful intervention. Clusters typically have substantial idle capacity due to scheduling fragmentation; running fewer nodes more efficiently produces direct emissions reduction. The pattern is to consolidate workload onto fewer nodes during low-demand periods and power down the rest, which requires the ability to bring nodes back up reliably when demand rises.
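
    The consolidation step is essentially a bin-packing problem. The sketch below uses the first-fit-decreasing heuristic to pack single-node jobs onto as few nodes as possible and reports how many nodes could be powered down. It is a toy model under stated assumptions: jobs are described only by core count, every node is identical, and multi-node jobs, memory, and node wake-up latency are ignored. The function name and the workload are invented for illustration.

```python
def consolidate(job_cores, cores_per_node, total_nodes):
    """Pack jobs onto few nodes using first-fit decreasing.

    Returns (assignments, idle_nodes): assignments[i] lists the core
    counts placed on node i; idle_nodes is how many nodes are left
    empty and could be powered down.
    """
    free = []          # remaining cores on each node already in use
    assignments = []   # jobs placed on each node already in use
    for cores in sorted(job_cores, reverse=True):
        if cores > cores_per_node:
            raise ValueError("job needs more cores than a single node has")
        for i, remaining in enumerate(free):
            if cores <= remaining:
                free[i] -= cores
                assignments[i].append(cores)
                break
        else:
            # No node in use has room, so bring another node into use.
            free.append(cores_per_node - cores)
            assignments.append([cores])
    if len(assignments) > total_nodes:
        raise ValueError("not enough nodes for this workload")
    return assignments, total_nodes - len(assignments)


# Made-up low-demand workload on a six-node partition of 64-core nodes.
placement, idle = consolidate([32, 16, 16, 8, 48, 8], cores_per_node=64, total_nodes=6)
print(placement, "nodes that can be powered down:", idle)
```

    First-fit decreasing is a heuristic, so it will not always find the minimum number of nodes, but it is simple and usually close.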

    Hardware efficiency. The energy-per-flop trajectory in HPC hardware has been favourable; recent-generation hardware is materially more efficient than 5-year-old hardware. The cluster-refresh-cycle question becomes a sustainability question: when does the embodied carbon of new hardware get amortised by the operational savings? Mark Allen and the Green Software Foundation have published useful frameworks here.
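
    The amortisation question reduces to simple arithmetic once the inputs are estimated: divide the embodied carbon of the new hardware by the operational carbon it saves each year. The sketch below is a back-of-envelope version with made-up numbers; it is not drawn from any published framework, and the function name and every input value are assumptions for illustration.

```python
def payback_years(embodied_kg, old_kwh_per_year, new_kwh_per_year, grid_kg_per_kwh):
    """Years until operational savings repay the embodied carbon.

    embodied_kg: embodied emissions of the new hardware, kgCO2e.
    old_kwh_per_year / new_kwh_per_year: annual energy for the same
        workload on the old and the new hardware.
    grid_kg_per_kwh: grid carbon intensity, kgCO2e per kWh.
    """
    annual_saving_kg = (old_kwh_per_year - new_kwh_per_year) * grid_kg_per_kwh
    if annual_saving_kg <= 0:
        return float("inf")  # the refresh never pays back on carbon alone
    return embodied_kg / annual_saving_kg


# Illustrative per-node numbers only.
for grid in (0.25, 0.05):
    years = payback_years(1500, 6000, 4000, grid)
    print(f"grid {grid} kgCO2e/kWh -> payback {years:.1f} years")
```

    With these inputs the payback is 3 years on the dirtier grid and 15 years on the cleaner one: the cleaner the electricity, the longer it takes for an efficiency-driven refresh to repay its embodied carbon.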

    Software efficiency. Often overlooked. A scientific code that uses 30% less compute for the same result delivers roughly a 30% saving in operational emissions. Code-efficiency efforts at HPC centres (profiling, algorithmic improvements, library updates) have outsized impact. The Software Sustainability Institute has been advocating this for years and is finally getting traction.

    The reporting and accounting layer

    An emerging challenge is how to report computational carbon to funders and institutional sustainability offices. The CodeCarbon library, ML CO2 calculator, and several others provide per-job carbon-estimation tools. The estimates are approximate but useful at the order-of-magnitude level. Major HPC centres are now publishing annual carbon reports; the methodology varies and harmonisation work is underway via the Green HPC working group.
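
    For an order-of-magnitude figure, a per-job estimate can be as simple as energy multiplied by grid intensity. The sketch below is a hand-rolled approximation, not the method or API of CodeCarbon or any other tool named above. It assumes an average node power draw, a facility overhead factor (PUE) to account for cooling, and a single average grid intensity for the run, and it covers operational emissions only. All input values are invented.

```python
def job_emissions_kg(node_hours, avg_node_power_kw, pue, grid_g_per_kwh):
    """Rough operational emissions of one job, in kgCO2e.

    node_hours: nodes used multiplied by wall-clock hours.
    avg_node_power_kw: assumed average draw per node while the job runs.
    pue: power usage effectiveness of the facility (>= 1.0).
    grid_g_per_kwh: average grid carbon intensity during the run.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    energy_kwh = node_hours * avg_node_power_kw * pue
    return energy_kwh * grid_g_per_kwh / 1000.0


# Illustrative job: 16 nodes for 10 hours at 0.5 kW per node.
estimate = job_emissions_kg(node_hours=16 * 10, avg_node_power_kw=0.5,
                            pue=1.2, grid_g_per_kwh=200)
print(f"about {estimate:.1f} kgCO2e")
```

    Reports built on estimates like this are only comparable across centres if the power, PUE, and intensity assumptions are disclosed, which is the harmonisation problem in miniature.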

    The CASRAI sustainable research domain is tracking the reporting standards. Our recommendation is that funders should ask for computational carbon estimates in proposals for compute-intensive work, with the estimate framed as a planning aid rather than a hard constraint.

    What researchers should do

    Three practical recommendations for researchers running compute-intensive work.

    First, profile your code. The single highest-impact intervention is identifying the parts of the workflow that consume disproportionate resources. The Performance Optimisation and Productivity (POP) network in Europe and similar initiatives elsewhere provide free or low-cost profiling support. A well-profiled and reasonably optimised code typically achieves 1.5-3x the throughput per kWh of an unprofiled version of the same workflow.

    Second, use carbon-aware schedulers where available. If your cluster supports temporal shifting, mark jobs as deadline-flexible where they genuinely are. The scheduler will exploit the flexibility; the carbon savings accrue without effort on your part.

    Third, report and account. Include computational-carbon estimates in your project’s environmental reporting. Make the cost visible. The cultural shift that follows visibility is the longest-term impact.

    What institutions should do

    For institutional HPC operations, the 2026 priorities are: deploy carbon-aware scheduling; publish annual carbon reports with methodology disclosure; integrate computational-carbon estimation into the user-facing portal; participate in the inter-institutional benchmarking and best-practice exchange via the Green HPC working group.

    For institutional sustainability offices, the priority is to bring research computing into the institutional carbon accounting. Many institutional net-zero commitments under-count or omit research computing; this is a material reporting gap.

    For funders, the priority is to recognise sustainability as a legitimate cost item in compute-intensive grants and to use the proposal-stage carbon estimation as a planning input rather than a punitive metric. UKRI’s 2024 sustainability-in-research guidance is a useful model.

    The honest limits

    Carbon-aware computing reduces but does not eliminate HPC’s footprint. A genuinely net-zero research-computing posture requires either grid decarbonisation (largely outside HPC operators’ control) or computational-demand reduction. The demand-reduction conversation is uncomfortable, because it touches workloads such as large language model training, very-high-resolution climate modelling, and large-scale molecular dynamics, but it is increasingly unavoidable. The sustainable-research community needs to have it without flinching, while continuing the technical work that makes the unavoidable computational work as low-impact as feasible.
