Tag: CRediT taxonomy

  • How to write a CRediT author statement: a step-by-step guide

    The CRediT author statement has moved from novelty to routine: a large majority of major publishers now ask for one at submission, and many will not advance a manuscript without it. Yet it is still frequently drafted in the last hour before submission, by one author, from memory. That is a missed opportunity, because a statement assembled carelessly does exactly what a contribution taxonomy is meant to prevent: it misattributes work. This guide sets out a step-by-step method for producing one well. The authoritative reference is the page on how to write a CRediT author statement; this article walks the same ground in practice.

    First, know what CRediT is — and is not

    Before assigning a single role, fix two facts in mind. CRediT is a controlled vocabulary of fourteen contributor roles, each with a canonical definition and a stable identifier, maintained as a NISO standard. It records what people did. It is not a definition of authorship, and it is not a scoring system. The decision about who qualifies as an author is made separately, under the ICMJE criteria in biomedical fields and equivalent norms elsewhere; CRediT supplements that decision and does not replace it. Conflating the two is the most common error in this area, and it is worth reading the full account of authorship and accountability alongside this guide.

    The fourteen roles fall into four loose functional groups that make a useful checklist: planning and design (Conceptualization, Methodology, Software); research and analysis (Validation, Formal analysis, Investigation, Resources, Data curation); communication (Writing – original draft, Writing – review & editing, Visualization); and management (Supervision, Project administration, Funding acquisition). The canonical definitions are set out on the CRediT roles page, and you should assign against those definitions rather than against your intuition about what a role name implies.
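
    As a minimal sketch, the checklist above can be held as a small lookup table, which makes it easy to catch labels that are not exact CRediT role names. The names CREDIT_GROUPS, CREDIT_ROLES and unknown_roles are illustrative, and the four-way grouping is this guide's informal one rather than part of the standard.

```python
# The fourteen CRediT roles, grouped as in the paragraph above.
# Assumption: the grouping is this guide's informal checklist, not part of the standard.
CREDIT_GROUPS = {
    "planning and design": ["Conceptualization", "Methodology", "Software"],
    "research and analysis": [
        "Validation", "Formal analysis", "Investigation", "Resources", "Data curation",
    ],
    "communication": [
        "Writing – original draft", "Writing – review & editing", "Visualization",
    ],
    "management": ["Supervision", "Project administration", "Funding acquisition"],
}

# Flat list of all role names, in checklist order.
CREDIT_ROLES = [role for roles in CREDIT_GROUPS.values() for role in roles]


def unknown_roles(claimed):
    """Return the claimed labels that are not exact CRediT role names."""
    return [label for label in claimed if label not in CREDIT_ROLES]


if __name__ == "__main__":
    # "Drafting" is a house variant, so it is reported back for correction.
    print(unknown_roles(["Software", "Drafting"]))
```

    A check like this only confirms that a label is a valid role name; whether the person genuinely performed the role is still a judgement against the canonical definition.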

    Step 1: list every contributor, then settle the author line

    Start with people, not roles. Write down everyone who contributed to the work in any way, including those who may end up acknowledged rather than listed as authors. Then apply your field’s authorship test to decide who belongs on the author line. In biomedical research that is the ICMJE four criteria: substantial contribution to conception or design or to acquisition, analysis or interpretation of data; drafting or critically revising the work; final approval of the version to be published; and accountability for the work. Most publishers apply CRediT only to named authors, so settle the author line first.

    Step 2: assign roles to each named author against the canonical definitions

    Take each author in turn and ask, for each of the fourteen roles, whether they genuinely performed that contribution as the definition describes it. Some pointers that prevent common mistakes:

    • Investigation is performing the experiments or collecting the data — not the same as Data curation, which is annotating, cleaning, and maintaining the data for reuse.
    • Methodology is designing or developing the method; Software is writing the code that implements it. In some fields these overlap, but assign both only where both genuinely happened.
    • Writing – original draft is preparing the initial draft; Writing – review & editing is critical revision by members of the original research group. An author who only commented on a near-final draft did the latter, not the former.
    • Funding acquisition, Resources, and Supervision are legitimate roles, but on their own they may or may not meet the authorship bar in your field — record the contribution honestly and let the authorship test, not the role, decide.

    Be aware that even careful researchers given the same description sometimes disagree on which roles apply; the boundaries between adjacent roles are genuinely fuzzy. Treat the statement as an honest broad signal, not a precise measurement.

    Step 3: add the degree-of-contribution qualifier where it helps

    The standard supports an optional qualifier on each assignment — lead, equal, or supporting. It is not a percentage and it does not rank roles against one another; it distinguishes “I led this” from “I contributed to this.” Most published statements omit it because few publishers require it, but it is genuinely useful where several authors share a role: marking one author as lead on Writing – original draft and two as supporting conveys real information at almost no cost.
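
    One way to picture the qualifier is as an optional field on each author-role assignment. The sketch below assumes a hypothetical Assignment record and a "Role (degree)" display form; neither is prescribed by the standard, and journals that accept the qualifier may format it differently.

```python
from dataclasses import dataclass
from typing import Optional

# The three qualifier values named in the text above.
DEGREES = ("lead", "equal", "supporting")


@dataclass(frozen=True)
class Assignment:
    """One author holding one role, optionally with a degree of contribution."""
    author: str
    role: str
    degree: Optional[str] = None  # None means the authors stated no qualifier

    def __post_init__(self):
        # Reject anything that is not one of the three values, e.g. a percentage.
        if self.degree is not None and self.degree not in DEGREES:
            raise ValueError(f"unknown degree: {self.degree!r}")

    def label(self):
        # Assumption: "Role (degree)" is just one plausible way to display it.
        if self.degree is None:
            return self.role
        return f"{self.role} ({self.degree})"


if __name__ == "__main__":
    # Hypothetical example: three authors share the same role.
    shared = [
        Assignment("Priya Patel", "Writing – original draft", "lead"),
        Assignment("Erin Wright", "Writing – original draft", "supporting"),
        Assignment("Adam Lloyd", "Writing – original draft", "supporting"),
    ]
    for a in shared:
        print(f"{a.author}: {a.label()}")
```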

    Step 4: confirm with every author

    A contribution statement is a claim made on behalf of named people, so each named person should see and confirm their own roles before submission. This is not bureaucratic box-ticking. It is the step that catches the case where one author has, in good faith, attributed to themselves work that another person actually did — the failure mode that an honest taxonomy exists to surface. Circulate the draft statement; let each author correct their own line.

    Step 5: format it for the journal

    The conventional written form lists each author by name followed by their roles:

    Zhang San: Conceptualization, Methodology, Software. Priya Patel: Data curation, Writing – original draft. Erin Wright: Visualization, Investigation. Adam Lloyd: Supervision, Software, Validation. Maria García-López: Writing – review & editing.

    Many submission systems collect the same information through a structured form instead, which is better: a statement captured as structured metadata can propagate to Crossref and ORCID and be read by downstream systems, whereas a closing paragraph of prose cannot. Where the journal offers the structured route, use it. Where it only collects a narrative paragraph, write the paragraph above — but know that its value as machine-readable data is limited until the publisher’s plumbing catches up.
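
    Where the roles are already held as structured data, the narrative paragraph can be generated from that data rather than typed by hand, which keeps the two from drifting apart. This is a minimal sketch: format_statement is a hypothetical helper, and the "Name: Role, Role." layout simply follows the conventional form shown above.

```python
def format_statement(assignments):
    """Render {author: [roles]} as 'Name: Role, Role. Name: Role.'

    Authors appear in the order they were added to the mapping,
    so pass them in author-line order.
    """
    parts = []
    for author, roles in assignments.items():
        parts.append(f"{author}: {', '.join(roles)}.")
    return " ".join(parts)


if __name__ == "__main__":
    statement = format_statement({
        "Zhang San": ["Conceptualization", "Methodology", "Software"],
        "Priya Patel": ["Data curation", "Writing – original draft"],
    })
    print(statement)
```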

    Common pitfalls to avoid

    • Treating CRediT as the authorship test. It records contribution; it does not decide who qualifies as an author.
    • Claiming roles for acknowledged work. If a medical writer drafted the text or a technician ran the experiments, do not absorb their roles into an author’s line.
    • Over-assigning. Listing all fourteen roles for the senior author signals nothing. Assign only what was genuinely done.
    • Leaving author order to CRediT. CRediT does not encode author order; that is a separate decision your field’s conventions govern.

    Where shared vocabulary fits

    CRediT itself is settled; the live problem is that its real-world implementation is uneven, with many venues collecting only narrative paragraphs rather than structured metadata. A shared, federated vocabulary that defines the roles consistently and points back to NISO for the standard is what lets a statement written for one system mean the same thing when read by another. Supplying that definitional layer is the role the CASRAI dictionary is designed to play; the contributor-roles vocabulary sits in the CRediT extensions domain.

  • CRediT in JATS XML: a technical primer for production teams

    A contributor-roles statement is only as useful as it is machine-readable. A typesetter can render ‘A.B. wrote the original draft; C.D. supervised’ as a tidy paragraph at the foot of an article, but if that information lives only in prose then no downstream system — a research information system, an indexer, a funder’s reporting tool — can act on it. The point of CRediT, the Contributor Roles Taxonomy, is to make contributions structured, and in scholarly publishing ‘structured’ means encoded in JATS XML. This primer is for the production teams who actually do that encoding: the people for whom ‘add CRediT’ on a project plan turns into concrete decisions about elements, attributes and controlled vocabularies. The authoritative tag-level guidance is set out in the CRediT in JATS reference and the broader JATS implementation notes.

    Where contributor roles live in JATS

    JATS (the Journal Article Tag Suite, the NISO Z39.96 standard) models people in the <contrib-group> element. Each named individual is a <contrib>, carrying their name, affiliations and identifiers. The element that carries a contributor’s function is <role>, nested inside the relevant <contrib>. A single contributor may hold several roles, so multiple <role> elements per <contrib> are expected and entirely valid — one person might legitimately be tagged for Conceptualization, Methodology and Writing – review & editing.
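
    The nesting described above can be sketched with Python's standard xml.etree.ElementTree module. The helper build_contrib and the sample contributor are hypothetical, and the fragment is deliberately bare: a real <contrib> would also carry affiliations and identifiers, and the <role> elements here have no vocabulary attributes yet.

```python
import xml.etree.ElementTree as ET


def build_contrib(surname, given_names, roles):
    """Build a <contrib> holding a name and one <role> per contribution."""
    contrib = ET.Element("contrib", {"contrib-type": "author"})
    name = ET.SubElement(contrib, "name")
    ET.SubElement(name, "surname").text = surname
    ET.SubElement(name, "given-names").text = given_names
    for label in roles:
        # Several <role> elements inside one <contrib> are expected and valid.
        ET.SubElement(contrib, "role").text = label
    return contrib


group = ET.Element("contrib-group")
group.append(build_contrib(
    "Patel", "Priya",
    ["Conceptualization", "Methodology", "Writing – review & editing"],
))

if __name__ == "__main__":
    # The serializer escapes the ampersand in the last role label.
    print(ET.tostring(group, encoding="unicode"))
```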

    The job of a production team is to make those <role> elements unambiguous. Free-text role labels are not enough, because ‘wrote the paper’ and ‘drafting’ and ‘Writing – original draft’ are the same role expressed three ways. CRediT solves this by giving each of its roles a stable definition and a canonical identifier, and JATS provides the attributes to point at them.

    The JATS4R recommendation for encoding CRediT

    JATS4R (JATS for Reuse) is the community group that publishes interoperability recommendations for ambiguous corners of the standard, and it has a specific recommendation for CRediT. The core of it is that a <role> element used for a CRediT contribution should declare the vocabulary it draws from and the specific term within it. In practice this means four attributes work together:

    • vocab — identifies the controlled vocabulary as CRediT;
    • vocab-identifier — gives the URI of the taxonomy itself, so a consuming system can resolve what vocabulary is being used;
    • vocab-term and vocab-term-identifier — give the exact term and its canonical URI, so the role resolves to one and only one CRediT definition.

    The human-readable label remains the text content of the <role> element — that is what a reader sees — while the attributes carry the machine meaning. The recommendation is deliberate that the visible text and the term identifier must agree: do not tag a <role> as Data curation in its attributes while the visible text reads ‘Formal analysis’. JATS4R also advises using the official CRediT term strings verbatim rather than house variants, because verbatim strings are what validators and aggregators expect to match.
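
    A hedged sketch of a fully attributed <role> follows. The attribute names are those listed above; the URI values (https://credit.niso.org/ and the contributor-roles/<slug>/ pattern) and the slug derivation in term_uri are assumptions made for illustration and should be checked against the current JATS4R recommendation and the published CRediT term list before use.

```python
import xml.etree.ElementTree as ET

# Assumption: the taxonomy URI; verify against the current recommendation.
CREDIT_URI = "https://credit.niso.org/"


def term_uri(term):
    """Derive a term URI from the official term string (assumed URI pattern)."""
    slug = term.lower().replace(" – ", "-").replace(" & ", "-").replace(" ", "-")
    return f"{CREDIT_URI}contributor-roles/{slug}/"


def credit_role(term):
    """Build a <role> whose visible text and vocab-term are the same string."""
    role = ET.Element("role", {
        "vocab": "credit",
        "vocab-identifier": CREDIT_URI,
        "vocab-term": term,
        "vocab-term-identifier": term_uri(term),
    })
    role.text = term  # visible label agrees with the term attribute by construction
    return role


if __name__ == "__main__":
    print(ET.tostring(credit_role("Data curation"), encoding="unicode"))
```

    Generating the visible text and vocab-term from one string is a design choice: it makes the disagreement the recommendation warns about impossible at the point of creation.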

    Degrees of contribution

    CRediT permits, but does not require, a statement of the degree of a contribution, for example marking one contributor as having led a given role. JATS expresses this through an additional attribute on the <role> element rather than by changing the term identifier. Production teams should treat degree as optional metadata that is encoded only when the manuscript actually supplies it; inventing a lead/equal distinction where the authors stated none is a data-quality error, not an enhancement. When degree information is present, keep it consistent across the article so that a reader and a parser draw the same conclusion.
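
    The "encode only when supplied" rule can be sketched as a small helper. The attribute name degree-contribution is assumed here from recent JATS versions and should be confirmed against the tag library for the version you publish in; set_degree itself is a hypothetical function.

```python
import xml.etree.ElementTree as ET

ALLOWED_DEGREES = {"lead", "equal", "supporting"}

# Assumption: attribute name as found in recent JATS versions; confirm for yours.
DEGREE_ATTR = "degree-contribution"


def set_degree(role, degree):
    """Attach a degree to a <role> only if the manuscript supplied one."""
    if degree is None:
        return role  # nothing stated, so nothing is invented
    if degree not in ALLOWED_DEGREES:
        raise ValueError(f"unknown degree: {degree!r}")
    role.set(DEGREE_ATTR, degree)
    return role


if __name__ == "__main__":
    stated = ET.Element("role")
    stated.text = "Formal analysis"
    print(ET.tostring(set_degree(stated, "lead"), encoding="unicode"))

    unstated = ET.Element("role")
    unstated.text = "Validation"
    print(ET.tostring(set_degree(unstated, None), encoding="unicode"))
```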

    Common production pitfalls

    Several mistakes recur often enough to be worth naming. The first is putting CRediT roles in the wrong place — bundling them into an unstructured author-contributions paragraph in the article body instead of, or in addition to, the structured <role> elements. The structured encoding is the one machines read; a prose paragraph is a courtesy to humans, not a substitute. The second is omitting vocab-identifier and vocab-term-identifier, which leaves the role as plain text that cannot be reliably disambiguated. The third is term drift: lightly edited labels such as ‘Writing (review and editing)’ that no longer match the canonical CRediT string and therefore fail automated checks.

    A subtler issue is association: every <role> must sit inside the correct <contrib>. In articles with long author lists it is easy for a role to be attached to the wrong person during conversion, especially when contributions are supplied as a separate table that a typesetter merges by hand. Validating that each role resolves to the intended contributor is as important as validating that the term identifiers are correct.
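
    The checks named in this section (missing attributes, visible text that disagrees with the term, and roles attached to the wrong person) can be sketched as one QA pass. This is an illustration under stated assumptions, not a replacement for a real validator: check_roles is hypothetical, it identifies contributors by surname only, and the expected mapping stands in for whatever contribution table the authors supplied.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("vocab", "vocab-identifier", "vocab-term", "vocab-term-identifier")


def check_roles(contrib_group_xml, expected):
    """Return a list of problems. `expected` maps surname -> set of CRediT terms."""
    problems = []
    root = ET.fromstring(contrib_group_xml)
    for contrib in root.iter("contrib"):
        surname = contrib.findtext("name/surname", default="?")
        found = set()
        for role in contrib.findall("role"):
            text = (role.text or "").strip()
            term = role.get("vocab-term")
            missing = [attr for attr in REQUIRED if not role.get(attr)]
            if missing:
                problems.append(f"{surname}: role '{text}' lacks {', '.join(missing)}")
            elif term != text:
                problems.append(f"{surname}: text '{text}' disagrees with vocab-term '{term}'")
            found.add(text)
        # Association check: do the roles on this person match what was supplied?
        if surname in expected and found != expected[surname]:
            problems.append(
                f"{surname}: roles {sorted(found)} differ from supplied {sorted(expected[surname])}"
            )
    return problems
```

    In use, an empty list means the fragment passed these particular checks; it says nothing about schema validity, which the publisher's existing validation still has to cover.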

    Building it into the workflow

    The practical recommendation is to capture CRediT as structured data as early as possible — ideally at submission, where many manuscript systems now collect a contribution matrix — and to carry that structure through conversion rather than reconstructing it from prose at the typesetting stage. Round-trip validation against the JATS4R recommendation should be part of the production QA step, alongside the schema validation a publisher already runs. Treating contributor roles as first-class structured metadata, governed by the definitions in the research information systems domain of the CASRAI Dictionary, is what allows contribution data to survive intact all the way to the version of record and beyond.

  • Crediting contributions in systematic reviews and meta-analyses

    A systematic review looks, from the outside, like a single coherent document with a tidy list of authors. From the inside it is a small project with a remarkable division of labour: a protocol to register, a search strategy to design and run across multiple databases, thousands of records to screen against eligibility criteria, full texts to retrieve and assess, data to extract twice over, risk-of-bias judgements to make, a synthesis or meta-analysis to compute, and a report to write to an exacting standard. Each of those tasks is a distinct skill, and each is usually done by a different person or pair of people. The conventional author byline flattens all of it. This article looks at how structured reporting through PRISMA and structured contributorship through the CRediT taxonomy together make the real shape of this work visible, and where the vocabulary for it sits in the CRediT extensions domain of the CASRAI Dictionary.

    Why a review is hard to credit fairly

    The difficulty is that the most laborious and methodologically critical parts of a review are precisely the ones that leave no trace in a traditional byline. Screening twenty thousand abstracts in duplicate is exacting, consequential work — get the eligibility judgements wrong and the whole review is compromised — yet it is invisible in author order. The same is true of designing a reproducible search, performing duplicate data extraction, or making risk-of-bias assessments. Meanwhile, the person who conceived the question and the person who drafted the manuscript are easy to recognise. A fair account of a review has to name the unglamorous, high-stakes tasks as clearly as the visible ones.

    PRISMA: reporting the process transparently

    The first half of the answer is methodological transparency. PRISMA — Preferred Reporting Items for Systematic Reviews and Meta-Analyses — is the reporting guideline that tells readers what a review actually did: how the search was constructed, how records moved from identification through screening to inclusion (the familiar flow diagram), how data were extracted, and how studies were appraised and synthesised. PRISMA does not assign credit, but it makes the work auditable. When a review reports its process to the PRISMA standard, the existence and scale of each task — the searching, the screening, the extraction, the appraisal — becomes explicit rather than implied. That visibility is the precondition for crediting it: you cannot recognise a contribution that the reporting has hidden.

    CRediT: naming who did what

    The second half is contributorship. The Contributor Roles Taxonomy provides a controlled vocabulary of contribution types that maps unusually well onto the anatomy of a review. The full set is described in our overview of the CRediT roles, but several are worth singling out for evidence synthesis:

    • Conceptualization — formulating the review question and eligibility criteria.
    • Methodology — designing the search strategy and the synthesis approach, often the work of an information specialist.
    • Investigation — running the searches, screening records and retrieving full texts.
    • Data curation — managing the extracted data, de-duplication and the records that underpin the flow diagram.
    • Formal analysis — the meta-analysis itself, including heterogeneity assessment and any sensitivity analyses.
    • Writing – original draft and Writing – review & editing — producing and refining the manuscript.

    Used together, these roles let a review record that the information specialist designed the search, that two named reviewers screened and extracted in duplicate, and that the statistician ran the synthesis — rather than leaving all of it to be guessed from author order. The wider CRediT taxonomy turns the division of labour into a machine-readable statement attached to the output.
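
    The mapping from review tasks to roles can be sketched as a lookup applied to a simple task log. The task names, the people and the helper roles_from_tasks are all hypothetical, and the task-to-role table follows the list above rather than any official crosswalk.

```python
# Assumption: task labels are illustrative; the role for each follows the list above.
TASK_TO_ROLE = {
    "formulate question": "Conceptualization",
    "design search strategy": "Methodology",
    "run searches": "Investigation",
    "screen records": "Investigation",
    "manage extracted data": "Data curation",
    "run meta-analysis": "Formal analysis",
}


def roles_from_tasks(task_log):
    """Turn a list of (person, task) pairs into {person: sorted list of roles}.

    A task missing from TASK_TO_ROLE raises KeyError, so unmapped work is
    noticed rather than silently dropped.
    """
    roles = {}
    for person, task in task_log:
        roles.setdefault(person, set()).add(TASK_TO_ROLE[task])
    return {person: sorted(held) for person, held in roles.items()}


if __name__ == "__main__":
    log = [
        ("Information specialist", "design search strategy"),
        ("Information specialist", "run searches"),
        ("Reviewer A", "screen records"),
        ("Reviewer B", "screen records"),
        ("Statistician", "run meta-analysis"),
    ]
    for person, held in roles_from_tasks(log).items():
        print(f"{person}: {', '.join(held)}")
```

    Because roles are collected per person, two reviewers who screened in duplicate each receive Investigation once, and a person who both ran searches and screened records is not credited with the same role twice.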

    The role information specialists deserve

    One contribution that systematic reviews chronically under-credit is that of the information specialist or research librarian who designs and validates the search. A poorly constructed search undermines a review more surely than almost any other flaw, and a well-constructed one is a genuine methodological achievement. Recording this work explicitly under Methodology and Investigation — rather than relegating it to an acknowledgement — is one of the clearest practical gains from applying contributorship to evidence synthesis. It names a contribution that is both critical and routinely invisible.

    Crediting duplicate work without double-counting

    Reviews rely on tasks done independently by two people — duplicate screening, duplicate extraction — precisely to reduce error. Contributorship should reflect that both reviewers did the work, which CRediT handles naturally by allowing a role to be assigned to more than one contributor. The honest principle, as ever, is that a role records what a person actually did: both screeners earn the Investigation role because both genuinely screened, not as a courtesy. This is the same standard that applies across all contribution recording — credit follows real work, and is neither inflated for visibility nor withheld for convenience.

    A consistent record across systems

    Systematic reviews increasingly register protocols, deposit search strategies and data, and publish in journals that require both PRISMA reporting and a contributorship statement. For that ecosystem to work, the way a contribution is described has to mean the same thing wherever it appears. That consistency is what the CASRAI Dictionary exists to provide: a stable vocabulary so that a Methodology contribution declared in a protocol registry, a manuscript and an institutional record can be recognised as the same claim. Combined with PRISMA’s transparency about process, structured contribution makes the substantial, distributed work of evidence synthesis legible — crediting the screeners, extractors and search designers whose labour holds a review together, not only the names at the top of the list.