Tag: FAIR4RS

  • How the Software role applies to code-only outputs

    A growing fraction of research output is code: software libraries that implement a method, computational notebooks that demonstrate an analysis, simulation frameworks that enable a body of work, infrastructure tooling that supports a research community. When the output is primarily code, the CRediT Software role carries more weight than its brief definition was written to bear. This post is a practical guide to assigning Software in code-centric contexts.

    The Software role, briefly

    The CRediT Software role is defined as: “Programming, software development; designing computer programs; implementation of the computer code and supporting algorithms; testing of existing code components.” The definition is short and was written with software-as-tool-for-a-paper in mind, not software-as-the-paper.

    For a conventional research paper where someone wrote analysis code that supported the science, Software is straightforward: the person who wrote the analysis code gets the role. For a paper whose primary scholarly contribution is the code itself — a JOSS paper, a software-methods paper, a tool announcement — Software is the dominant role and the brevity of the definition starts to bite.

    What the Software role should cover in a code-only context

    Our recommendation, distilled from the practice of JOSS, the Software Sustainability Institute, the Research Software Engineers community, and several years of CASRAI editorial work, is to read Software in code-only contexts as encompassing the following five sub-activities, all of which should be visible in the contributorship statement even if they share the role.

    Implementation: writing the production code itself. This is the core of Software and is what people most naturally associate with the role.

    Architecture and design: the higher-level decisions about how the code is structured, what its dependencies are, and how its modules interact. In a code-only paper, architecture is part of the intellectual contribution and the architect should be a co-author with the Software role.

    Testing: writing the test suite, including unit tests, integration tests, and regression tests. A code-only paper with a credible test suite has someone who built it.

    Documentation: user-facing documentation, developer-facing documentation, README, examples, tutorials. For code intended for reuse, documentation is part of the deliverable; the documentation contributor gets the Software role.

    Packaging and release: the engineering work of making the code installable and citable, with citations that resolve to a specific release. This covers CI/CD configuration, dependency management, release tagging, and DOI registration. For long-lived code with multiple releases, this is sustained work; for a one-off code release accompanying a paper, it is still non-trivial.

    Each of these is meaningful contribution that the Software role captures. A code-only paper’s CRediT statement should make the distribution of these activities across contributors visible, using the lead/equal/supporting qualifier to express relative magnitude.
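
    As a minimal sketch of what "visible distribution" could look like, the snippet below renders one statement line per contributor from (name, sub-activity, qualifier) tuples and reports which of the five sub-activities nobody has been credited for. The function names, the short sub-activity labels, and the contributors are hypothetical; CRediT itself does not define sub-activities, so this is an illustration of the recommendation above, not a standard format.

```python
from collections import defaultdict

# The five sub-activities this post recommends reading into Software
# (labels are shorthand chosen for this sketch).
SUB_ACTIVITIES = ("implementation", "architecture", "testing",
                  "documentation", "packaging")
# CRediT degree-of-contribution qualifiers.
QUALIFIERS = ("lead", "equal", "supporting")


def software_statement(assignments):
    """Render one line per contributor from (name, sub_activity, qualifier) tuples."""
    by_person = defaultdict(list)
    for name, activity, qualifier in assignments:
        if activity not in SUB_ACTIVITIES:
            raise ValueError(f"unknown sub-activity: {activity}")
        if qualifier not in QUALIFIERS:
            raise ValueError(f"unknown qualifier: {qualifier}")
        by_person[name].append(f"{activity} ({qualifier})")
    return [f"{name}: Software [{'; '.join(parts)}]"
            for name, parts in by_person.items()]


def uncovered(assignments):
    """List the sub-activities that no contributor has been credited for."""
    covered = {activity for _, activity, _ in assignments}
    return [a for a in SUB_ACTIVITIES if a not in covered]


# Hypothetical contributors, for illustration only.
example = [
    ("A. Author", "implementation", "lead"),
    ("A. Author", "architecture", "lead"),
    ("B. Builder", "testing", "lead"),
    ("B. Builder", "packaging", "supporting"),
]

for line in software_statement(example):
    print(line)
print("Not yet credited:", uncovered(example))
```

    Running `uncovered` before submission is a cheap prompt to ask who actually wrote the documentation, rather than a rule that every sub-activity must have a named contributor.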

    Where Software overlaps with other roles

    Three overlaps deserve attention.

    First, Software versus Methodology. If the code implements a novel method, the method itself is a Methodology contribution; the implementation is a Software contribution. The same person often discharges both, and the contributorship statement should assign both roles to them. The error to avoid is conflating the two: assigning Software while omitting Methodology under-represents the intellectual contribution.

    Second, Software versus Validation. Writing tests is Software (per the definition); validating the code against reference implementations or independent data is Validation. The distinction is genuine: tests verify that the code does what the developer intended; validation verifies that the code does what is scientifically correct. Both belong in a code-only paper’s contributorship.
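
    The test/validation distinction can be sketched in a few lines. Below, `sample_variance` is a hypothetical function standing in for the research code; the unit test pins the behaviour its developer intended (Software), while the validation routine checks agreement with an independent implementation (Validation). Using the standard library's `statistics.variance` as the "reference implementation" is an assumption made to keep the example self-contained.

```python
import math
import statistics


def sample_variance(values):
    """Hypothetical function under study: unbiased sample variance."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def test_known_case():
    # Software: a unit test checks the code does what the developer intended.
    assert sample_variance([1.0, 2.0, 3.0]) == 1.0


def validate_against_reference(datasets, rel_tol=1e-9):
    # Validation: the code agrees with an independent reference implementation
    # (statistics.variance stands in for one here) on every dataset supplied.
    return all(
        math.isclose(sample_variance(d), statistics.variance(d), rel_tol=rel_tol)
        for d in datasets
    )
```

    The two checks can be written by different people, which is exactly why the contributorship statement benefits from naming both roles.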

    Third, Software versus Writing – original draft. The README, the developer documentation, and the API reference are documentation, captured under Software. The paper itself, including its method description and its discussion of design choices, is captured under Writing – original draft. The boundary is the publication artefact: anything in the paper is Writing; anything in the code repository is Software.

    Cross-referencing with CITATION.cff

    The CITATION.cff convention, increasingly standard in scientific software repositories, complements CRediT with a machine-readable, per-release record. As of version 1.2.0, the core CFF schema provides authors and contact entries (persons or entities, with ORCID iDs and affiliations) but no contribution-type field of its own; integrators have extended it with CRediT-aligned vocabularies. The recommended pattern for a code-only paper is to maintain both: a CRediT statement in the paper (for the paper-level contributorship) and a CITATION.cff in the repository (for the per-version, per-component contributorship that CRediT cannot express).

    The two should be consistent. A contributor named in the paper with Software role should appear in the CITATION.cff with at least equivalent contribution; a contributor named in the CITATION.cff but not in the paper should be acknowledged in the paper’s acknowledgements section. The CASRAI CITATION.cff entry walks through the integration patterns.
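
    A consistency check along these lines can be sketched as follows. It assumes the paper's CRediT statement has already been transcribed into a dictionary and the CITATION.cff `authors` list has already been parsed (for example by a YAML library) into a list of dictionaries using the CFF keys `given-names`, `family-names`, and `name`. The function names and the people are hypothetical, and matching on display names is a simplification; matching on ORCID iDs would be more robust where they are available.

```python
def cff_display_name(author):
    """Build a display name from a parsed CFF author entry (person or entity)."""
    if author.get("name"):  # CFF entities carry a single 'name' field
        return author["name"]
    given = author.get("given-names", "")
    family = author.get("family-names", "")
    return f"{given} {family}".strip()


def consistency_report(paper_credit, cff_authors):
    """Compare a paper's CRediT statement with the authors in a CITATION.cff.

    paper_credit: dict mapping contributor name -> set of CRediT role names.
    cff_authors:  list of dicts as parsed from the CFF `authors` key.
    """
    cff_names = {cff_display_name(a) for a in cff_authors}
    software_names = {name for name, roles in paper_credit.items()
                      if "Software" in roles}
    return {
        # Software contributors in the paper who are absent from the CFF.
        "missing_from_cff": sorted(software_names - cff_names),
        # CFF authors absent from the paper: candidates for acknowledgement.
        "to_acknowledge": sorted(cff_names - set(paper_credit)),
    }


# Hypothetical data, for illustration only.
paper = {
    "Ada Example": {"Software", "Methodology"},
    "Ben Sample": {"Validation"},
    "Cy Test": {"Software"},
}
cff = [
    {"given-names": "Ada", "family-names": "Example"},
    {"given-names": "Dee", "family-names": "Maintainer"},
]
print(consistency_report(paper, cff))
```

    The same report, run against the current CITATION.cff rather than the one frozen at publication, gives a quick view of how far the repository's contributor record has drifted from the paper.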

    The maintenance question

    An unresolved aspect of Software in code-only contexts is how to credit maintenance over time. A research software package may have a paper at first release, with a CRediT statement reflecting the founding contributors. Five years and several major versions later, the package has new maintainers, new contributors, and a substantially different code base. The original paper’s CRediT statement is increasingly out of date.

    The current pragmatic answer is: the paper’s CRediT statement freezes at publication; the CITATION.cff in the repository tracks current contributorship; downstream citation should reference both, with the paper as the publication-of-record and the CFF as the current-contributor record. This works but is imperfect. The Software Citation Working Group has been chewing on whether per-version CRediT statements, deposited to Crossref via the related-identifier mechanism, would be a cleaner answer; the proposal is technically viable but not yet a community consensus.

    What journals should do

    For journals publishing software papers, the recommended editorial practices are: require CRediT with qualifiers in the paper; require a CITATION.cff in the linked repository; verify that the two are consistent; for major software packages, accept and publish supplementary contributor records that go beyond the byline.

    JOSS is the most mature reference point here, and most other software-paper venues are moving toward similar practices. The CASRAI CRediT for software papers guide is updated quarterly with current practice.

    What authors should do

    For authors of code-only papers, four practical steps. First, distribute the Software role across the five sub-activities visibly, using the qualifier. Second, assign Methodology when the code implements a novel method. Third, maintain the CITATION.cff in the repository in parallel with the paper’s CRediT statement. Fourth, plan for the maintenance-credit question: who will maintain the code, how their contribution will be recognised over time, where the credit will live.

    The CRediT taxonomy can support code-only outputs well, with attention. The work is in using the Software role thoughtfully, in interlocking it with Methodology and Writing where appropriate, and in maintaining the parallel record in the repository.

  • Reproducibility frameworks in practice: TOP, ARRIVE, CONSORT, PRISMA

    The reporting-guideline ecosystem has grown to nearly 600 distinct guidelines tracked by the EQUATOR Network. For an author or editor staring at this in 2026, the question is not which guideline to applaud but which to actually use, when, and at what depth. This post walks through the four frameworks that anchor the field, the FAIR4RS guidelines for research software, and the registered-report turn that is reshaping pre-publication reproducibility commitments.

    The four anchors

    TOP Guidelines

    The Transparency and Openness Promotion (TOP) Guidelines, developed at the Center for Open Science by Brian Nosek and colleagues, are a journal-policy framework rather than a per-paper checklist. TOP defines eight standards (citation, data transparency, analytic methods transparency, research materials transparency, design and analysis transparency, study preregistration, analysis-plan preregistration, replication) and three levels of stringency at which a journal can adopt each. A journal signing onto TOP commits to a profile of standard-by-standard adoption.
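
    A journal's TOP profile is, in effect, a small table: one level per standard. The sketch below represents it as a dictionary and validates it. Treating level 0 as "standard not adopted" is an assumption added for this example, and the summed level is only an illustrative summary, not the official TOP Factor calculation.

```python
# The eight TOP standards as listed above.
TOP_STANDARDS = (
    "citation",
    "data transparency",
    "analytic methods transparency",
    "research materials transparency",
    "design and analysis transparency",
    "study preregistration",
    "analysis-plan preregistration",
    "replication",
)
# Levels 1-3 are the three stringency levels; 0 (assumed here) means not adopted.
VALID_LEVELS = (0, 1, 2, 3)


def validate_profile(profile):
    """Check that a profile names every standard exactly and uses valid levels."""
    missing = [s for s in TOP_STANDARDS if s not in profile]
    unknown = [s for s in profile if s not in TOP_STANDARDS]
    if missing or unknown:
        raise ValueError(f"missing: {missing}; unknown: {unknown}")
    for standard, level in profile.items():
        if level not in VALID_LEVELS:
            raise ValueError(f"invalid level {level!r} for {standard}")
    return profile


def summarize(profile):
    """Count adopted standards and sum their levels (illustrative only)."""
    validate_profile(profile)
    return {
        "adopted": sum(1 for level in profile.values() if level > 0),
        "total_level": sum(profile.values()),
    }


# Hypothetical journal profile.
profile = {standard: 0 for standard in TOP_STANDARDS}
profile["citation"] = 1
profile["data transparency"] = 2
print(summarize(profile))
```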

    TOP’s contribution is structural: it gave editors a vocabulary to discuss reproducibility policies and a benchmark against which their journals could be assessed. By 2026 the TOP Factor (a score of journals’ policies against the TOP standards) is widely used to compare journal reproducibility commitments, alongside the more famous and less informative Journal Impact Factor. The CASRAI reproducibility standards page tracks the current TOP adoption ledger.

    ARRIVE 2.0

    The ARRIVE guidelines (Animal Research: Reporting of In Vivo Experiments), revised in 2020 from the original 2010 version, are the canonical reporting standard for animal research. ARRIVE 2.0 introduced the Essential 10 (the must-report items) and the Recommended Set (the should-report items), which made the guideline more usable for both authors and reviewers.

    ARRIVE adoption in 2026 is high in funder mandates (NIH, MRC, NC3Rs) but uneven in journal enforcement. Retrospective audits keep finding that even papers submitted under an ARRIVE requirement miss core items (randomisation method, blinding, sample-size justification). The lesson is that requiring a guideline at submission is not the same as enforcing it at peer review.

    CONSORT 2010 and its extensions

    The CONSORT 2010 statement is the reporting standard for randomised controlled trials and the most-cited reporting guideline in scholarly publishing. A CONSORT-compliant RCT report covers the title and abstract, methods (design, participants, interventions, outcomes, sample size, randomisation, blinding, statistical methods), results (participant flow, baseline data, primary and secondary outcomes, ancillary analyses, harms), and discussion (limitations, generalisability, interpretation). The CONSORT flow diagram (enrolment, allocation, follow-up, analysis) is itself a reporting artefact that has done more for trial transparency than most policy documents.

    CONSORT 2025, published in April 2025, updates the 2010 statement; its changes include a new open-science section and an item on patient and public involvement. Many journal instructions and the existing extensions (Cluster, Pragmatic, Non-pharmacological, Harms, Patient-reported outcomes, Outcomes, AI) were written against CONSORT 2010 and are still catching up. Authors of any RCT should consult the relevant extension as well as the core standard, and check which version the target journal requires.

    PRISMA 2020

    The PRISMA 2020 statement is the reporting standard for systematic reviews and meta-analyses. The 2020 revision modernised the 2009 original to reflect changes in search-and-screening practice (preprint searches, GitHub/OSF searches, ML-assisted screening), risk-of-bias assessment (RoB 2 for trials, ROBINS-I for non-randomised studies, AMSTAR 2 for review quality), and reporting formats (the PRISMA-S extension for search reporting, PRISMA-NMA for network meta-analyses).

    PRISMA’s role in the systematic-review economy is decisive: journals routinely refuse review submissions that do not include a PRISMA flow diagram and checklist. The remaining failure mode is checklist-completion-without-substance, where a paper ticks the boxes but the underlying review work is shallow.

    Why these four anchors and not others

    The four cover the bulk of submission volume in clinical and life-science journals: RCTs (CONSORT), systematic reviews (PRISMA), animal studies (ARRIVE), and the meta-question of journal policy (TOP). For observational studies, STROBE is the analogue of CONSORT; for diagnostic accuracy studies, STARD; for case reports, CARE; for qualitative research, SRQR or COREQ; for AI clinical-prediction models, TRIPOD+AI and PROBAST+AI. The EQUATOR Network’s searchable database remains the canonical entry point.
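
    The design-to-guideline mapping in this section is essentially a lookup table. The sketch below encodes only the pairings named in this post; the key strings and the `guidelines_for` helper are hypothetical, and an empty result is a cue to search the EQUATOR database rather than a statement that no guideline exists.

```python
# Study design -> reporting guideline(s), limited to pairings named in this post.
GUIDELINES = {
    "randomised controlled trial": ["CONSORT"],
    "cluster randomised controlled trial": ["CONSORT", "CONSORT Cluster extension"],
    "systematic review": ["PRISMA"],
    "animal study": ["ARRIVE"],
    "observational study": ["STROBE"],
    "diagnostic accuracy study": ["STARD"],
    "case report": ["CARE"],
    "qualitative research": ["SRQR", "COREQ"],
}


def guidelines_for(design):
    """Return the guideline names for a study design, or [] if not in the table."""
    key = " ".join(design.lower().split())  # normalise case and whitespace
    return list(GUIDELINES.get(key, []))


for design in ("Systematic review", "Cluster randomised controlled trial", "Cohort study"):
    found = guidelines_for(design)
    print(design, "->", ", ".join(found) if found else "search the EQUATOR database")
```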

    Computational reproducibility

    The reporting-guideline tradition was built around clinical and life-science studies. Computational reproducibility (the authors’ code, data, and dependencies, run on someone else’s computer, give the same answer) was historically not in scope and is now belatedly the focus of much of the methodological community’s attention.

    The 2024-2025 convergence is around three pillars. First, data deposition in a FAIR-compliant repository with a DOI, with explicit licensing. Second, code deposition with a DOI (typically via Zenodo with a Git-tagged release), with explicit dependencies (environment files, container image hashes, or both). Third, capture of the computational environment, either in a container (Docker, Singularity/Apptainer) or in a more lightweight pinned manifest (R’s renv, Python’s pip-tools, Julia’s Project.toml and Manifest.toml).
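
    For the "explicit dependencies" part of the second and third pillars, a rough sketch using only the Python standard library is shown below. It records the interpreter version and the installed distributions as `name==version` pins and hashes the result so the manifest can be quoted in a paper or a deposit. The function names and manifest layout are assumptions for illustration; this is not a replacement for renv, pip-tools, or a container image.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def environment_manifest():
    """Record the interpreter and installed distributions as name==version pins."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with unreadable metadata
            pins.add(f"{name}=={dist.version}")
    return {
        "python": platform.python_version(),
        "platform": sys.platform,
        "packages": sorted(pins),
    }


def manifest_digest(manifest):
    """SHA-256 of a canonical JSON encoding, so equal manifests hash equally."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    manifest = environment_manifest()
    print(json.dumps(manifest, indent=2))
    print("sha256:", manifest_digest(manifest))
```

    Depositing the printed manifest alongside the tagged release gives readers a concrete record of what the authors ran, even when no container is provided.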

    The FAIR4RS Principles, finalised by the RDA working group in 2022 and now widely cited, extend the FAIR data principles to research software. Software should be Findable (DOI, descriptive metadata), Accessible (open repository where possible), Interoperable (using standards), and Reusable (with a clear licence, documentation, and provenance). FAIR4RS is being integrated into funder data-management-plan requirements in 2026; UKRI in the UK, the EU’s Horizon Europe, and several US funders now ask for software-management plans as a distinct artefact from data-management plans.

    Pre-registration and registered reports

    Preregistration (committing to your hypotheses and analysis plan before seeing the data) has moved from a niche reproducibility-community practice to a mainstream expectation in psychology, parts of medicine, and increasingly in economics and political science. The Center for Open Science’s preregistration tools have crossed 200,000 registered studies; ClinicalTrials.gov and the WHO ICTRP serve as the trial registries.

    The more interesting development is Registered Reports, a journal format in which a study protocol is peer-reviewed before data collection. If accepted at this Stage 1 review, the journal commits to publishing the Stage 2 manuscript regardless of whether the results are positive, negative, or null. Over 300 journals offer Registered Reports as of 2026, including several major medical journals. The empirical evidence is clear: Registered Reports show much lower positive-results rates than conventional submissions in the same fields, consistent with what we would expect if the conventional system suffers from publication bias.

    How to use this in practice

    For an author submitting a paper, the workflow is:

    1. Identify your study design and find the matching EQUATOR-listed reporting guideline (or guidelines, if multiple apply, e.g., a cluster RCT might use CONSORT plus the Cluster extension).
    2. Use the guideline’s checklist while drafting, not as a checkbox exercise at submission. The checklists are designed to prompt completeness.
    3. For computational components, deposit data and code with DOIs, declare dependencies, and consider a container if your environment is non-trivial.
    4. If your design supports it, consider preregistration or a Registered Report. The discipline of pre-specifying is itself the reproducibility intervention; the registration is the audit trail.
    5. In the methods, explicitly cite the guideline(s) you followed. Cite the deposited data and code with their DOIs in the references, not just in a parenthetical.

    Where this all goes

    The next wave of reporting-guideline work is around AI clinical-prediction reporting (TRIPOD+AI, published in 2024; CLAIM for AI imaging studies), real-world-evidence studies (RECORD-PE, STaRT-RWE), and qualitative meta-synthesis (ENTREQ). The structural question is whether the proliferation is helping or hurting. We think the answer is that the per-method guidelines are valuable but the cross-cutting transparency standards (TOP, FAIR, FAIR4RS, the registered-report meta-format) are doing the heavier lifting. Editors who pick a TOP profile and enforce it across submissions get more reproducibility uplift than editors who require a guideline checklist and then ignore the contents.

    References

    EQUATOR Network, Reporting Guidelines for Health Research (continuously updated).
    Nosek et al., Promoting an open research culture (Science, 2015, introducing TOP).
    Page et al., The PRISMA 2020 statement (BMJ, 2021).
    Percie du Sert et al., The ARRIVE guidelines 2.0 (PLOS Biology, 2020).
    Chambers, The Seven Deadly Sins of Psychology (Princeton, 2017, on Registered Reports).
    RDA FAIR4RS Working Group, FAIR Principles for Research Software (2022).