Tag: EQUATOR

  • Reproducibility frameworks in practice: TOP, ARRIVE, CONSORT, PRISMA

    The reporting-guideline ecosystem has grown to nearly 600 distinct guidelines tracked by the EQUATOR Network. For an author or editor staring at this in 2026, the question is not which guideline to applaud but which to actually use, when, and at what depth. This post walks through the four frameworks that anchor the field, the FAIR4RS guidelines for research software, and the registered-report turn that is reshaping pre-publication reproducibility commitments.

    The four anchors

    TOP Guidelines

    The Transparency and Openness Promotion (TOP) Guidelines, developed at the Center for Open Science by Brian Nosek and colleagues, are the journal-policy framework rather than the per-paper checklist. TOP defines eight standards (citation, data transparency, analytic methods transparency, research materials transparency, design and analysis transparency, study preregistration, analysis-plan preregistration, replication) and three levels of stringency at which a journal can adopt each. A journal signing onto TOP commits to a profile of standard-by-standard adoption.

    TOP’s contribution is structural: it gave editors a vocabulary to discuss reproducibility policies and a benchmark against which their journals could be assessed. By 2026 the TOP Factor (a score of journals’ policies against the TOP standards) is widely used to compare journal reproducibility commitments, alongside the more famous and less informative Journal Impact Factor. The CASRAI reproducibility standards page tracks the current TOP adoption ledger.
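
    To make the idea of a standard-by-standard profile concrete, the sketch below encodes a journal's adoption level for each of the eight standards listed above and sums them into a simplified score. The journal profile, the use of 0 for "not adopted", and the plain unweighted sum are assumptions for illustration only; the published TOP Factor rubric should be consulted for the actual scoring.

```python
# Simplified sketch of a TOP adoption profile (illustrative, not the official rubric).
TOP_STANDARDS = [
    "citation",
    "data transparency",
    "analytic methods transparency",
    "research materials transparency",
    "design and analysis transparency",
    "study preregistration",
    "analysis-plan preregistration",
    "replication",
]


def profile_score(profile: dict[str, int]) -> int:
    """Sum adoption levels (0 = not adopted, 1-3 = TOP levels) across the standards."""
    unknown = set(profile) - set(TOP_STANDARDS)
    if unknown:
        raise ValueError(f"unknown standards: {sorted(unknown)}")
    for name, level in profile.items():
        if level not in (0, 1, 2, 3):
            raise ValueError(f"level for {name!r} must be 0-3, got {level}")
    # Standards missing from the profile count as level 0.
    return sum(profile.get(name, 0) for name in TOP_STANDARDS)


# Hypothetical journal profile, invented for this example.
example_journal = {
    "citation": 1,
    "data transparency": 2,
    "analytic methods transparency": 2,
    "study preregistration": 1,
}
print(profile_score(example_journal))  # 6
```

    Keeping the profile as a mapping rather than a single number preserves the point of TOP: two journals with the same total can have made very different commitments.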

    ARRIVE 2.0

    The ARRIVE guidelines (Animal Research: Reporting of In Vivo Experiments), revised in 2020 from the original 2010 version, are the canonical reporting standard for animal research. ARRIVE 2.0 introduced the Essential 10 (the must-report items) and the Recommended Set (the should-report items), which made the guideline more usable for both authors and reviewers.

    ARRIVE adoption in 2026 is high in funder mandates (NIH, MRC, NC3Rs) but uneven in journal enforcement. Retrospective audits keep finding that even papers published under an ARRIVE requirement miss core items (randomisation method, blinding, sample-size justification). The lesson is that requiring a guideline at submission is not the same as enforcing it at peer review.

    CONSORT 2010 and its extensions

    The CONSORT 2010 statement is the reporting standard for randomised controlled trials and the most-cited reporting guideline in scholarly publishing. A CONSORT-compliant RCT report covers the title and abstract, methods (design, participants, interventions, outcomes, sample size, randomisation, blinding, statistical methods), results (participant flow, baseline data, primary and secondary outcomes, ancillary analyses, harms), and discussion (limitations, generalisability, interpretation). The CONSORT flow diagram (enrolment, allocation, follow-up, analysis) is itself a reporting artefact that has done more for trial transparency than most policy documents.

    The 2025 revision of CONSORT (CONSORT 2025) is being finalised and is expected to integrate explicit reporting requirements for adaptive trial designs, machine-learning-derived endpoints, and patient-public involvement. The current standard is 2010 with several extensions (Cluster, Pragmatic, Non-pharmacological, Harms, Patient-reported outcomes, Outcomes, AI). Authors of any RCT should consult the relevant extension as well as the core standard.

    PRISMA 2020

    The PRISMA 2020 statement is the reporting standard for systematic reviews and meta-analyses. The 2020 revision modernised the 2009 original to reflect changes in search-and-screening practice (preprint searches, GitHub/OSF searches, ML-assisted screening), risk-of-bias assessment (RoB 2 for trials, ROBINS-I for non-randomised studies, AMSTAR-2 for review quality), and reporting formats (the PRISMA-S extension for search reporting, PRISMA-NMA for network meta-analyses).

    PRISMA’s role in systematic-review publishing is decisive: journals routinely refuse review submissions that do not include a PRISMA flow diagram and checklist. The remaining failure mode is checklist completion without substance, where a paper ticks the boxes but the underlying review work is shallow.

    Why these four anchors and not others

    The four cover the bulk of submission volume in clinical and life-science journals: RCTs (CONSORT), systematic reviews (PRISMA), animal studies (ARRIVE), and the meta-question of journal policy (TOP). For observational studies, STROBE is the analogue of CONSORT; for diagnostic accuracy studies, STARD; for case reports, CARE; for qualitative research, SRQR or COREQ; for AI-based clinical prediction models, TRIPOD+AI and PROBAST+AI. The EQUATOR Network’s searchable database remains the canonical entry point.

    Computational reproducibility

    The reporting-guideline tradition was built around clinical and life-science studies. Computational reproducibility (the same code, data, and dependencies, run on someone else’s computer, give the same answer) was historically out of scope and is now, belatedly, the focus of much of the methodological community’s attention.

    Practice in 2024-2025 converged on three pillars. First, data deposition in a FAIR-compliant repository with a DOI and explicit licensing. Second, code deposition with a DOI (typically via Zenodo with a Git-tagged release), with explicit dependencies (environment files, container image hashes, or both). Third, a reproducible computational environment, either via a container (Docker, Singularity/Apptainer) or via a more lightweight pinned manifest (R’s renv, Python’s pip-tools, Julia’s Project.toml and Manifest.toml).
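
    As a minimal sketch of what "explicit dependencies" and a verifiable deposit can look like in Python, the function below records a SHA-256 checksum for every file in a deposit directory together with the installed versions of named packages. The function names and the manifest layout are assumptions made for this example, not a format required by Zenodo or any repository, and a manifest like this complements rather than replaces a lock file or container image.

```python
import hashlib
import sys
from importlib import metadata
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(deposit_dir: Path, packages: list[str]) -> dict:
    """Record file checksums and installed package versions for a deposit.

    The manifest layout is illustrative, not a repository standard.
    """
    files = {
        p.relative_to(deposit_dir).as_posix(): sha256_of(p)
        for p in sorted(deposit_dir.rglob("*"))
        if p.is_file()
    }
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": sys.version.split()[0],
        "packages": versions,
        "files": files,
    }
```

    Calling build_manifest(Path("deposit"), ["numpy", "pandas"]) and writing the result with json.dump gives a small file that can be deposited next to the data and code, so a later reader can check that what they downloaded is what was analysed.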

    The FAIR4RS Principles, finalised by the RDA working group in 2022 and now widely cited, extend the FAIR data principles to research software. Software should be Findable (DOI, descriptive metadata), Accessible (open repository where possible), Interoperable (using standards), and Reusable (with a clear licence, documentation, and provenance). FAIR4RS is being integrated into funder data-management-plan requirements in 2026; UKRI in the UK, Horizon Europe in the EU, and several US funders now ask for software-management plans as an artefact distinct from data-management plans.

    Pre-registration and registered reports

    Preregistration (committing to your hypotheses and analysis plan before seeing the data) has moved from a niche reproducibility-community practice to a mainstream expectation in psychology, parts of medicine, and increasingly in economics and political science. The Center for Open Science’s preregistration tools have crossed 200,000 registered studies; ClinicalTrials.gov and the WHO ICTRP serve as the main registries for clinical trials.

    The more interesting development is Registered Reports, a journal format in which a study protocol is peer-reviewed before data collection. If accepted at this Stage 1 review, the journal commits to publishing the Stage 2 manuscript regardless of whether the results are positive, negative, or null. Over 300 journals offer Registered Reports as of 2026, including several major medical journals. The empirical evidence is clear: Registered Reports show much lower positive-results rates than conventional submissions in the same fields, consistent with what we would expect if the conventional system suffers from publication bias.

    How to use this in practice

    For an author submitting a paper, the workflow is:

    1. Identify your study design and find the matching EQUATOR-listed reporting guideline (or guidelines, if multiple apply, e.g., a cluster RCT might use CONSORT plus the Cluster extension).
    2. Use the guideline’s checklist while drafting, not as a checkbox exercise at submission. The checklists are designed to prompt completeness.
    3. For computational components, deposit data and code with DOIs, declare dependencies, and consider a container if your environment is non-trivial.
    4. If your design supports it, consider preregistration or a Registered Report. The discipline of pre-specifying is itself the reproducibility intervention; the registration is the audit trail.
    5. In the methods, explicitly cite the guideline(s) you followed. Cite the deposited data and code with their DOIs in the references, not just in a parenthetical.
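
    Steps 1 and 5 above can be sketched as a small lookup plus a generated methods sentence. The mapping uses only guidelines named in this post; the function names are assumptions for illustration, and the DOIs in the usage line are placeholders, not real identifiers. The EQUATOR database remains the authoritative source for matching a design to a guideline.

```python
# Study design -> reporting guideline(s), limited to guidelines named in this post.
GUIDELINES = {
    "randomised controlled trial": ["CONSORT"],
    "cluster randomised trial": ["CONSORT", "CONSORT Cluster extension"],
    "systematic review": ["PRISMA"],
    "animal study": ["ARRIVE"],
    "observational study": ["STROBE"],
    "diagnostic accuracy study": ["STARD"],
    "case report": ["CARE"],
}


def guidelines_for(design: str) -> list[str]:
    """Return the guideline(s) for a study design, ignoring case and outer spaces."""
    key = design.strip().lower()
    if key not in GUIDELINES:
        raise KeyError(f"no guideline mapped for {design!r}; search the EQUATOR database")
    return list(GUIDELINES[key])


def methods_sentence(design: str, data_doi: str, code_doi: str) -> str:
    """Draft a methods statement citing the guideline(s) and the deposited artefacts."""
    names = " and ".join(guidelines_for(design))
    return (
        f"Reporting follows {names}. "
        f"Data: https://doi.org/{data_doi}. Code: https://doi.org/{code_doi}."
    )


# Placeholder DOIs, invented for this example.
print(methods_sentence("cluster randomised trial", "10.0000/example-data", "10.0000/example-code"))
```

    The generated sentence is only a starting point: the data and code still need entries in the reference list, as step 5 says.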

    Where this all goes

    The next wave of reporting-guideline work is around AI-based clinical prediction reporting (TRIPOD+AI, finalised in 2024; CLAIM for AI imaging studies), real-world-evidence studies (RECORD-PE, STaRT-RWE), and qualitative meta-synthesis (ENTREQ). The structural question is whether the proliferation is helping or hurting. We think the answer is that the per-method guidelines are valuable but the cross-cutting transparency standards (TOP, FAIR, FAIR4RS, the registered-report meta-format) are doing the heavier lifting. Editors who pick a TOP profile and enforce it across submissions get more reproducibility uplift than editors who require a guideline checklist and then ignore the contents.


    References

    EQUATOR Network, Reporting Guidelines for Health Research (continuously updated).
    Nosek et al., Promoting an open research culture (Science, 2015, introducing TOP).
    Page et al., The PRISMA 2020 statement (BMJ, 2021).
    Percie du Sert et al., The ARRIVE guidelines 2.0 (PLOS Biology, 2020).
    Chambers, The Seven Deadly Sins of Psychology (Princeton, 2017, on Registered Reports).
    RDA FAIR4RS Working Group, FAIR Principles for Research Software (2022).

  • PRISMA 2026: the next-generation systematic-review reporting standard

    The PRISMA statement has been the dominant reporting standard for systematic reviews and meta-analyses since 2009, with its most recent major revision in 2020. The 2026 update, drafted through 2024 and 2025 and finalised at the end of 2025, adds machine-readability, structured handling of AI-assisted screening, and explicit support for living systematic reviews. This post walks through what changed, why it matters, and what reviewers and journals should do to update their practices.

    What PRISMA 2020 left unresolved

    PRISMA 2020 added much-needed clarity to several persistent ambiguities — the role of registries and protocols, transparent reporting of search strategies, structured presentation of risk-of-bias assessment — but it left several gaps that became more pressing through 2021-2024.

    First, AI-assisted screening. By 2023, a substantial fraction of new systematic reviews used machine-learning tools for title-and-abstract screening (Abstrackr, Rayyan’s ML mode, Covidence’s automation, Distiller’s classification, bespoke models). PRISMA 2020 had no place to report this; reviewers either omitted it, mentioned it in passing, or invented their own reporting conventions. The result was a reproducibility gap: a reader could not tell whether a review had used AI to filter studies, what the AI’s parameters were, or how human checking was integrated.

    Second, living reviews. The conventional systematic review is a snapshot: search to a date, screen, extract, synthesise, publish. A living systematic review is continuously updated as new evidence emerges. PRISMA 2020’s reporting conventions assumed a snapshot model; reviewers running living reviews had to adapt the checklist by hand.

    Third, structured machine-readability. PRISMA 2020 specified what to report but not how to deposit it in structured form. The result was that systematic-review metadata lived as free text in PDFs, unreachable by tools that wanted to aggregate methodological features across reviews.

    What PRISMA 2026 changes

    The 2026 revision is layered: the 27-item core checklist remains, with three items extended and four new items added. The extensions are backward-compatible — a review that satisfies PRISMA 2020 also satisfies the unchanged items of PRISMA 2026 — and the new items are clearly flagged. The full statement is being published in the usual cluster of journals (BMJ, PLOS Medicine, Journal of Clinical Epidemiology, Systematic Reviews) with simultaneous open-access release.

    The AI-screening item

    The new item 8b requires reviewers who used AI or machine-learning tools in study identification, screening, or data extraction to report: the tool used (name and version), its training data or pre-training source, the threshold for AI-flagged inclusion versus human review, the human-checking strategy (full re-screen, sample re-screen, only AI-rejected items), and the integration into the overall workflow with quantitative reporting of agreement rates.
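
    One way to keep these elements together during a review is a small structured record plus an agreement statistic computed from paired AI and human decisions. The sketch below is an assumption-laden illustration: the class and field names are invented here, and Cohen's kappa is one common choice of agreement measure, not one the item is stated to prescribe.

```python
from dataclasses import dataclass


@dataclass
class AIScreeningReport:
    """Fields mirror the item 8b elements listed above; the names are illustrative."""
    tool_name: str
    tool_version: str
    training_source: str
    inclusion_threshold: float
    human_check_strategy: str
    agreement_kappa: float


def cohen_kappa(ai: list[bool], human: list[bool]) -> float:
    """Cohen's kappa for two raters making include/exclude calls on the same records."""
    if len(ai) != len(human) or not ai:
        raise ValueError("need two equal-length, non-empty decision lists")
    n = len(ai)
    observed = sum(a == h for a, h in zip(ai, human)) / n
    p_ai, p_human = sum(ai) / n, sum(human) / n
    # Agreement expected by chance, from each rater's marginal include rate.
    expected = p_ai * p_human + (1 - p_ai) * (1 - p_human)
    if expected == 1.0:
        return 1.0  # both raters gave the same single label to every record
    return (observed - expected) / (1 - expected)


# Hypothetical decisions on a ten-record re-screened sample (True = include).
ai_calls = [True, True, False, False, True, False, False, False, True, False]
human_calls = [True, False, False, False, True, False, False, True, True, False]

report = AIScreeningReport(
    tool_name="ExampleScreener",  # hypothetical tool, not a real product
    tool_version="1.0",
    training_source="prior reviews in the same topic area",
    inclusion_threshold=0.5,
    human_check_strategy="sample re-screen",
    agreement_kappa=round(cohen_kappa(ai_calls, human_calls), 3),
)
print(report.agreement_kappa)  # 0.583
```

    Raw percentage agreement (0.8 in this example) overstates performance when most records are excluded, which is why a chance-corrected measure is worth reporting alongside it.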

    This is non-trivial reporting and will catch many reviews unprepared. The recommendation from the working group is that AI-screening parameters should be set in the protocol (registered on PROSPERO or an equivalent registry) before screening begins, and that the reporting follow the protocol. A review that decides post hoc to use AI screening without protocol support is on weaker ground for both methodology and reporting.

    The living-review checklist

    PRISMA 2026 adds a parallel reporting checklist for living systematic reviews: items covering the update frequency, the trigger for re-running the search, the handling of new evidence that changes pooled estimates, and the versioning of the published review. The checklist is meant to be applied at each update, with structured logging of what changed between versions.
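
    The "structured logging of what changed between versions" can be as simple as a diff between two snapshots of the review. The following is a minimal sketch; the class, the field names, and the study identifiers are hypothetical, and a real living review would log more (search strings, risk-of-bias changes, and so on).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReviewVersion:
    """One published snapshot of a living review (illustrative fields only)."""
    version: str
    search_date: date
    included_studies: frozenset[str]
    pooled_estimate: float


def change_log(previous: ReviewVersion, current: ReviewVersion) -> dict:
    """Summarise what changed between two versions of a living review."""
    return {
        "from_version": previous.version,
        "to_version": current.version,
        "search_date": current.search_date.isoformat(),
        "studies_added": sorted(current.included_studies - previous.included_studies),
        "studies_removed": sorted(previous.included_studies - current.included_studies),
        "pooled_estimate_change": round(
            current.pooled_estimate - previous.pooled_estimate, 4
        ),
    }


# Hypothetical versions with made-up study identifiers and estimates.
v1 = ReviewVersion("1.0", date(2025, 6, 1), frozenset({"study-a", "study-b"}), 0.82)
v2 = ReviewVersion(
    "1.1", date(2025, 12, 1), frozenset({"study-a", "study-b", "study-c"}), 0.78
)
print(change_log(v1, v2)["studies_added"])  # ['study-c']
```

    Emitting one such entry per update gives editors and readers an audit trail of how the evidence base and the pooled estimate moved over time.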

    For journals publishing living reviews, the implication is that they need an editorial process that supports versioned publication. The BMJ, Cochrane Library, and several others have living-review streams; many others do not, and PRISMA 2026’s existence will push more journals toward supporting the format.

    The machine-readable flow diagram

    The PRISMA flow diagram has been the visual centrepiece of every systematic review since 2009. PRISMA 2026 introduces a structured JSON representation alongside the visual diagram, with the diagram regeneratable from the JSON. The JSON captures: records identified per source, records duplicate-removed, records screened, records excluded with reasons categorised, reports retrieved, reports assessed for eligibility, reports excluded with reasons categorised, studies included, reports of those studies included.
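
    A sketch of what such a representation might look like, with basic arithmetic checks, follows. The key names and all counts are assumptions invented for this example, not the published schema; the point is only that a structured flow can be validated automatically in a way a picture cannot.

```python
import json

# Hypothetical flow data; key names and counts are illustrative, not the official schema.
flow = {
    "records_identified": {"MEDLINE": 1200, "Embase": 950, "preprint servers": 150},
    "duplicates_removed": 400,
    "records_screened": 1900,
    "records_excluded": {"wrong population": 1100, "wrong design": 600},
    "reports_retrieved": 195,
    "reports_assessed": 195,
    "reports_excluded": {"no relevant outcome": 120, "duplicate report": 30},
    "studies_included": 40,
    "reports_included": 45,
}


def check_flow(f: dict) -> list[str]:
    """Return a list of arithmetic inconsistencies in the flow (empty if none)."""
    problems = []
    identified = sum(f["records_identified"].values())
    if f["records_screened"] != identified - f["duplicates_removed"]:
        problems.append("records screened != identified - duplicates removed")
    remaining = f["records_screened"] - sum(f["records_excluded"].values())
    if f["reports_retrieved"] > remaining:
        problems.append("more reports retrieved than records left after screening")
    if f["reports_assessed"] > f["reports_retrieved"]:
        problems.append("more reports assessed than retrieved")
    expected_reports = f["reports_assessed"] - sum(f["reports_excluded"].values())
    if f["reports_included"] != expected_reports:
        problems.append("reports included != assessed - excluded")
    if f["studies_included"] > f["reports_included"]:
        problems.append("more studies than reports included")
    return problems


# Round-trip through JSON, as a deposit-and-reload would, then validate.
reloaded = json.loads(json.dumps(flow))
print(check_flow(reloaded))  # []
```

    The same checks could run at submission, catching the miscounted flow diagrams that currently surface only when a careful reader adds up the boxes.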

    The structured format means a reader (or a tool) can query the flow programmatically. The intended downstream uses include automated meta-research, evidence-synthesis platforms ingesting reviews at scale, and the construction of multi-review evidence maps from machine-readable inputs. The CASRAI reproducibility domain has begun cataloguing the JSON schema.

    What journals should do

    For journals publishing systematic reviews, three updates are needed. First, update the submission template to ask for PRISMA 2026 compliance (the working group has issued model wording). Second, require deposit of the machine-readable flow diagram JSON alongside the PDF; the BMJ has pioneered this and the model is straightforward. Third, accept registered living-review submissions with a path to versioned publication, even if the current editorial workflow assumes single publication.

    What reviewers should do

    For systematic reviewers, the practical changes are: include PRISMA 2026 compliance in your protocol and pre-register it; if you use AI screening, plan the reporting against item 8b from the outset; produce the flow-diagram JSON during the review (most modern reference-management tools will export it) rather than reconstructing it at write-up; if your review is intended to be living, declare so in the protocol with the update strategy specified.

    EQUATOR Network alignment

    PRISMA 2026 has been developed in close coordination with the EQUATOR Network and is the first major EQUATOR-listed reporting guideline to include both AI-assisted research conduct and machine-readable structured outputs. The expectation is that other EQUATOR guidelines (CONSORT, STROBE, ARRIVE) will follow similar patterns in their next revisions.

    Areas of ongoing debate

    Two questions in the PRISMA 2026 development process were not closed and deserve continued attention. First, the threshold for AI use that triggers item 8b. The current language is “any use of AI or machine learning in study identification, screening, or data extraction.” Some reviewers argued for a higher threshold (only deep-learning-based tools with non-trivial filtering thresholds), while others argued for a lower one (any automation, including deduplication). The published version errs toward broader disclosure.

    Second, the scope of the structured output. The flow-diagram JSON is the first machine-readable PRISMA item, but the same logic could apply to the risk-of-bias assessment, the data-extraction sheet, and the synthesis. The working group elected to start small with the flow diagram and expand in future revisions.

    For the CASRAI community, the takeaway is that systematic-review reporting is moving in the direction we have argued for: structured, machine-readable, integrated with the PID infrastructure (review DOIs, protocol DOIs, dataset DOIs), and explicit about modern tooling. The remaining gaps are tractable. PRISMA 2026 is a substantial step.
