    What Counts as a Preprint? A 2026 Glossary for Research Offices

    Ask five researchers what a preprint is and you may get five slightly different answers: a working paper, a draft awaiting peer review, “the version on arXiv,” or simply “not the real paper yet.” For research offices drafting policy language on open access compliance, data management plans, and research assessment submissions, that ambiguity is a liability. The term is now defined precisely enough in standards documentation from NISO, DataCite, and the Committee on Publication Ethics (COPE) that institutions no longer need to guess. This glossary sets out the terminology research administrators need to write policy that survives an audit, a funder query, or a REF 2029 submission check.

    The stakes are not academic. UKRI’s open access policy, cOAlition S’s Plan S implementation guidance, and NIH’s public access and data sharing policies all treat preprints, postprints, and versions of record as distinct objects with different compliance implications. Get the terminology wrong in an institutional policy and you risk researchers depositing the wrong version, funders rejecting compliance claims, or citation records fragmenting across multiple, unlinked copies of the same work. As open science mandates strengthen globally, a shared, standards-based vocabulary is no longer optional background reading; it is operational infrastructure.

    Preprint Meaning: A Working Definition for Research Offices

    At its simplest, a preprint is a complete draft of a research manuscript that has not yet undergone formal peer review and has been made publicly available, typically via a dedicated preprint server, before or independently of journal submission. NISO’s Journal Article Versions (JAV) recommended practice (NISO RP-8-2008) does not itself use the word “preprint”; its nearest equivalents are “Author’s Original” and “Submitted Manuscript Under Review.” In current usage the preprint corresponds to those early stages and is treated as the earliest stable, citable version in the manuscript lifecycle: the author’s own work, formatted for sharing, but not yet vetted by editors or reviewers.

    Three features distinguish a preprint from other manuscript states:

    • No peer review has occurred. The content reflects the author’s claims and methodology as submitted, unmediated by editorial or reviewer intervention.
    • It is deposited on a recognised preprint server — arXiv, bioRxiv, medRxiv, SSRN, and discipline-specific repositories are the most established examples — which assigns a persistent identifier, typically a DOI, and a deposit timestamp.
    • It establishes priority and openness simultaneously. The timestamp on a preprint server can serve as evidence of when a finding was first disclosed, independent of the (often much later) journal publication date.

    COPE’s guidance on preprints is explicit that preprints are a legitimate part of scholarly communication, not a lesser or informal category — but it also requires that journals and authors disclose preprint status clearly, and that editors have policies for how prior preprint posting interacts with subsequent peer review and publication.

    From Preprint to Postprint to Version of Record: The Publication Lifecycle

    Confusion most often arises between “preprint” and “postprint,” two terms that sound similar but describe opposite ends of the peer-review process. A postprint (sometimes called an “accepted manuscript” or “author accepted manuscript,” AAM) is the version of a paper that has passed peer review and been accepted for publication, but has not yet had the publisher’s copy-editing, typesetting, and formatting applied. This is frequently the version institutional repositories are permitted to hold under green open access routes, because publisher agreements typically restrict redistribution of the final typeset article while permitting the author’s accepted manuscript to be shared, often after an embargo.

    The version of record (VoR) is the final, publisher-formatted, definitively citable version: the one that carries the journal’s pagination and a DOI that resolves to the publisher platform, and to which any later correction or retraction notices are attached. NISO’s JAV framework identifies further states (including the “proof” stage before publication and the “corrected version of record” once errata have been applied), but for institutional policy purposes the three-stage distinction of preprint, postprint, and version of record covers the overwhelming majority of compliance scenarios research offices encounter.

    This matters practically. A funder mandate that requires deposit of the “accepted manuscript” within a defined window is asking for the postprint, not the preprint and not the VoR. Conflating the three in institutional guidance produces non-compliant deposits, embargo miscalculations, and researcher confusion at the exact moment administrators most need clarity.
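
    The version check described above can be sketched in a few lines of Python. This is a minimal illustration, not a real repository or funder API: the names ManuscriptVersion, Deposit, satisfies_mandate, and HYPOTHETICAL_MANDATE are all assumptions made for the example, and the set of accepted versions would have to come from the funder's own policy text.

```python
from dataclasses import dataclass
from enum import Enum


class ManuscriptVersion(Enum):
    """The three lifecycle stages used in this glossary."""
    PREPRINT = "preprint"                    # not yet peer reviewed
    POSTPRINT = "accepted manuscript"        # peer reviewed, not yet typeset
    VERSION_OF_RECORD = "version of record"  # final publisher-formatted version


@dataclass(frozen=True)
class Deposit:
    """A repository deposit, reduced to the fields this sketch needs."""
    title: str
    version: ManuscriptVersion


def satisfies_mandate(deposit: Deposit, accepted: set[ManuscriptVersion]) -> bool:
    """Return True if the deposited version is one the mandate accepts."""
    return deposit.version in accepted


# Hypothetical mandate (an assumption for illustration): the funder accepts
# the accepted manuscript or the version of record, but not a preprint.
HYPOTHETICAL_MANDATE = {
    ManuscriptVersion.POSTPRINT,
    ManuscriptVersion.VERSION_OF_RECORD,
}

preprint_deposit = Deposit("Example study", ManuscriptVersion.PREPRINT)
aam_deposit = Deposit("Example study", ManuscriptVersion.POSTPRINT)

print(satisfies_mandate(preprint_deposit, HYPOTHETICAL_MANDATE))  # False
print(satisfies_mandate(aam_deposit, HYPOTHETICAL_MANDATE))       # True
```

    Recording the version as an explicit, controlled value at deposit time, rather than inferring it from a file name or free-text note, is what makes a check like this possible at all.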

    How DataCite, COPE, and NISO Define These Terms

    Because preprints now carry persistent identifiers and are cited independently, metadata standards bodies have had to formalise their treatment. DataCite’s metadata schema includes “Preprint” as a distinct resourceTypeGeneral value, and its relationType vocabulary lets a preprint DOI be linked explicitly to the eventual journal article DOI once one exists, using general version relations such as IsPreviousVersionOf / IsNewVersionOf or IsVersionOf / HasVersion. This linkage is what allows citation tracking, repository dashboards, and research information systems to recognise that two DOIs represent the same underlying work at different lifecycle stages rather than treating them as unrelated outputs. That distinction matters directly for accurate publication counts in REF-style assessment exercises and for avoiding duplicate-record inflation in CRIS platforms.

    Crossref performs a parallel function for preprint servers that register DOIs with it, using a dedicated relation pair (isPreprintOf / hasPreprint), so that a reader arriving at a preprint can be pointed to the published version once it exists, and vice versa.
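
    To make the duplicate-record point concrete, the sketch below groups DOIs into distinct works by following relation links. It is an illustration under stated assumptions rather than a description of any CRIS product. The records and DOIs are hypothetical, and the function names (same_work_links, group_by_work) and the choice of relation names in SAME_WORK_RELATIONS are assumptions for the example. The field names (relatedIdentifiers, relatedIdentifier, relatedIdentifierType, relationType, resourceTypeGeneral) follow DataCite-style JSON. The relation names a given registration agency actually emits should be checked against its schema documentation.

```python
from collections import defaultdict

# Relation names treated as "same work, different stage" in this sketch.
# Stored normalised (lower case, no hyphens) so that spellings such as
# "IsVersionOf" and "is-preprint-of" compare equal to the entries here.
SAME_WORK_RELATIONS = {
    "ispreprintof", "haspreprint",
    "ispreviousversionof", "isnewversionof",
    "isversionof", "hasversion",
}


def same_work_links(record):
    """Yield DOIs that a DataCite-style record links to as the same work."""
    for related in record.get("relatedIdentifiers", []):
        relation = related["relationType"].replace("-", "").lower()
        if related["relatedIdentifierType"] == "DOI" and relation in SAME_WORK_RELATIONS:
            yield related["relatedIdentifier"].lower()


def group_by_work(records):
    """Group record DOIs into works using union-find over same-work links."""
    parent = {r["doi"].lower(): r["doi"].lower() for r in records}

    def find(doi):
        while parent[doi] != doi:
            parent[doi] = parent[parent[doi]]  # path halving
            doi = parent[doi]
        return doi

    for record in records:
        for target in same_work_links(record):
            if target in parent:  # ignore links to DOIs we do not hold
                parent[find(record["doi"].lower())] = find(target)

    groups = defaultdict(list)
    for doi in parent:
        groups[find(doi)].append(doi)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical records and DOIs, for illustration only.
records = [
    {
        "doi": "10.5555/preprint.0001",
        "types": {"resourceTypeGeneral": "Preprint"},
        "relatedIdentifiers": [{
            "relatedIdentifier": "10.5555/journal.0001",
            "relatedIdentifierType": "DOI",
            "relationType": "IsPreviousVersionOf",
        }],
    },
    {"doi": "10.5555/journal.0001", "types": {"resourceTypeGeneral": "JournalArticle"}},
    {"doi": "10.5555/journal.0002", "types": {"resourceTypeGeneral": "JournalArticle"}},
]

works = group_by_work(records)
print(len(records), "records,", len(works), "distinct works")
```

    Without the link on the preprint record, the same input would be counted as three separate outputs, which is the inflation problem described above.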

    NISO’s contribution is primarily the version taxonomy described above (JAV), plus its broader work on persistent identifiers and metadata interoperability, which underpins how systems like ORCID reliably attribute both a preprint and its later published version to the correct author record — increasingly important as ORCID iDs become a near-universal requirement across funder and publisher submission systems.

    COPE’s role is ethical and procedural rather than technical: its guidance addresses how editors should handle papers that were previously posted as preprints, how to manage cases where a preprint is later found to contain errors or misconduct, and how licensing on preprint servers should be disclosed to avoid conflicts with subsequent publisher copyright agreements. Read together, DataCite and Crossref provide the identifier and metadata plumbing, NISO provides the version vocabulary, and COPE provides the editorial ethics: three complementary layers a single institutional glossary needs to reflect accurately.

    Preprint Servers and the Policy Questions They Raise

    The proliferation of preprint servers, whether general (SSRN), disciplinary (arXiv, bioRxiv, medRxiv), or increasingly institutional, raises questions research offices are now expected to answer in policy. Which preprint servers does the institution recognise for compliance purposes? Does posting a preprint satisfy a funder’s “immediate open access” requirement, or must the accepted manuscript still be deposited separately? How should preprints be represented in promotion and tenure dossiers, and how should reviewers weigh work that has not yet been peer reviewed?

    UKRI’s open access policy and cOAlition S’s Plan S both give explicit standing to preprints as a compliance route in specific circumstances, while NIH’s now-enforced data sharing and public access policies require institutions to track which version of a manuscript satisfies which obligation. Ambiguity in local guidance forces researchers to interpret funder rules themselves — inconsistently, and at institutional risk.

    What This Means for Research Administrators

    Precise terminology is not a semantic nicety; it is the basis of enforceable, auditable policy. Research offices should:

    • Adopt the preprint / postprint / version-of-record distinction as standard language across open access policy, repository deposit guidance, and researcher-facing FAQs — rather than each unit inventing its own phrasing.
    • Reference DataCite’s and Crossref’s relation-type linking when advising on how preprints and their published counterparts should be connected in institutional repositories and CRIS systems, to avoid duplicate or orphaned records.
    • Align embargo and compliance guidance to the correct manuscript version specified by each funder — the accepted manuscript (postprint) in most green open access mandates, not the preprint.
    • Build preprint-awareness into research integrity training, reflecting COPE’s guidance on disclosure and editorial handling of previously posted work.
    • Ensure ORCID records and institutional profiles capture preprints as distinct, linked outputs rather than omitting them or conflating them with the eventual journal article.
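
    The embargo point in the list above comes down to date arithmetic, sketched here under stated assumptions. The function names (add_months, embargo_end, may_be_open) are hypothetical. The sketch assumes the embargo is counted in whole calendar months from a single clock-start date. Which date starts the clock, and how long the embargo runs, must be taken from the relevant funder and publisher terms.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add whole calendar months, clamping to the last day of the month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def embargo_end(clock_start: date, embargo_months: int) -> date:
    """First date on which the embargoed version may be made open."""
    return add_months(clock_start, embargo_months)


def may_be_open(on: date, clock_start: date, embargo_months: int) -> bool:
    """Return True if the embargo has expired on the given date."""
    return on >= embargo_end(clock_start, embargo_months)


# Hypothetical example: clock starts on publication, 31 January 2026.
publication = date(2026, 1, 31)
print(embargo_end(publication, 6))                      # 2026-07-31
print(embargo_end(publication, 1))                      # 2026-02-28
print(may_be_open(date(2026, 6, 1), publication, 6))    # False
```

    Clamping to the end of the month is one possible convention for dates such as 31 January plus one month; a policy document should state which convention it uses so that repository staff and researchers calculate the same date.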

    As open science practice matures and preprints move from niche practice to mainstream infrastructure across disciplines, the institutions with the clearest internal vocabulary will be the ones best positioned to answer funder audits, support accurate research assessment submissions, and give researchers confidence that sharing early is compatible with getting credit later. The terminology already exists in standards documentation from NISO, DataCite, and COPE — the task for research administrators in 2026 is simply to adopt it consistently.