Tag: openrxiv

  • bioRxiv PubMed Indexing: How the NIH Pilot Works

    bioRxiv preprints are not indexed in PubMed automatically. They reach PubMed through a single federal mechanism, the NIH Preprint Pilot run by the U.S. National Library of Medicine (NLM), which pulls in preprints that acknowledge direct NIH funding or carry an NIH-affiliated author, provided they were posted on or after 1 January 2023 under the pilot’s current phase.

    The NIH Preprint Pilot is an NLM programme that makes NIH-funded preprints from eligible servers — bioRxiv, medRxiv, arXiv, and Research Square — discoverable through PubMed Central (PMC) and PubMed ahead of formal peer review, with a corresponding citation added on a weekly cycle.

    What is the NIH Preprint Pilot?

    The NIH Preprint Pilot began in June 2020 as a narrow, COVID-19-only initiative. NLM made more than 3,300 preprints reporting NIH-supported SARS-CoV-2 research discoverable in PMC and PubMed between June 2020 and June 2022, testing whether preprint records could accelerate discovery during a public-health emergency.

    Phase 2 launched on 30 January 2023 and dropped the COVID-only restriction. It now covers any preprint that acknowledges direct NIH support and/or lists an NIH-affiliated author, posted to an eligible server on or after 1 January 2023. Eligible preprints are added to PMC on a weekly basis and receive a corresponding PubMed citation automatically — authors do not submit anything separately.
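    The Phase 2 rule described above reduces to a short predicate. The following is a minimal sketch, not NLM's actual logic: the `Preprint` record, its field names, and the `is_pilot_eligible` helper are hypothetical, and the server list is simply the one named in this article.

```python
from dataclasses import dataclass
from datetime import date

# Servers this article names as feeding the pilot (lowercase keys are an assumption).
ELIGIBLE_SERVERS = {"biorxiv", "medrxiv", "arxiv", "research square"}
PHASE_2_START = date(2023, 1, 1)  # earliest eligible posting date under Phase 2


@dataclass
class Preprint:
    """Hypothetical record; field names are illustrative, not an NLM schema."""
    server: str
    posted: date
    acknowledges_nih_support: bool
    has_nih_affiliated_author: bool


def is_pilot_eligible(p: Preprint) -> bool:
    """Return True if the preprint meets the Phase 2 criteria as stated in the text."""
    on_eligible_server = p.server.strip().lower() in ELIGIBLE_SERVERS
    posted_in_phase_2 = p.posted >= PHASE_2_START
    # Either condition is enough: direct NIH support and/or an NIH-affiliated author.
    nih_linked = p.acknowledges_nih_support or p.has_nih_affiliated_author
    return on_eligible_server and posted_in_phase_2 and nih_linked
```

    For example, `is_pilot_eligible(Preprint("bioRxiv", date(2022, 12, 31), True, True))` is false only because of the posting date, which is the condition authors most often overlook.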

    How a preprint moves from bioRxiv to PubMed

    The pipeline is largely invisible to authors and runs on a fixed weekly cadence. NLM does not wait for a submission; it identifies eligible content and pulls it in automatically, then layers PubMed on top of the PMC record.

    • Identification: NLM text-mines new bioRxiv and medRxiv postings for NIH-support acknowledgements and cross-checks the NIH Office of Portfolio Analysis tool for NIH-affiliated authors.
    • PMC ingestion: Citation and abstract metadata are pulled from the preprint server’s machine-readable feed to build an “article header” record, and a PMCID is assigned immediately to enable rapid discovery.
    • PubMed record creation: Once the PMC record exists, NLM generates the corresponding PubMed citation the same week, tagged with publication type “Preprint.”
    • Full-text conversion: Preprints posted under a Creative Commons licence enter a separate workflow to produce archival full-text XML, a process NLM says takes a few days and enables full-text search within PMC.

    Every record carries a prominent yellow information panel confirming the work has not been peer-reviewed. NLM also runs weekly checks against the bioRxiv API, the Crossref API, and the Europe PMC API to link a preprint to its eventual journal version, and updates the PubMed status to “Updated” once that link is confirmed.
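    One way to picture the weekly linking check is a lookup against the bioRxiv API, which reports a journal DOI once the server knows of one. The sketch below is illustrative only and is not NLM's implementation. It assumes a `/details/{server}/{doi}` endpoint that returns JSON with a `collection` list, one entry per preprint version, each carrying a `published` field set to either a journal DOI or the string "NA". The helper names are hypothetical.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import urlopen

API_ROOT = "https://api.biorxiv.org/details"  # assumed layout: /details/{server}/{doi}


def details_url(server: str, doi: str) -> str:
    """Build the lookup URL for one preprint DOI on 'biorxiv' or 'medrxiv'."""
    return f"{API_ROOT}/{server}/{quote(doi, safe='/')}"


def published_doi(payload: dict) -> Optional[str]:
    """Return the first journal DOI found across versions, or None if unpublished."""
    for version in payload.get("collection") or []:
        value = str(version.get("published") or "").strip()
        if value and value.upper() != "NA":  # "NA" is assumed to mean "no journal version yet"
            return value
    return None


def fetch_published_doi(server: str, doi: str) -> Optional[str]:
    """Network call, kept separate so the parsing logic can be checked offline."""
    with urlopen(details_url(server, doi), timeout=30) as response:
        return published_doi(json.load(response))
```

    Keeping the parser separate from the HTTP call means the same `published_doi` check could run against cached responses, which suits a scheduled weekly job better than ad hoc requests.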

    Which preprint servers qualify

    Only four servers currently feed the pilot. NLM evaluates candidate servers against a published checklist — clear non-peer-review labelling, transparent versioning, open licensing information, machine-readable metadata, and a public archiving policy — modelled on NIH’s 2017 interim-research-products guidance (NOT-OD-17-050) and COPE’s preprint discussion document.

    Server | Subject scope | Operator | DOI registration
    bioRxiv | Life sciences | openRxiv (independent nonprofit, formerly a Cold Spring Harbor Laboratory service) | Crossref
    medRxiv | Health and clinical sciences | openRxiv, with Yale University and BMJ as founding partners | Crossref
    arXiv | Physics, mathematics, computer science, quantitative biology | Cornell University | DataCite
    Research Square | Multidisciplinary | Research Square Company | Crossref

    bioRxiv and medRxiv are the two servers most relevant to biomedical research administrators, since both fall under openRxiv, the independent nonprofit that took over operation of both platforms from Cold Spring Harbor Laboratory. openRxiv’s separation from a single host institution was framed explicitly around long-term sustainability for the two servers NIH now indexes directly — a governance detail that matters for anyone assessing the pilot’s durability, since NLM’s own eligibility criteria require a “publicly stated archiving strategy to ensure long-term access.”

    What this means for discoverability, DOIs, and citation

    PubMed indexing changes where a preprint can be found, not whether it can be cited. Every bioRxiv preprint already receives a DOI registered through Crossref at posting, which is what makes it part of the citable scientific record regardless of NIH eligibility.

    According to bioRxiv’s own FAQ, preprints are indexed by “Google, all other search engines, Google Scholar, Crossref, Semantic Scholar, Europe PubMed Central, and Preprint Citation Index (connected to the Web of Science)” independent of the NIH pilot — PubMed indexing is an additional, funder-gated channel layered on top of that baseline discoverability.

    One clarification worth making explicitly: bioRxiv and medRxiv do not carry a Scimago Journal Rank or an impact factor. Both metrics are journal-level indicators computed from peer-reviewed citation data; a preprint server is a distribution platform, not a journal, so no SJR score exists for bioRxiv as a whole, and any figure circulating under “bioRxiv impact factor” searches is not an NLM, Crossref, or Scimago-sourced metric.

    Indexing also does not substitute for compliance. NLM is explicit that even when a preprint sits in PMC under the pilot, the NIH Public Access Policy still requires the peer-reviewed, accepted author manuscript to be separately deposited via NIHMS, with its own PMCID reported as proof of compliance.

    Answer-first questions about bioRxiv and PubMed

    Does bioRxiv show up in PubMed?

    Yes, but only conditionally. A bioRxiv preprint appears in PubMed only if it acknowledges direct NIH funding or lists an NIH-affiliated author and was posted under Phase 2 of the NIH Preprint Pilot (from 1 January 2023). Non-NIH preprints stay discoverable via Google Scholar, Crossref, and Europe PMC instead.

    What is a preprint in PubMed?

    In PubMed, a preprint is a record carrying the publication type “Preprint,” which separates it from peer-reviewed literature in search filters. It displays a yellow information panel stating the work has not undergone peer review, and PubMed links it automatically to the journal version once one is published.
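    Because preprints carry their own publication type, a search can include or exclude them explicitly with the `preprint[pt]` tag. The sketch below builds such a query string and a matching NCBI E-utilities `esearch` URL. The helper names and the `preprints` option values are hypothetical, and the topic passed in is whatever the searcher supplies.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_term(topic: str, preprints: str = "only") -> str:
    """Combine a topic with the PubMed publication-type tag for preprints."""
    if preprints == "only":
        return f"({topic}) AND preprint[pt]"
    if preprints == "exclude":
        return f"({topic}) NOT preprint[pt]"
    if preprints == "include":
        return topic  # no filter: preprints and journal articles together
    raise ValueError("preprints must be 'only', 'exclude', or 'include'")


def esearch_url(term: str, retmax: int = 20) -> str:
    """Return an esearch URL for the term; fetching it is left to the caller."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"
```

    Calling `pubmed_term("crispr", "exclude")` gives a query limited to non-preprint records, which is useful when a literature review has to separate peer-reviewed findings from preprint claims.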

    Does bioRxiv count as published?

    No. bioRxiv distributes complete but unpublished manuscripts, so posting there is not equivalent to journal publication. A preprint carries a DOI and is part of the citable record, but it lacks the peer-review certification that ICMJE and COPE norms attach to a published article.

    Is it okay to cite bioRxiv?

    Yes. bioRxiv preprints receive a DOI through Crossref, making them formally citable, and are indexed by Google Scholar, Crossref, Semantic Scholar, and Europe PMC. Authors citing them should flag that the underlying findings have not yet completed peer review.

    Why other funders are watching the pilot

    NIH’s approach is unusual because it is infrastructural rather than a mandate: it does not require authors to preprint, it simply makes eligible preprints easier to find once posted. That distinction is why other funders are studying it rather than replicating it wholesale.

    cOAlition S, the funder coalition behind Plan S, already treats preprints as an acceptable route to satisfying immediate open-access requirements, but no cOAlition S member currently operates an equivalent centralised indexing pipeline into a national biomedical database. UKRI’s open access policy similarly recognises preprints as compliant interim outputs without building comparable PMC-style ingestion.

    For research administrators, the practical takeaway is that discoverability infrastructure and funder mandates remain two separate policy levers. NIH has built the first at meaningful scale; whether other national funders follow with their own PMC-equivalent indexing pipeline — rather than policy language alone — is the open question institutions tracking preprint compliance should watch through 2026 and beyond.

  • openRxiv Explained: Why bioRxiv and medRxiv Went Independent

    openRxiv is the independent, researcher-led nonprofit that has run bioRxiv and medRxiv since March 2025, replacing Cold Spring Harbor Laboratory’s institutional stewardship with a six-member board, diversified funding, and a mandate to keep both preprint servers free to read and free to post. The spin-off was designed to insulate two of biomedicine’s most-used pieces of open-research infrastructure from dependence on any single institution or funder — a governance question every standards body and infrastructure provider eventually has to answer.

    openRxiv is the independent nonprofit, launched on 11 March 2025, that now stewards the bioRxiv and medRxiv preprint servers on behalf of the global research community, rather than as a programme of a single host institution.

    What is openRxiv, and what does it actually run?

    openRxiv is the organisational and legal home of two preprint servers: bioRxiv, covering life sciences, and medRxiv, covering health and clinical research. Neither server changed its submission process, screening policy, or URL when the transition happened — researchers post to biorxiv.org and medrxiv.org exactly as before.

    What changed is who is accountable for the platforms’ survival. bioRxiv was founded in 2013 at Cold Spring Harbor Laboratory (CSHL); medRxiv followed in 2019 as a joint initiative of CSHL, Yale University, and BMJ. Both grew into the dominant preprint venues for biomedicine, and by 2025 they had outgrown what a single laboratory could be expected to sustain indefinitely.

    Why did bioRxiv and medRxiv leave Cold Spring Harbor Laboratory?

    CSHL’s own account of the move calls it a “natural evolution,” not a rupture. Bruce Stillman, CSHL’s President and CEO, joined openRxiv’s board rather than severing ties, and co-founders John Inglis and Richard Sever moved with the platforms into the new entity.

    The stated rationale centres on three risks of concentrating stewardship inside one institution:

    • Sustainability risk: a single laboratory’s budget cycle is not designed to guarantee decades of continuity for global research infrastructure.
    • Governance risk: decisions about screening policy, features, and funding priorities would benefit from a board drawn from beyond CSHL alone.
    • Funder-concentration risk: the platforms needed a structure that could accept diversified funding without any one funder gaining outsized influence.

    openRxiv formally launched as an independent nonprofit on 11 March 2025, with the Chan Zuckerberg Initiative (CZI) providing three years of seed funding for the transition, according to openRxiv’s own governance Q&A published that May. In October 2025, arXiv — the physics, mathematics, and computer science preprint server run by Cornell University — joined openRxiv in submitting a joint response to a National Institutes of Health Request for Information on preprints, signalling a wider coalition forming around shared preprint-infrastructure interests, though arXiv itself remains a separate service.

    Who governs openRxiv, and who pays for it?

    openRxiv is governed by a six-member board of directors: Scott Fraser (University of Southern California and the CZI Imaging Institute), Edith Heard (Francis Crick Institute), Jeff Huber (Triatomic Capital), Harlan Krumholz (Yale School of Medicine; medRxiv co-founder), Bruce Stillman (CSHL), and Shirley Tilghman (Princeton University). A separate Scientific and Medical Advisory Board, chaired by John Inglis with medRxiv co-founder Theo Bloom as deputy, advises on content policy.

    The funding question is where most scrutiny has landed, given CZI’s long involvement with both servers before the spin-off:

    Question | openRxiv’s public answer (governance Q&A, May 2025)
    How long has CZI funded the servers? | Eight years for bioRxiv, four years for medRxiv, plus three years of dedicated seed funding for the openRxiv transition itself.
    Does CZI have editorial or operational control? | No. openRxiv states funding agreements carry no stipulations affecting editorial or operational independence.
    How much board influence does CZI hold? | One of six directors (Scott Fraser) has a CZI affiliation; the board is not CZI-appointed as a bloc.
    Is openRxiv against traditional peer review? | No. openRxiv reports roughly 75% of bioRxiv and medRxiv preprints go on to formal peer-reviewed publication, with direct-submission links to 350 journals.

    openRxiv itself frames the governance model as a direct answer to funder-concentration concerns: the organisation states its mission is to be “governed by and for the research community, not a single funder, founder, or any one stakeholder.” Whether a nonprofit built to resist single-funder capture can credibly rely on one philanthropic vehicle, tied to a single tech-sector family, as its largest funder is a debate that predates this spin-off and will likely recur as openRxiv pursues its stated goal of diversifying revenue further.

    What is openRxiv Labs, and what launched in June 2026?

    openRxiv Labs launched on 1 June 2026 as a structured experimentation programme sitting on top of the core bioRxiv and medRxiv infrastructure. Rather than running many small tests at once, openRxiv committed to a small number of larger, hypothesis-driven pilots with predefined success metrics and durations, publishing results — including failures — openly on a dedicated Labs blog.

    The first Labs pilot, built with the platform Curvenote, tests an interactive preprint-reading interface layered onto openRxiv’s existing corpus of preprints, figures, and metadata. openRxiv named a broad partner list for the programme, including CZI, CSHL, the Sergey Brin Family Foundation, Caltech, CNRS, Fred Hutchinson Cancer Center, Imperial College London, MIT, Stanford, the University of Washington, and Vrije Universiteit Amsterdam — underscoring that the funder-diversification effort begun at launch has continued into 2026 rather than stalling after the initial CZI seed grant.

    Answer-first questions people are asking about openRxiv

    Who is the CEO of openRxiv?

    Dr Tracy Teal is openRxiv’s first Chief Executive Officer, appointed on 18 August 2025 after serving as interim COO since the March 2025 launch. She previously led The Carpentries and Dryad, two established open-research infrastructure nonprofits, giving her direct prior experience running community-governed scientific platforms.

    Who owns medRxiv?

    No single institution “owns” medRxiv today. It was founded in 2019 by Cold Spring Harbor Laboratory, Yale University, and BMJ, but operational and governance responsibility now sits with openRxiv, the independent nonprofit created specifically to steward it and bioRxiv without institutional or single-funder control.

    Is medRxiv a credible source?

    medRxiv preprints are screened but not peer-reviewed, so they should be cited with that caveat clearly stated. openRxiv reports around 75% of postings eventually complete formal peer review; until then, findings represent unverified claims from qualified researchers, useful for rapid awareness but not equivalent to a published, peer-reviewed article.

    What is openRxiv, in one line?

    openRxiv is the independent 501(c)(3) nonprofit, launched 11 March 2025, that operates bioRxiv and medRxiv under a six-member board and a diversified-funding mandate, replacing their prior status as programmes hosted by Cold Spring Harbor Laboratory.

    What the openRxiv spin-off means for research-infrastructure stewardship

    The openRxiv case is a useful reference point for any organisation weighing how to govern shared research infrastructure once it outgrows its founding institution. The pattern — an originating body incubates a tool, the tool becomes essential community infrastructure, and stewardship then transfers to an independent, multi-stakeholder body — is not unique to preprints.

    CASRAI originated the CRediT contributor role taxonomy in 2014. The standard is now stewarded by NISO as ANSI/NISO Z39.104-2022. That is the same “originator, not owner” pattern openRxiv is now navigating in public: CSHL originated bioRxiv and medRxiv, and stewardship has since passed to a body structured explicitly to prevent any one funder, founder, or institution from controlling research infrastructure the whole field depends on.

    For research administrators and institutional leaders, the practical takeaway is to watch governance structure, not just funding source, when assessing an infrastructure provider’s long-term reliability. A named, multi-institutional board; published funding-independence commitments; and open reporting of pilot outcomes (as with openRxiv Labs) are the concrete signals worth checking — independent of who wrote the first cheque.