Chai-2 bioRxiv: Comparing AI Biology Preprints Ahead of Peer Review

The Chai-2 bioRxiv preprint, posted by Chai Discovery on 5 July 2025, reports a 16% hit rate in fully de novo antibody design, more than 100-fold above prior computational methods. Unlike the ESM3 and Geneformer foundation models it sits alongside, both of which went on to journal publication, the Chai-2 claim has not yet cleared peer review. All three are part of a wider pattern: AI biology foundation models are increasingly disseminated as bioRxiv preprints first, journal articles later (if at all), which changes how institutions, publishers, and funders must scrutinise their claims.

A bioRxiv preprint is a manuscript posted to the Cold Spring Harbor Laboratory’s biology preprint server before, or instead of, formal peer review. This article compares how Chai-2, ESM3, Geneformer, EvolvePro, and AlphaFold-Multimer have each used that route, and what the differences mean for reproducibility.

What is Chai-2, and why was it posted as a bioRxiv preprint?

Chai-2 is a multimodal generative model from Chai Discovery that designs antibodies and nanobodies from scratch, taking a target structure and epitope as input and returning a complete antibody design. The original preprint, “Zero-shot antibody design in a 24-well plate”, reported a 16% success rate in de novo design against 52 diverse targets, completed from AI design to wet-lab validation in under two weeks.

Chai Discovery followed with an updated bioRxiv preprint on 29 November 2025, “Drug-like antibody design against challenging targets”, reporting that more than 86% of designed full-length monoclonal antibodies showed developability profiles comparable to approved therapeutics. Neither preprint has yet been published in a peer-reviewed journal. The company has since raised a $130 million Series B round, taking total funding above $225 million at a $1.3 billion valuation, according to Genetic Engineering & Biotechnology News.

How do ESM3 and Geneformer differ from Chai-2 in preprint dissemination?

ESM3 and Geneformer address different biological scales entirely, and their publication paths diverge from Chai-2’s in an instructive way. ESM3, from EvolutionaryScale, is a general-purpose protein language model trained on roughly 2.78 billion protein sequences with a 98-billion-parameter flagship configuration. It was posted as a preprint before its 2025 publication in Science — meaning it eventually completed the peer-review cycle that Chai-2’s antibody preprints have not yet reached.

Geneformer operates at the cellular level rather than the molecular level. Built on a transformer-encoder architecture pretrained across tens of millions of single-cell RNA-sequencing profiles, it classifies cell types and predicts disease-relevant genes. Its foundational description, credited to Christina Theodoris and colleagues, circulated as a preprint before formal publication in Nature in 2023.

EvolvePro and AlphaFold-Multimer extend the comparison further. EvolvePro is a few-shot protein-engineering framework that uses language-model embeddings to guide directed evolution from very few labelled variants, disseminated via bioRxiv. AlphaFold-Multimer, Google DeepMind’s extension of AlphaFold2 for multi-chain complex prediction, is the starkest case: its 2021 bioRxiv preprint (Evans et al.) has been cited thousands of times and underpins structural biology workflows worldwide, yet it has never been published in a peer-reviewed journal.

| Model | Domain | bioRxiv posting | Weight access | Peer-review status |
|---|---|---|---|---|
| Chai-2 | De novo antibody design | v1 Jul 2025; updated Nov 2025 | Platform/API access, not fully open weights | Preprint only |
| ESM3 | General protein sequence/structure/function | Preprint, then Science (2025) | Smaller checkpoints open; 98B flagship gated via Forge API | Peer-reviewed |
| Geneformer | Single-cell transcriptomics | Preprint, then Nature (2023) | Fully open-weight release | Peer-reviewed |
| EvolvePro | Few-shot directed protein evolution | bioRxiv preprint | Open code/model release | Preprint at time of posting |
| AlphaFold-Multimer | Multi-chain complex structure prediction | bioRxiv preprint (2021) | Code and weights open-sourced | Never published in a peer-reviewed journal |

Why does preprint-first publication intensify reproducibility scrutiny?

Preprint-first publication compresses the interval between a headline result and its public citation, which is valuable for fast-moving fields but removes a layer of independent verification before claims circulate. AlphaFold-Multimer shows this can persist indefinitely: a preprint can become de facto infrastructure without ever completing formal review.

  • Model weight access varies sharply: Geneformer and AlphaFold-Multimer are fully open, while Chai-2 and ESM3’s largest configuration require platform or API access, limiting independent replication of the exact reported result.
  • Benchmark provenance matters: Chai-2’s 16% hit rate comes from a company-run benchmark across 52 targets, not an externally adjudicated challenge such as CASP or CAPRI.
  • Versioning matters: Chai-2’s updated November 2025 preprint extends claims to full-length monoclonal antibodies, meaning readers must track which version underlies any given statistic.

For research administrators and institutional evaluators, the practical implication is that a citation to “Chai-2” or “ESM3” is not self-evidently a citation to peer-reviewed work — the preprint status, version, and weight-access terms all need checking before the claim is treated as settled.

Common questions about AI biology preprints on bioRxiv

Is the Chai-2 bioRxiv preprint peer-reviewed?

No. As of this article’s publication, both Chai-2 preprints (the July 2025 original and the November 2025 update) remain bioRxiv preprints. Neither has completed formal peer review, so the reported 16% hit rate and 86% developability figures should be read as company-reported, not journal-vetted, results.

Has ESM3 been published in a peer-reviewed journal?

Yes. ESM3 was first circulated as a preprint before EvolutionaryScale’s results were published in Science in 2025, giving it a completed peer-review path that Chai-2’s antibody-design claims currently lack.

What is Geneformer used for?

Geneformer analyses single-cell RNA-sequencing data to classify cell types, model gene regulatory networks, and identify disease-relevant genes, using a transformer architecture trained on large single-cell transcriptome corpora rather than protein or antibody sequences.

What is the difference between Chai-2 and AlphaFold-Multimer?

AlphaFold-Multimer predicts the 3D structure of existing multi-chain protein complexes, while Chai-2 generates entirely new antibody sequences and structures for a chosen target — structure prediction versus de novo generative design.

What are the implications for institutions, publishers, and funders?

Research administrators citing Chai-2, ESM3, Geneformer, or comparable models in grant reports, technology assessments, or institutional communications should distinguish preprint claims from peer-reviewed findings explicitly, note the exact preprint version, and record whether model weights are open or platform-gated. Publishers and editors evaluating manuscripts that build on these models should likewise verify which version of the underlying preprint is cited, since headline metrics can shift between versions.
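One way to make those three checks routine is to record them as structured fields next to each citation. The Python sketch below is illustrative only: the `ModelCitation` record, its field names, and the `due_diligence_flags` helper are hypothetical and not part of any bioRxiv, publisher, or vendor tooling. The two example entries simply restate what this article reports about Chai-2 and Geneformer.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewStatus(Enum):
    PREPRINT_ONLY = "preprint only"
    PEER_REVIEWED = "peer-reviewed"


class WeightAccess(Enum):
    OPEN = "open weights"
    GATED = "platform or API access"


@dataclass(frozen=True)
class ModelCitation:
    """Hypothetical record for one cited model claim."""
    model: str
    source: str                  # where the cited claim appears
    version: str                 # preprint version or date; "" if not recorded
    review_status: ReviewStatus
    weight_access: WeightAccess


def due_diligence_flags(citation: ModelCitation) -> list[str]:
    """Return the caveats a report should state alongside this citation."""
    flags = []
    if citation.review_status is ReviewStatus.PREPRINT_ONLY:
        flags.append("claim is not peer-reviewed")
        # Headline metrics can shift between preprint versions.
        if not citation.version:
            flags.append("preprint version not recorded")
    if citation.weight_access is WeightAccess.GATED:
        flags.append("weights are platform-gated")
    return flags


# Example entries restating the comparison in this article.
chai2 = ModelCitation("Chai-2", "bioRxiv", "Nov 2025 update",
                      ReviewStatus.PREPRINT_ONLY, WeightAccess.GATED)
geneformer = ModelCitation("Geneformer", "Nature", "2023",
                           ReviewStatus.PEER_REVIEWED, WeightAccess.OPEN)

for citation in (chai2, geneformer):
    print(citation.model, due_diligence_flags(citation))
```

Under these assumptions the Chai-2 entry yields two caveats (not peer-reviewed, platform-gated weights) and the Geneformer entry yields none. A preprint cited without a version would pick up a third flag.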

The broader lesson is structural rather than model-specific: as AI biology moves faster than journal review cycles, the preprint-to-journal gap itself becomes a due-diligence checkpoint that institutions, funders, and publishers now need to track as routinely as they track the results themselves.
