A paper mill is a business. For a fee, it supplies fabricated or manipulated manuscripts, and authorship slots on them, to researchers who need publications to satisfy career or funding pressures. The product is not a single forged paper but an industrialised pipeline that produces them by the hundred, and the result is a slow contamination of the scholarly record with work that was never genuinely carried out. Understanding how mills operate, how the community is responding, and why catching them is so difficult is now a core part of research integrity.
How paper mills operate
The mechanics are mundane and commercial. A mill drafts manuscripts, often by recombining and superficially altering existing work, generating plausible-looking figures, and slotting in data that may be wholly invented. It then sells positions on the author list, sometimes advertising openly on social platforms and messaging apps, with prices varying by the prestige of the target journal and the seniority of the authorship position. Some mills also offer to manufacture peer reviews, supplying suggested reviewers who are in fact controlled by the mill, so that a submitted manuscript is shepherded through review by the very organisation that produced it.
Because the output is generated at scale, mills leave statistical fingerprints. Many manuscripts share templated structures, near-identical phrasing, recycled figures with small modifications, and sudden additions of unrelated authors. The economics reward volume, so a single mill may be responsible for a startlingly large batch of papers spread across many journals and submitted over a short window.
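One way to make the "near-identical phrasing" fingerprint concrete is to compare manuscripts by the overlap of their word sequences. The sketch below is illustrative only: it uses word shingles and Jaccard similarity, a standard near-duplicate technique and not a method attributed to any particular publisher. The function names, the shingle size `k` and the 0.5 threshold are all assumptions chosen for the example.

```python
import re
from itertools import combinations


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles from lowercased, punctuation-free text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(docs: dict, threshold: float = 0.5, k: int = 3) -> list:
    """Return (id_a, id_b, score) for every pair at or above the threshold.

    `docs` maps a manuscript id to its text. The threshold is an assumed
    value for illustration; a real screen would need to calibrate it.
    """
    sets = {doc_id: shingles(text, k) for doc_id, text in docs.items()}
    pairs = []
    for (id_a, set_a), (id_b, set_b) in combinations(sets.items(), 2):
        score = jaccard(set_a, set_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    # Highest-scoring pairs first, so a reviewer sees the closest matches on top.
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

Comparing every pair is quadratic in the number of manuscripts, so a screen at real submission volumes would probably need an indexing step such as MinHash in front of it. A high score is a prompt for a closer look, not a finding of misconduct, since legitimate papers from one group often share methods text.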
The tell-tale signs
Several recurring signals have become recognised markers of mill-produced or otherwise fabricated work:
- Tortured phrases. To evade plagiarism detectors, text is run through paraphrasing tools that replace standard terms with bizarre synonyms. Established phrases mutate into nonsensical variants, so a reader finds, for example, “counterfeit consciousness” where “artificial intelligence” should be, or “colossal information” in place of “big data”. These tortured phrases are a strong indicator that text has been mechanically laundered.
- Citation cartels and manipulation. Rings of authors, journals or mills cite one another to inflate metrics, producing citation patterns that serve reputation rather than relevance. Coordinated, reciprocal citation that does not match the intellectual content of the work is a warning sign.
- Image and data anomalies. Duplicated, spliced or reused images, and data that are too clean, internally inconsistent or statistically implausible, frequently accompany fabricated manuscripts.
- Authorship and contact irregularities. Unrelated co-authors, mismatched institutional affiliations, and non-institutional or suspicious contact email addresses can indicate purchased authorship.
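Two of the signals above lend themselves to a short sketch: matching text against a list of known tortured phrases, and measuring how much of a citation network is reciprocal. The code below is a minimal illustration, not a description of any deployed screening tool. The three-entry lookup table is a tiny sample of widely reported variants (real lists are far larger and hand-curated), and the function names and the edge representation are assumptions made for the example.

```python
import re

# Illustrative sample only: tortured variant -> the established term it replaces.
TORTURED_PHRASES = {
    "counterfeit consciousness": "artificial intelligence",
    "irregular woodland": "random forest",
    "colossal information": "big data",
}


def find_tortured_phrases(text: str, table: dict = TORTURED_PHRASES) -> list:
    """Return (variant, expected_term, count) for each listed variant found."""
    # Lowercase and collapse whitespace so line breaks do not hide a phrase.
    normalised = re.sub(r"\s+", " ", text.lower())
    hits = []
    for variant, expected in table.items():
        pattern = r"\b" + re.escape(variant) + r"\b"
        count = len(re.findall(pattern, normalised))
        if count:
            hits.append((variant, expected, count))
    return hits


def reciprocity(citations: set) -> float:
    """Fraction of directed (citing, cited) edges whose reverse edge also exists.

    Self-citations are ignored. A high value is only a prompt to look closer:
    small research fields cite each other legitimately.
    """
    edges = {(a, b) for a, b in citations if a != b}
    if not edges:
        return 0.0
    mutual = sum(1 for a, b in edges if (b, a) in edges)
    return mutual / len(edges)
```

Neither function decides anything by itself. A phrase hit or a high reciprocity value only tells an editor where to look, and both can have innocent explanations.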
The coordinated response
Because no single journal can see the whole pattern, the response has had to be collective. The STM Integrity Hub, an initiative of the International Association of Scientific, Technical and Medical Publishers, is a shared platform that lets publishers screen submissions against a common set of integrity checks and signals. By pooling indicators across many publishers, the Hub can surface a manuscript that appears benign in isolation but matches patterns seen elsewhere, such as duplicate submissions sent simultaneously to multiple journals, which is exactly the behaviour an industrial mill exhibits.
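The value of pooling can be shown with a toy duplicate-submission check. The sketch below is hypothetical and does not describe how the STM Integrity Hub is implemented. It assumes each publisher contributes a hash of the normalised title and abstract, so that raw manuscript text need not be shared, and it flags any fingerprint that reaches two different journals within an assumed 30-day window. The function names and the tuple layout are invented for the example.

```python
import hashlib
import re
from collections import defaultdict
from datetime import date, timedelta


def fingerprint(title: str, abstract: str) -> str:
    """Hash the normalised title and abstract (case and punctuation ignored)."""
    words = re.findall(r"[a-z0-9]+", (title + " " + abstract).lower())
    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()


def simultaneous_duplicates(submissions, window_days: int = 30) -> dict:
    """Return fingerprints seen at two or more journals within the window.

    `submissions` is an iterable of (journal, submitted_on, title, abstract)
    tuples pooled from several publishers. The result maps each flagged
    fingerprint to its date-sorted list of (submitted_on, journal) records.
    """
    by_print = defaultdict(list)
    for journal, submitted_on, title, abstract in submissions:
        by_print[fingerprint(title, abstract)].append((submitted_on, journal))

    window = timedelta(days=window_days)
    flagged = {}
    for fp, records in by_print.items():
        records.sort()
        # After sorting by date it is enough to compare neighbours: any two
        # submissions to different journals inside the window imply an
        # adjacent pair with the same property.
        close = any(
            later[1] != earlier[1] and later[0] - earlier[0] <= window
            for earlier, later in zip(records, records[1:])
        )
        if close:
            flagged[fp] = records
    return flagged
```

An exact hash only catches copies that are identical after normalisation. A mill that lightly rewords each copy would slip past it, which is one reason fuzzy text matching and other signals are needed alongside a check of this kind.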
The Committee on Publication Ethics (COPE) provides the policy and process backbone. COPE has published guidance specifically on paper mills, developed in collaboration with STM, helping editors recognise the signs, decide how to investigate, and handle the corrections and retractions that follow. COPE’s flowcharts and guidance give editors a defensible procedure to follow when systematic manipulation is suspected, which matters because acting against fabricated work has to withstand scrutiny and, sometimes, legal challenge.
When a mill is exposed, the consequence is often a mass retraction. Rather than a single paper being withdrawn, publishers retract large batches at once as forensic analysis links many manuscripts to a common source. These coordinated retractions have become one of the more visible signs that the system is fighting back, and the reasons for each are documented so that the corrected record is transparent about what happened and why.
Why detection is hard
It would be comforting to think fabricated work is obviously bad, but the opposite is closer to the truth. A competent mill produces manuscripts that read like ordinary, unremarkable papers, because looking ordinary is the entire point. Detection is hard for several reinforcing reasons. Individual journals see only their own submissions, so cross-publisher patterns are invisible without shared infrastructure. The signals are probabilistic, not definitive: a tortured phrase or an unusual citation can have innocent explanations, so a human must adjudicate each case and a filter cannot simply reject automatically. Mills adapt quickly, changing tactics as soon as a detection method becomes known, which makes integrity screening a moving target rather than a solved problem. And proving fabrication, as opposed to merely suspecting it, demands careful, time-consuming investigation that editorial teams must resource.
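The point that signals are probabilistic and need human adjudication can be sketched as a triage step that only ever routes a manuscript to a person and never rejects it. Everything here is an assumption for illustration: the signal names, the weights, the threshold, and the choice of a noisy-OR combination, which treats each weight as an independent probability that the signal reflects a real problem.

```python
from dataclasses import dataclass

# Hypothetical weights: assumed chance that a fired signal reflects a real problem.
SIGNAL_WEIGHTS = {
    "tortured_phrases": 0.4,
    "image_duplication": 0.5,
    "citation_anomaly": 0.2,
    "non_institutional_email": 0.1,
}


@dataclass
class Triage:
    score: float   # combined suspicion score in [0, 1]
    route: str     # "human_review" or "standard_workflow"; never a rejection
    reasons: list  # names of the signals that fired, for the reviewer


def triage(signals: dict, review_threshold: float = 0.3) -> Triage:
    """Combine fired signals with a noisy-OR and choose a workflow route.

    `signals` maps a signal name to True or False. Unknown names are ignored.
    """
    reasons = sorted(
        name for name, fired in signals.items()
        if fired and name in SIGNAL_WEIGHTS
    )
    # Noisy-OR: one minus the probability that every fired signal is innocent.
    all_innocent = 1.0
    for name in reasons:
        all_innocent *= 1.0 - SIGNAL_WEIGHTS[name]
    score = 1.0 - all_innocent
    route = "human_review" if score >= review_threshold else "standard_workflow"
    return Triage(score=score, route=route, reasons=reasons)
```

The design choice that matters is the absence of a "reject" route. The score decides how much editorial attention a manuscript receives, and the recorded reasons give the reviewer a starting point, but the judgement stays with a person.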
The deeper driver is incentive. As long as careers, promotions and funding hinge on publication counts, demand for fabricated output persists, and supply rises to meet it. This is why integrity work connects to the broader reform of research assessment: reducing the reliance on raw publication numbers attacks the demand side of the market. Tools such as CRediT contributor roles, by making genuine contribution explicit and attributable, raise the cost of selling authorship to people who did nothing, and shared definitions of the kind found in the CASRAI data dictionary help systems exchange the integrity signals that detection depends on. The fight against paper mills is therefore part technical, part editorial and part cultural, and it will not be won by detection alone.







