Tag: self-plagiarism checker

  • Plagiarism Detection: iThenticate vs Turnitin

    Plagiarism detection software for research integrity offices splits into two distinct product lines from the same corporate family: iThenticate, built for pre-publication manuscript and dissertation screening against scholarly literature, and Turnitin, built for coursework screening against a global repository of student papers. Choosing between them depends on what is being screened (a manuscript bound for a journal, or a student thesis), not on which tool is marketed more heavily.

    Plagiarism detection software is a text-similarity system that compares a submitted document against a reference database — web pages, journal articles, or previously submitted papers — and returns a similarity report for human review, not an automated verdict of misconduct.
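
    To make the idea of a similarity report concrete, the sketch below compares a submission with a small reference set using overlapping word 5-grams ("shingles"). It is a generic illustration only: the function names, the shingle size and the sample texts are assumptions, and neither iThenticate nor Turnitin documents its matching engine in this form.

```python
import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Texts shorter than `size` words yield an empty set.
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity_report(
    submission: str, references: dict[str, str], size: int = 5
) -> dict[str, float]:
    """Share of the submission's shingles found in each reference (0.0 to 1.0)."""
    sub = shingles(submission, size)
    if not sub:
        return {name: 0.0 for name in references}
    return {
        name: len(sub & shingles(text, size)) / len(sub)
        for name, text in references.items()
    }


if __name__ == "__main__":
    # Hypothetical sample texts, for illustration only.
    refs = {
        "prior_paper": "samples were centrifuged at room temperature for ten minutes",
        "web_page": "an unrelated page about campus parking rules and permits",
    }
    draft = "all samples were centrifuged at room temperature for ten minutes before use"
    for source, share in similarity_report(draft, refs).items():
        print(f"{source}: {share:.0%} of submission shingles matched")
```

    The result is a per-source overlap figure for a reviewer to inspect, which is why the output is framed as a report rather than a verdict.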

    What are iThenticate and Turnitin, and how do they differ?

    iThenticate and Turnitin are sibling products of Turnitin, LLC (formerly iParadigms), sharing an underlying text-similarity engine but built for different markets. iThenticate targets researchers, faculty and publishers screening manuscripts, theses and grant applications before submission or publication, while Turnitin targets instructors screening student coursework, typically through a learning-management-system integration such as Canvas or Blackboard.

    Turnitin was acquired by Advance Publications, the media group that also owns Condé Nast, in 2019 for a reported $1.75 billion — a detail worth knowing because both product lines now sit under one commercial parent, which shapes how licensing bundles and feature roadmaps (including AI-writing detection) are rolled out across the two tools.

    How does grey-literature and database coverage compare?

    Coverage is the single biggest practical differentiator for a research integrity office, and it is where most consumer-facing “best plagiarism checker” roundups say nothing useful, because they compare tools built for essays, not manuscripts.

    iThenticate’s index is weighted toward licensed scholarly publisher content rather than student submissions. A large share of that access runs through Crossref’s Similarity Check service (originally launched as CrossCheck in 2008), which lets participating publishers cross-reference manuscripts against one another’s published content using the iThenticate engine. Turnitin’s index, by contrast, is anchored by its own repository of previously submitted student papers, built up over two decades of institutional use — a strength for catching student-to-student collusion, but a weaker signal for detecting overlap with the peer-reviewed literature.

    Neither tool has comprehensive built-in coverage of grey literature — preprint servers such as arXiv, bioRxiv and SSRN, institutional repositories, conference proceedings, or non-English regional journals — by default. Research integrity offices handling multidisciplinary manuscripts should treat both as a first-pass screen and budget for supplementary manual searches for grey-literature-heavy submissions, particularly in physics, computer science and economics, where preprint-first publishing norms are strongest.

    Factor | iThenticate | Turnitin
    Primary audience | Researchers, faculty, publishers | Instructors, students
    Core database strength | Scholarly publisher content via Crossref Similarity Check | Global repository of student-submitted papers
    Typical workflow entry point | Stand-alone web app or publisher submission system | LMS integration (Canvas, Blackboard, Moodle)
    Submission repository add-back | Private, user-managed folders by default | Papers commonly added to the global student repository
    AI-writing detection | Added with iThenticate 2.0 (2024) | Live since April 2023
    Grey literature / preprint coverage | Limited; not comprehensive by default | Limited; not comprehensive by default

    How should offices handle false positives and similarity scores?

    A high similarity score is not, by itself, evidence of plagiarism, and treating it as one is the most common misuse of these tools by inexperienced reviewers.

    Both engines flag methods sections, standard nomenclature, ethics-declaration boilerplate, direct quotations and reference lists as “matches” even when correctly cited — a false-positive pattern that is worse in STEM disciplines with formulaic methods language. The Committee on Publication Ethics (COPE) has published discussion guidance on text recycling warning that similarity percentages must be interpreted by a human reviewer against citation context, not treated as an automated pass/fail gate. The ICMJE Recommendations similarly treat overlapping and duplicate publication as an editorial judgement matter, not a software output.

    Practical guardrails research integrity offices commonly apply:

    • Exclude quotations and bibliography from the headline similarity score, and review flagged matches individually rather than acting on the aggregate percentage.
    • Use a similarity band (many institutions apply an initial screening range around 15–20% overall similarity, excluding quotes and references) purely as a triage trigger for closer human review — never as an automatic misconduct threshold.
    • Distinguish self-plagiarism (recycled text from a researcher’s own prior publications) from third-party plagiarism; the two require different institutional responses and different policy citations.
    • Route AI-writing-detection flags through the same human-review step as similarity flags — both tools’ AI detectors are probabilistic classifiers, not proof of misconduct, and both vendors publish accuracy caveats for their own models.
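
    The guardrails above can be sketched as a small triage helper. This is a hypothetical illustration, not either vendor's report format: the `Match` fields, the 15% default and the choice to measure overlap against the text left after exclusions are all assumptions, and the word counts are treated as non-overlapping for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Match:
    source: str           # where the overlapping text was found
    words: int            # number of matched words (assumed non-overlapping)
    kind: str             # "body", "quotation" or "bibliography"
    own_prior_work: bool  # True if the source is the author's own earlier output


def triage(matches: list[Match], total_words: int, review_from: float = 15.0) -> dict:
    """Summarise matches for a human reviewer; never returns a misconduct verdict."""
    counted = [m for m in matches if m.kind == "body"]
    excluded_words = sum(m.words for m in matches if m.kind != "body")
    # Assumption: overlap is expressed against the text remaining after
    # quotations and bibliography have been excluded.
    base = max(total_words - excluded_words, 1)
    percent = 100.0 * sum(m.words for m in counted) / base
    return {
        "similarity_percent": round(percent, 1),
        # A triage trigger only: True means "look closer", not "misconduct".
        "needs_human_review": percent >= review_from,
        "self_overlap_sources": [m.source for m in counted if m.own_prior_work],
        "third_party_sources": [m.source for m in counted if not m.own_prior_work],
    }


if __name__ == "__main__":
    # Hypothetical report data, for illustration only.
    report = [
        Match("Journal article A", 300, "body", False),
        Match("Author's 2021 paper", 200, "body", True),
        Match("Quoted passage", 500, "quotation", False),
        Match("Reference list", 500, "bibliography", False),
    ]
    print(triage(report, total_words=5000))
```

    Keeping self-overlap and third-party sources in separate lists reflects the point that the two call for different institutional responses.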

    What are the institutional licensing differences?

    Licensing is negotiated at institutional or publisher level for both tools, not purchased per document — a common procurement mistake is assuming individual-researcher pricing applies.

    iThenticate is typically licensed directly to a research office, graduate school or publisher, with API integration into manuscript-submission platforms (such as Editorial Manager or ScholarOne) the standard deployment pattern for journals and university presses. Turnitin is typically licensed at whole-institution level through the teaching and learning technology budget and bundled with the LMS, so a research integrity office wanting Turnitin for thesis screening often negotiates access through that existing contract rather than procuring it independently. Offices should confirm whether AI-writing detection, added by both vendors as a distinct module rather than a default feature, is included in the base licence or billed separately.

    Which tool should a research integrity office choose?

    For pre-publication manuscript, dissertation and grant-application screening, iThenticate is the better-fitted tool: its scholarly-publisher-weighted database and default private-folder handling protect the confidentiality of unpublished work, which matters when a submission may still be under active peer review elsewhere.

    For undergraduate and taught-postgraduate thesis screening, where the goal includes both integrity checking and student education, Turnitin’s LMS-integrated workflow and student-paper repository are the better fit, provided the office has a clear policy on whether submissions are added to the global repository — a live consideration for a thesis that a student may later adapt into a journal article.

    Many research-intensive institutions run both: iThenticate for faculty and doctoral output heading toward publication, Turnitin for coursework and taught-programme theses, coordinated through the same research integrity office policy rather than treated as competing tools solving the same problem.

    Common questions about iThenticate and Turnitin

    What is the best plagiarism detection software for research integrity offices?

    There is no single “best” tool; the right choice depends on document type. iThenticate is better suited to pre-publication manuscripts, dissertations and grant applications because of its scholarly-database weighting and confidential handling. Turnitin is better suited to coursework and taught theses because of its LMS integration and student-paper repository.

    Which software can detect plagiarism in grey literature?

    Neither iThenticate nor Turnitin offers comprehensive built-in coverage of preprints, institutional repositories or conference proceedings. Research integrity offices reviewing grey-literature-heavy submissions should supplement automated screening with manual searches of preprint servers such as arXiv, bioRxiv and SSRN, particularly in disciplines with strong preprint-first publishing norms.

    Does Turnitin detect AI writing?

    Yes. Turnitin’s AI writing detection feature has been live since April 2023 and is integrated into its standard similarity report. iThenticate gained an equivalent capability with the iThenticate 2.0 platform release in 2024. Both vendors publish caveats that AI-detection scores are probabilistic, not definitive proof of AI authorship.

    Can I check Turnitin submissions for free?

    No. Turnitin does not offer a free public checking tier; access requires an institutional licence, typically bundled into an institution’s learning-management-system contract. Individual researchers or offices without an existing institutional subscription cannot submit documents to Turnitin directly.

    Implications for research integrity offices

    The practical decision is a policy question before it is a procurement one: an office needs a written position on similarity-score thresholds, self-plagiarism handling, repository add-back consent, and AI-detection escalation before either tool is deployed at scale. Without one, reviewers apply inconsistent judgement to functionally identical reports. Coordinating with the institution’s research administration function on licensing, and aligning with its authorship-dispute policy where overlap flags intersect with contested co-authorship claims, keeps the tool’s output anchored to institutional policy rather than treated as a standalone verdict.

    As both vendors extend AI-writing detection and publishers expand Crossref Similarity Check participation, the coverage gap between “student work” and “scholarly literature” databases is likely to narrow — but grey literature will remain the persistent blind spot for the foreseeable future, and no procurement decision should assume otherwise.

  • Self-Plagiarism Policy: A Research Office Guide

    Self-plagiarism is the reuse of an author’s own previously published words, data or findings in a new submission without disclosure, and duplicate publication is its most serious form: submitting substantially the same paper to a second journal or assessment exercise as if it were original. Research offices need a written policy because UK universities define and detect the practice differently from how journals, guided by the Committee on Publication Ethics (COPE), handle it after publication.

    Self-plagiarism is the reuse or recycling of one’s own previously disseminated text, data or ideas without acknowledging their prior publication, a definition close to that used by the University of Glasgow’s postgraduate research code of practice. It is not theft of someone else’s intellectual property, but it can mislead readers about the novelty of the work and, where copyright has passed to a publisher, breach that publisher’s rights.

    What is self-plagiarism and how does it differ from duplicate publication?

    Self-plagiarism covers several distinct behaviours, and a policy that treats them as one problem will under- or over-punish researchers. Miguel Roig’s widely cited taxonomy, referenced in University of Glasgow guidance, separates duplicate publication (the same paper submitted to two venues), redundant publication (substantial overlap with limited new content), augmented publication or “meat extension” (a small data addition dressed up as a new study) and segmented publication, also called salami-slicing, where one dataset is fragmented into several minimal papers.

    • Text recycling — reusing methods or literature-review paragraphs verbatim across papers, generally treated less severely if disclosed.
    • Duplicate (redundant) publication — near-identical results, data and conclusions appearing in two outlets without cross-reference.
    • Salami slicing — splitting one study into the minimum publishable unit to inflate an author’s output count.
    • Double-dipping — submitting one piece of coursework for credit in two modules or degrees, the most common student-facing case.

    The common thread across all four is the absence of a clear statement telling the reader, editor or examiner that the material has appeared before.

    How do UK universities define and detect self-plagiarism?

    UK institutions typically fold self-plagiarism into their academic integrity or research misconduct code rather than issuing a stand-alone policy. The University of Glasgow’s postgraduate research code of practice describes it as occurring when an author republishes “a work in its entirety or reuses portions of a previously written text while authoring a new work,” and explicitly flags the copyright-infringement risk. First-pass detection at the institutional level is almost always automated.

    Text-matching software, principally Turnitin, can retain previous submissions from students and researchers at a subscribing institution, depending on that institution’s repository settings, so recycled coursework, theses or preprints can be matched even when no external source is involved. This is a structural difference from journal-side detection, which typically relies on Crossref Similarity Check (iThenticate) comparisons against the published literature rather than an institution’s own submission archive. As a result, a student’s unpublished prior assignment may be invisible to a journal but immediately flagged by a university’s own repository.

    Institutional handling also sits inside a wider governance structure: UK signatories to the Concordat to Support Research Integrity, coordinated by Universities UK, commit to transparent, proportionate misconduct procedures that must cover publication practices, including duplicate and redundant publication, not only fabrication and falsification.

    Where do COPE rules diverge from institutional academic-integrity rules?

    Journals and universities are answering different questions, which is why the same manuscript can trigger two separate, non-identical processes. A journal editor following COPE guidance is asking whether the scholarly record needs correcting; a research office is asking whether an individual has breached a code of conduct.

    Dimension | COPE / journal response | Institutional / research office response
    Primary concern | Integrity of the published record and reader transparency | Conduct of the individual against the institution’s code
    Typical trigger | Editor or reviewer recognises overlapping text or data post-submission | Text-matching software flags a submission at intake (thesis, assignment, grant report)
    Guidance used | COPE flowcharts on suspected redundant (duplicate) publication; ICMJE recommendations on overlapping publications | Institutional academic integrity policy, often referencing the UK Concordat to Support Research Integrity
    Possible outcomes | Correction, expression of concern, or retraction; author notified via publisher | Formal warning, mark penalty, mandatory training, or referral to a misconduct panel
    Who acts | Journal editor, in consultation with the publisher and, where relevant, the author’s institution | Research integrity officer or academic conduct committee

    COPE’s own case files show the practical effect of this divide: its published discussion of a self-plagiarism case notes forum members expressing sympathy for an author while still requiring correction of the record, because the editorial remedy (correcting the record for readers) is independent of any judgement about the author’s intent. A research office cannot outsource its own disciplinary decision to a journal’s correction, and a journal cannot substitute for an institution’s misconduct process. Each must run its own track, and a policy that assumes one covers the other will leave gaps.

    What should a research office’s self-plagiarism policy include?

    A workable policy needs to do more than restate the definition. Based on the divergence set out above, a research office policy should:

    1. Define self-plagiarism with named sub-categories (duplicate, redundant, augmented, segmented publication) rather than a single umbrella term, so cases are classified consistently.
    2. State explicit exceptions — conference-to-journal expansion, translations for non-English audiences, and plain-language summaries — provided each carries a disclosure statement to editors and, where relevant, examiners.
    3. Set out the detection method (which text-matching tool, what similarity threshold triggers review) so staff and students know how matches are surfaced.
    4. Separate the reporting line for student cases (module leader or academic conduct office) from staff/researcher cases (research integrity officer), since the applicable code differs.
    5. Require researchers to disclose prior publication of overlapping material to editors at submission, mirroring ICMJE’s overlapping-publication recommendations, rather than leaving disclosure to be discovered.
    6. Reference COPE’s flowcharts on suspected redundant publication as the institution’s expected response when a journal makes contact about one of its authors.

    Policies that omit the disclosure-based exceptions tend to produce the most complaints, because researchers legitimately reworking a conference paper, a thesis chapter or a policy briefing for a new audience are treated identically to authors concealing duplicate results.

    Common questions about self-plagiarism

    What is an example of self-plagiarism?

    A common example is submitting the same coursework essay for credit in two different modules, or publishing a near-identical dataset and discussion in two journals without cross-referencing the earlier paper. Both withhold from the reader or marker that the work has been previously submitted or published.

    Will Turnitin detect self-plagiarism?

    Turnitin can detect self-plagiarism where a subscribing institution’s settings have stored the earlier submission in its repository. A student’s own earlier assignment, thesis draft or conference paper will then typically be flagged as a text match, even where no external plagiarism has occurred.

    What happens if a researcher is found to have self-plagiarised?

    Outcomes depend on which track applies. Journals follow COPE guidance and may issue a correction, expression of concern or retraction; universities apply their own academic integrity code, which can range from a formal warning to referral to a research misconduct panel for staff, or a failed module for students.

    How can researchers avoid self-plagiarism when reusing their own work?

    Researchers should cite their own prior work as they would any other source, disclose overlapping material to editors in a cover letter, and obtain explicit permission from co-authors, supervisors or publishers before reusing substantial text, data or figures in a new submission.

    Implications for research offices

    The gap between journal-side and institution-side handling is not a loophole; it is two accountability systems answering different questions about the same document. A research office that documents this distinction explicitly — rather than assuming a COPE correction closes the institutional file — will resolve cases faster and more consistently.

    As text-matching tools extend coverage to preprint servers, thesis repositories and grant reports, more duplicate-publication cases will surface at intake rather than after publication. Institutions that name detection thresholds, disclosure exceptions and dual reporting lines now will handle that shift better than those relying on a general-purpose plagiarism clause.

    For related institutional context, see CASRAI’s overview of research administration practice and its explainer on authorship criteria, which intersects with duplicate-publication disputes over who is entitled to reuse shared material.