Definition · Plain-language
Data integrity
Data integrity is the degree to which data are complete, consistent, accurate and trustworthy throughout their entire lifecycle, from generation through processing and reporting to archiving.
Why data integrity is foundational
Regulators do not see a medicine being made; they see the records that describe how it was made and tested. If those records cannot be trusted, the regulatory decision built on them cannot be trusted either. Data integrity is therefore foundational to GxP: it is the property that allows authorities, and patients, to rely on the evidence of quality. A serious data-integrity failure, such as falsified results, missing records or manipulated audit trails, can invalidate an entire submission, irrespective of whether the physical product was actually sound.
The data lifecycle
Data integrity must be maintained across the whole data lifecycle, not just at the moment of capture. The lifecycle spans generation and recording, processing and transformation, review and reporting, and finally retention, retrieval and eventual disposal. Risks differ at each stage: original data may be lost on capture, altered during processing, or become unreadable in storage. Strong data governance addresses every stage, ensuring original records are preserved, changes are tracked through audit trails, and data remain available and legible for as long as required.
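As an illustration of how changes can be tracked so that later tampering becomes detectable, the following is a minimal sketch of an append-only, hash-chained audit trail. It is not taken from any regulatory guidance or specific system; the class names (AuditEntry, AuditTrail), the field names and the choice of SHA-256 chaining are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AuditEntry:
    """One immutable audit-trail record (hypothetical field layout)."""
    user: str                  # who made the change
    timestamp: str             # when, as an ISO 8601 string
    action: str                # e.g. "create" or "amend"
    old_value: Optional[str]   # value before the change, kept for traceability
    new_value: Optional[str]   # value after the change
    prev_hash: str             # hash of the preceding entry
    entry_hash: str            # hash over this entry's content plus prev_hash


def _digest(user, timestamp, action, old_value, new_value, prev_hash) -> str:
    # Serialise the fields deterministically, then hash them.
    payload = json.dumps([user, timestamp, action, old_value, new_value, prev_hash])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditTrail:
    """Append-only log: entries can be added and read, never edited in place."""

    GENESIS = "0" * 64  # placeholder hash used before the first entry

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def record(self, user, timestamp, action, old_value, new_value) -> AuditEntry:
        prev_hash = self._entries[-1].entry_hash if self._entries else self.GENESIS
        entry = AuditEntry(
            user, timestamp, action, old_value, new_value, prev_hash,
            _digest(user, timestamp, action, old_value, new_value, prev_hash),
        )
        self._entries.append(entry)
        return entry

    def entries(self) -> Tuple[AuditEntry, ...]:
        # Return a read-only copy so callers cannot modify the log directly.
        return tuple(self._entries)

    def verify(self) -> bool:
        """Recompute every hash; any altered or removed entry breaks the chain."""
        prev_hash = self.GENESIS
        for e in self._entries:
            expected = _digest(e.user, e.timestamp, e.action,
                               e.old_value, e.new_value, prev_hash)
            if e.prev_hash != prev_hash or e.entry_hash != expected:
                return False
            prev_hash = e.entry_hash
        return True
```

In this sketch an amendment is stored as a new entry that keeps the old value alongside the new one, so the original record is preserved rather than overwritten. A real system would also need access control, trusted time sources and durable storage, none of which are modelled here.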
Common data-integrity failures
Inspection findings reveal recurring failure modes: shared login accounts that break attribution, the ability to overwrite or delete original data, audit trails that are disabled or not reviewed, recording results on scrap paper and transcribing only favourable ones, and back-dating entries. Many failures are organisational rather than technical, arising from pressure, poor system design or weak oversight. This is why guidance from the MHRA, WHO and PIC/S stresses data-governance culture and management responsibility alongside technical controls.
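Some of the failure modes listed above leave traces that a routine audit-trail review could look for. The sketch below shows one hypothetical way to flag them. The record layout (user, action, entered_at, system_time), the list of generic account names and the five-minute tolerance are assumptions chosen for illustration, not values drawn from MHRA, WHO or PIC/S guidance.

```python
from datetime import datetime, timedelta

# Hypothetical names that suggest an account is shared rather than personal.
GENERIC_ACCOUNTS = {"admin", "lab", "analyst", "qc"}

# Hypothetical tolerance between the time a user claims and the system clock.
BACKDATE_TOLERANCE = timedelta(minutes=5)


def flag_entries(entries):
    """Return (index, reason) pairs for audit-trail records that merit review.

    Each record is a dict with the assumed keys "user", "action",
    "entered_at" (time claimed by the user) and "system_time"
    (time stamped by the system), both as ISO 8601 strings.
    """
    findings = []
    for i, e in enumerate(entries):
        # Shared logins break attribution: the record cannot be tied to a person.
        if e["user"].lower() in GENERIC_ACCOUNTS:
            findings.append((i, "generic or shared account"))

        # Original data should be preserved, not deleted or overwritten.
        if e["action"] in {"delete", "overwrite"}:
            findings.append((i, "original data deleted or overwritten"))

        # A claimed time well before the system time may indicate back-dating.
        entered = datetime.fromisoformat(e["entered_at"])
        recorded = datetime.fromisoformat(e["system_time"])
        if recorded - entered > BACKDATE_TOLERANCE:
            findings.append((i, "possible back-dating"))
    return findings
```

A check like this only surfaces candidates for human review. It cannot detect the organisational causes described above, such as pressure or weak oversight, and it cannot see results written on scrap paper that never reached the system.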
Key facts
At a glance
- Definition: degree to which data are complete, consistent, accurate and trustworthy across their lifecycle
- Why it matters: regulatory decisions rest entirely on trustworthy data
- Assessed against: the ALCOA / ALCOA+ principles
- Lifecycle stages: generation, processing, reporting, retention and disposal
- Key guidance: MHRA, WHO, PIC/S and FDA data-integrity guidance
- Common failures: shared logins, disabled audit trails, selective recording
Common misconceptions
What people often get wrong
Often heard: Data integrity is only about preventing deliberate fraud.
Actually: Fraud is one risk, but many data-integrity failures are unintentional — shared logins, unreviewed audit trails, lost original records or poor system design. Guidance therefore stresses data-governance culture and controls across the whole lifecycle, not just anti-fraud measures.
Often heard: If the final report is accurate, data integrity is satisfied.
Actually: Data integrity must hold across the entire lifecycle, including original capture, processing and retention. An accurate-looking report built on selectively transcribed or altered source data still represents a serious integrity failure.
Often heard: Data integrity is a purely technical, IT problem.
Actually: Technical controls matter, but many failures are organisational — pressure, weak oversight, poor practices. Effective data integrity depends on management responsibility and a sound data-governance culture as much as on systems.