v2026.1714 entries · CC-BY 4.0
CASRAI

Definition · Plain-language

Data quality

Data quality is the degree to which data is fit for its intended purpose, assessed across dimensions such as accuracy, completeness, consistency, timeliness, validity and uniqueness.

The dimensions of data quality

Data quality is conventionally broken into measurable dimensions. Accuracy is whether data correctly describes the real world; completeness is whether required values are present; consistency is whether data agrees across systems; timeliness is whether it is current enough for its use; validity is whether it conforms to defined formats and rules; and uniqueness is whether the same entity is recorded only once. Assessing data against these dimensions turns a vague notion of good data into something that can be measured and improved.
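As a rough illustration of how three of these dimensions can be turned into numbers, the sketch below scores a small set of records for completeness, validity and uniqueness. The sample records, field names, email pattern and function names are all assumptions made for the example, not part of any standard; accuracy, consistency and timeliness need external reference data or timestamps and are left out.

```python
import re

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "email": "ana@example.org", "country": "CA"},
    {"id": 2, "email": "", "country": "GB"},
    {"id": 3, "email": "not-an-email", "country": "US"},
    {"id": 1, "email": "ana@example.org", "country": "CA"},  # duplicate id
]

# Deliberately simple format rule (an assumption, not a full email validator).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is present and non-empty (rows must be non-empty)."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def validity(rows, field, pattern):
    """Share of non-empty values that match the given format rule."""
    values = [r[field] for r in rows if r.get(field) not in (None, "")]
    if not values:
        return 1.0  # nothing to validate
    return sum(1 for v in values if pattern.match(v)) / len(values)


def uniqueness(rows, key):
    """Share of rows whose key value is distinct (rows must be non-empty)."""
    return len({r[key] for r in rows}) / len(rows)


scores = {
    "completeness": completeness(records, "email"),
    "validity": validity(records, "email", EMAIL_RE),
    "uniqueness": uniqueness(records, "id"),
}
print(scores)
```

Each score is a ratio between 0 and 1, which makes it straightforward to compare against an agreed threshold or to track over time.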

Fitness for purpose

A defining principle of data quality is that it is relative to use. The same dataset may be perfectly adequate for a high-level trend report yet unfit for individual customer billing. Quality is therefore judged against the requirements of the task, not against a single absolute standard. This is why data quality work begins with agreeing on the requirements and acceptable thresholds for each use, then measuring how well the data meets them. That work depends on clear definitions from a data dictionary.
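A minimal sketch of this idea, assuming hypothetical scores and thresholds: the same measured dataset is checked against two sets of requirements, one per use. The dimension scores, use names and threshold values below are invented for illustration only.

```python
# Hypothetical measured scores for one dataset (0.0 to 1.0).
measured = {"completeness": 0.92, "accuracy": 0.97, "timeliness": 0.80}

# Hypothetical minimum thresholds agreed for each intended use.
requirements = {
    "trend_report": {"completeness": 0.90, "accuracy": 0.90},
    "customer_billing": {
        "completeness": 0.999,
        "accuracy": 0.999,
        "timeliness": 0.95,
    },
}


def unmet(measured_scores, thresholds):
    """Return the dimensions whose score falls below the agreed threshold."""
    return sorted(
        dim
        for dim, minimum in thresholds.items()
        if measured_scores.get(dim, 0.0) < minimum  # unmeasured counts as unmet
    )


# Map each use to the list of dimensions that fail its requirements.
fitness = {use: unmet(measured, t) for use, t in requirements.items()}

for use, gaps in fitness.items():
    verdict = "fit for purpose" if not gaps else "unfit: " + ", ".join(gaps)
    print(f"{use}: {verdict}")
```

Under these assumed numbers the dataset passes for the trend report and fails every billing threshold, even though nothing about the data itself has changed.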

Managing data quality

Sustaining data quality is a continuous discipline, not a one-off cleanse. It involves profiling data to find issues, defining rules and thresholds, monitoring quality over time and remediating problems at source where possible. Data stewards typically own quality for their domain, supported by governance policies and quality dashboards. Because errors often originate upstream, lasting improvement comes from fixing root causes in how data is captured, rather than repeatedly cleaning the same downstream defects.
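One way to connect rule checks to root causes is to count failures by the system that captured each record. The sketch below assumes every record carries a hypothetical `source` field; the rule, field names and sample data are illustrative, not a prescribed method.

```python
from collections import Counter

# Hypothetical records tagged with the system that captured them.
records = [
    {"id": 1, "source": "web_form", "postcode": "SW1A 1AA"},
    {"id": 2, "source": "call_centre", "postcode": ""},
    {"id": 3, "source": "call_centre", "postcode": None},
    {"id": 4, "source": "web_form", "postcode": "EH1 1YZ"},
    {"id": 5, "source": "call_centre", "postcode": "M1 1AE"},
]


def postcode_present(record):
    """Rule: the postcode field must be present and non-empty."""
    return bool(record.get("postcode"))


# Named rules; more checks can be added to this mapping.
rules = {"postcode_present": postcode_present}


def failures_by_source(rows, rule_set):
    """Count rule failures per (source, rule) pair."""
    counts = Counter()
    for row in rows:
        for name, check in rule_set.items():
            if not check(row):
                counts[(row["source"], name)] += 1
    return counts


failures = failures_by_source(records, rules)

# The most frequent (source, rule) pair points at where to investigate first.
worst_source, worst_rule = failures.most_common(1)[0][0]
print(f"Most failures: {worst_rule} at {worst_source}")
```

Run on a schedule, counts like these show whether a fix at the point of capture actually reduced the defect rate, rather than only how many records were cleaned downstream.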

Key facts

At a glance

  • Definition: the degree to which data is fit for its intended purpose
  • Dimensions: accuracy, completeness, consistency, timeliness, validity, uniqueness
  • Key principle: quality is relative to the intended use
  • Owned by: data stewards, under governance policy
  • Approach: profile, define rules, monitor, remediate at source
  • Related standard: ISO 8000 data quality

Common misconceptions

What people often get wrong

Often heard: Data quality just means the data has no obvious errors.

Actually: It is multidimensional. Data can be error-free yet incomplete, out of date or inconsistent across systems. Quality is assessed across several dimensions, including timeliness, consistency and uniqueness.

Often heard: There is one absolute standard of good-quality data.

Actually: Quality is fitness for purpose, so it is relative to use. Data adequate for a trend report may be unfit for billing; thresholds are set against the requirements of each task.

Often heard: A one-off data cleanse fixes data quality permanently.

Actually: Quality degrades as new data arrives and systems change. Lasting improvement requires continuous monitoring and fixing root causes at the point of capture, not repeated downstream cleansing.
