The Data Documentation Initiative: Definition, Meaning & Examples | CASRAI
The Data Documentation Initiative (DDI) is an international metadata standard for documenting data from the social, behavioural, and economic sciences. It provides a structured, XML-based way to describe surveys, microdata, and other research datasets in rich detail, including at the level of individual variables. DDI comes in two main flavours — DDI Codebook and DDI Lifecycle — and is widely used by major data archives such as the UK Data Service, ICPSR, GESIS, and the CESSDA network. Compared with simpler standards such as Dublin Core, DDI offers far greater depth for survey and microdata documentation.
What DDI is and what it documents
The Data Documentation Initiative is an international standard for the metadata that describes research data in the social, behavioural, and economic sciences. Its purpose is to enable datasets — particularly surveys and other microdata — to be discovered, understood, used correctly, and preserved over time. A distinguishing strength of DDI is variable-level documentation. Beyond describing a dataset as a whole, DDI can document each variable in detail: the question that produced it, its response categories and codes, value labels, and summary statistics. This granularity is essential for secondary analysts who need to know precisely what each field means before reusing the data.
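To make variable-level documentation concrete, here is a minimal sketch that parses a simplified fragment modelled on DDI Codebook element names (`var`, `labl`, `qstn`/`qstnLit`, `catgry`/`catValu`, `sumStat`). The variable, its question text, and the counts are invented for illustration; the fragment omits the XML namespace and most surrounding structure, so it should not be read as a schema-valid DDI instance.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment modelled on DDI Codebook element names.
# The namespace and most required structure are omitted; it is not schema-valid.
DDI_FRAGMENT = """
<codeBook>
  <dataDscr>
    <var ID="V12" name="JOBSAT">
      <labl>Job satisfaction</labl>
      <qstn>
        <qstnLit>How satisfied are you with your current job?</qstnLit>
      </qstn>
      <catgry><catValu>1</catValu><labl>Satisfied</labl></catgry>
      <catgry><catValu>2</catValu><labl>Dissatisfied</labl></catgry>
      <catgry missing="Y"><catValu>9</catValu><labl>No answer</labl></catgry>
      <sumStat type="vald">1480</sumStat>
      <sumStat type="invd">20</sumStat>
    </var>
  </dataDscr>
</codeBook>
"""


def describe_variables(xml_text):
    """Return one dict per <var>: question text, codes and summary statistics."""
    root = ET.fromstring(xml_text)
    variables = []
    for var in root.iter("var"):
        # Map each response code to its value label.
        categories = {
            cat.findtext("catValu"): cat.findtext("labl")
            for cat in var.findall("catgry")
        }
        # Summary statistics are keyed by their "type" attribute.
        stats = {s.get("type"): s.text for s in var.findall("sumStat")}
        variables.append({
            "name": var.get("name"),
            "label": var.findtext("labl"),
            "question": var.findtext("qstn/qstnLit"),
            "categories": categories,
            "summary": stats,
        })
    return variables


for v in describe_variables(DDI_FRAGMENT):
    print(v["name"], "-", v["question"])
    for code, label in v["categories"].items():
        print(f"  {code} = {label}")
```

A secondary analyst reading this output can see at once which question produced the variable and what each code means, which is the information a dataset-level description alone would not provide.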
DDI Codebook and DDI Lifecycle
DDI exists in two principal forms. DDI Codebook is the more lightweight specification, suited to documenting a dataset largely at a single stage — typically capturing the contents of a study and its variables, much as a traditional study codebook would, in structured XML. DDI Lifecycle is a more comprehensive model designed to document data across the entire research life cycle, from study conception and instrument design through data collection, processing, distribution, archiving, and reuse. Lifecycle supports richer features such as reuse of metadata across studies and detailed process documentation, making it appropriate for complex, longitudinal, or repeated studies.
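The idea of reusing metadata across studies can be illustrated with a small conceptual sketch. This is not DDI Lifecycle's actual schema: the class names, fields, and identifiers below are hypothetical. It only shows the general pattern of defining a question once and referring to it by identifier from two waves of a repeated study, instead of copying its text into each wave's documentation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Question:
    """A reusable question definition (hypothetical structure)."""
    id: str
    text: str


@dataclass
class Wave:
    """One data collection wave that refers to questions by identifier."""
    name: str
    question_ids: list = field(default_factory=list)


# Define the question once in a shared registry.
registry = {
    "q-jobsat": Question("q-jobsat", "How satisfied are you with your current job?"),
}

# Both waves reference the same definition instead of duplicating its text.
waves = [
    Wave("Wave 1", ["q-jobsat"]),
    Wave("Wave 2", ["q-jobsat"]),
]


def questions_for(wave, registry):
    """Resolve a wave's question references against the shared registry."""
    return [registry[qid] for qid in wave.question_ids]


def waves_using(question_id, waves):
    """List the names of the waves that reference a given question."""
    return [w.name for w in waves if question_id in w.question_ids]


print(waves_using("q-jobsat", waves))
```

Because each wave holds a reference rather than a copy, a correction to the question text is made in one place, and it is straightforward to find every wave in which the question was asked.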
Who uses DDI
DDI is the de facto documentation standard across much of the social-science data infrastructure. Major archives and services use it to catalogue and disseminate their holdings, including the UK Data Service in the United Kingdom, ICPSR (the Inter-university Consortium for Political and Social Research) in the United States, and GESIS in Germany. It is also central to the CESSDA network — the Consortium of European Social Science Data Archives — whose members use DDI to describe and share data across countries. This shared standard supports interoperability between archives and makes cross-archive discovery and reuse of social-science data feasible.
DDI compared with Dublin Core
DDI is XML-based and is designed specifically for the depth required by survey and microdata documentation. This sets it apart from a general-purpose standard such as Dublin Core, which provides a small set of simple descriptive elements for resources of any kind. Dublin Core is excellent for lightweight, cross-domain resource description and discovery, but it cannot express the variable-level detail — question text, codes, categories, and the structure of a survey instrument — that DDI captures. For documenting social-science datasets thoroughly enough for reuse, DDI is far richer; the two standards address different needs and can be complementary, with Dublin Core supporting discovery and DDI providing the detailed documentation.
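A minimal sketch of that complementarity, assuming a hypothetical in-memory study record: the study, its field names, the mapping table, and the `to_dublin_core` helper are invented for illustration and are not an official crosswalk. Study-level fields map onto Dublin Core elements such as `title`, `creator`, and `description`, while the variable-level detail has no Dublin Core counterpart and is left out.

```python
# Hypothetical study record holding study-level and variable-level metadata.
study = {
    "title": "Example Workforce Survey",
    "author": "Example Research Institute",
    "abstract": "A survey of working conditions.",
    "variables": [
        {
            "name": "JOBSAT",
            "question": "How satisfied are you with your current job?",
            "categories": {"1": "Satisfied", "2": "Dissatisfied"},
        },
    ],
}

# Illustrative mapping from study-level fields to Dublin Core element names.
STUDY_TO_DC = {
    "title": "title",
    "author": "creator",
    "abstract": "description",
}


def to_dublin_core(record):
    """Build a flat Dublin Core style record and list the fields left unmapped."""
    dc = {STUDY_TO_DC[key]: value for key, value in record.items() if key in STUDY_TO_DC}
    unmapped = [key for key in record if key not in STUDY_TO_DC]
    return dc, unmapped


dc_record, dropped = to_dublin_core(study)
print(dc_record)
print("No Dublin Core counterpart:", dropped)
```

The flat record is enough for a catalogue to list and find the study, but the question wording and response codes survive only in the richer source record.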
Key facts
At a glance
- Full name: Data Documentation Initiative
- Domain: Social, behavioural, and economic science data
- Format: XML-based metadata standard
- Variants: DDI Codebook (point-in-time) and DDI Lifecycle (whole data life cycle)
- Granularity: Supports variable-level documentation
- Users: UK Data Service, ICPSR, GESIS, CESSDA network
Common misconceptions
What people often get wrong
Often heard: DDI and Dublin Core are interchangeable.
Actually: No — Dublin Core is a simple, general-purpose set of descriptive elements, while DDI is a rich, domain-specific standard supporting variable-level documentation of survey and microdata. DDI is far more detailed for social-science data.
Often heard: DDI only documents a dataset as a whole.
Actually: No — a key feature of DDI is variable-level documentation, describing individual variables, their question wording, codes, and categories, not just the dataset overall.
Often heard: DDI Codebook and DDI Lifecycle are the same thing.
Actually: No — Codebook is a lighter specification for documenting a dataset largely at one stage, whereas Lifecycle models the whole research data life cycle and supports more complex, reusable documentation.