Model cards are short, structured documents that report what an AI model does, how it was evaluated, and the conditions under which it should and should not be used. Together with datasheets for datasets, which document the data a model is trained and tested on, they form the backbone of responsible-AI documentation. Both were proposed to bring the same rigour to AI artefacts that established disciplines bring to materials and reagents, and both directly support reproducibility, accountability and the integrity of the research record.
Model cards (Mitchell et al. 2019)
Model cards were introduced by Mitchell and colleagues in 2019 as a framework for transparent model reporting. A model card accompanies a trained model and records, in a consistent format, the essential facts a user needs to decide whether the model is appropriate for their purpose. Crucially, model cards emphasise disaggregated evaluation: reporting performance not only in aggregate but across relevant subgroups, so that uneven performance is visible rather than hidden behind a single headline number.
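To make disaggregated evaluation concrete, here is a minimal sketch that reports accuracy both in aggregate and per subgroup. The function name `disaggregated_accuracy`, the group labels and the toy data are illustrative assumptions rather than anything prescribed by Mitchell et al., and accuracy stands in for whatever metric suits the task.

```python
from collections import defaultdict


def disaggregated_accuracy(y_true, y_pred, groups):
    """Return (overall accuracy, {group: accuracy}) for paired labels."""
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("y_true, y_pred and groups must have equal length")
    if not y_true:
        raise ValueError("at least one example is required")

    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)

    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group


# Toy, made-up labels purely for illustration (not real evaluation data).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall, per_group = disaggregated_accuracy(y_true, y_pred, groups)
print("overall:", overall)
for group, acc in sorted(per_group.items()):
    print(group, acc)
```

On this toy data the aggregate accuracy is 0.625, which hides the fact that group A scores 1.0 while group B scores only 0.25. That gap is exactly what a single headline number conceals.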
A typical model card covers model details (who built it, version, architecture), intended use and out-of-scope uses, evaluation data and metrics, performance across conditions, and ethical considerations, limitations and caveats. By stating intended and prohibited uses explicitly, a model card reduces the risk of a model being deployed in a context it was never validated for.
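The sections listed above can be captured as a simple structured record. The sketch below is one possible layout, not an official schema: the class name `ModelCard`, its field names and the example values are all hypothetical placeholders.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ModelCard:
    # Model details
    developer: str = ""
    version: str = ""
    architecture: str = ""
    # Intended and out-of-scope uses
    intended_use: str = ""
    out_of_scope_uses: list[str] = field(default_factory=list)
    # Evaluation
    evaluation_data: str = ""
    metrics: list[str] = field(default_factory=list)
    performance_by_condition: dict[str, float] = field(default_factory=dict)
    # Ethical considerations, limitations and caveats
    ethical_considerations: str = ""
    limitations: str = ""

    def missing_sections(self) -> list[str]:
        """Names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Placeholder values for illustration only; they describe no real model.
card = ModelCard(
    developer="Example Lab",
    version="1.0.0",
    architecture="logistic regression",
    intended_use="Routing English-language support tickets",
    out_of_scope_uses=["Medical or legal decisions"],
    evaluation_data="Held-out tickets (hypothetical)",
    metrics=["accuracy"],
    performance_by_condition={"short tickets": 0.9, "long tickets": 0.8},
)
print(card.missing_sections())
```

A check like `missing_sections` makes gaps visible before release: here it flags that the ethical considerations and limitations have not yet been written.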
Datasheets for datasets (Gebru et al.)
Datasheets for datasets, proposed by Gebru and colleagues, apply the same documentation philosophy to data. A datasheet answers questions about a dataset’s whole life cycle: the motivation for creating it, its composition (what the instances represent, how many, whether sensitive data is present), the collection process, any preprocessing, cleaning or labelling, intended and discouraged uses, distribution terms, and arrangements for maintenance. Because many problems in machine learning originate in the data, documenting the dataset can matter as much as, or more than, documenting the model.
| Artefact | Documents | Key contents |
|---|---|---|
| Model card | A trained model | Intended use, evaluation, disaggregated performance, limitations |
| Datasheet for datasets | A dataset | Motivation, composition, collection, preprocessing, uses, maintenance |
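The datasheet sections described above lend themselves to a fill-in-the-blanks template. The following sketch renders a plain-text datasheet and marks any section that has not been answered. The helper `render_datasheet`, the placeholder string and the example dataset are assumptions made for illustration; a real datasheet answers many detailed questions under each heading.

```python
# Section headings follow the life-cycle stages named in the text above.
DATASHEET_SECTIONS = (
    "Motivation",
    "Composition",
    "Collection process",
    "Preprocessing, cleaning and labelling",
    "Uses",
    "Distribution",
    "Maintenance",
)

PLACEHOLDER = "NOT YET DOCUMENTED"


def render_datasheet(name, answers):
    """Render a plain-text datasheet, flagging unanswered sections."""
    unknown = set(answers) - set(DATASHEET_SECTIONS)
    if unknown:
        raise ValueError(f"unknown sections: {sorted(unknown)}")

    lines = [f"Datasheet: {name}"]
    for section in DATASHEET_SECTIONS:
        lines.append("")
        lines.append(section)
        lines.append(answers.get(section, PLACEHOLDER))
    return "\n".join(lines)


# Hypothetical dataset and answers, for illustration only.
sheet = render_datasheet(
    "example-tickets",
    {
        "Motivation": "Created to study ticket routing.",
        "Composition": "One row per ticket; no personal data retained.",
    },
)
print(sheet)
```

Keeping unanswered sections in the output, rather than dropping them, means a reader can see what is still undocumented.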
How they support reproducibility and accountability
Documentation turns an opaque artefact into an auditable one. A model card tells a future researcher exactly which model version and evaluation protocol produced a published result, while a datasheet records the data provenance needed to interpret or rebuild that result. This is the documentation layer that complements the engineering practices in our guide to reproducibility of machine learning research: code and seeds make a result re-runnable, while cards and datasheets make it interpretable and accountable.
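One way to tie a published result to the exact model version, evaluation protocol and data behind it is to store a small provenance record alongside the card and datasheet. The sketch below is an assumption about how that might look, not a practice the source prescribes: the function `provenance_record`, its field names and the use of a SHA-256 checksum to fingerprint the data are all illustrative choices.

```python
import hashlib
import json


def provenance_record(model_version, evaluation_protocol, dataset_bytes):
    """Bundle the facts needed to trace a result back to its inputs."""
    return {
        "model_version": model_version,
        "evaluation_protocol": evaluation_protocol,
        # A checksum lets a later reader confirm they hold the same data.
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
    }


# Hypothetical inputs; in practice the bytes would be read from the data file.
record = provenance_record(
    model_version="1.0.0",
    evaluation_protocol="held-out test split, accuracy",
    dataset_bytes=b"id,label\n1,0\n2,1\n",
)
print(json.dumps(record, indent=2, sort_keys=True))
```

If the dataset changes by even one byte the checksum changes too, so a mismatch signals that a result is being compared against different data.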
These artefacts also support the broader disclosure expectations now common in scholarly publishing. When generative AI features in a study, documenting the model and its data complements the editorial requirements covered in our explainer on generative AI and research disclosure norms and across our GenAI disclosure coverage.
Embedding documentation in the research record
For documentation to be useful it must be findable and citable as part of the scholarly record, not buried in a code repository. Treating model cards and datasheets as first-class research outputs supports proper credit assignment through frameworks such as CRediT and consistent description through the casrai.org research dictionary. Doing so recognises the substantial work of data curation and evaluation that these documents describe.
Frequently asked questions
What is a model card?
A model card is a structured document, proposed by Mitchell et al. in 2019, that reports an AI model’s intended use, evaluation results (including across subgroups), limitations and ethical considerations, so users can judge whether it suits their purpose.
What is a datasheet for datasets?
A datasheet, proposed by Gebru et al., documents a dataset’s motivation, composition, collection and preprocessing, intended uses and maintenance, capturing the data provenance needed to interpret or reproduce results.
How do model cards differ from datasheets?
Model cards document a trained model; datasheets document the dataset behind it. Used together, they describe both the artefact and the data that shaped it.
Why does AI documentation matter for reproducibility?
It records which model version, evaluation protocol and data produced a result, turning an opaque artefact into an auditable one that others can interpret, scrutinise and rebuild.