v2026.1714 entries · CC-BY 4.0
CASRAI

Definition · Plain-language

Metadata Standards

Metadata standards are agreed specifications for describing research data and other digital resources. They provide the structured vocabulary that makes content findable, interoperable and reusable by both humans and machines.

Cross-domain metadata standards

Cross-domain standards provide a shared vocabulary applicable to any type of research output. Dublin Core, maintained by the Dublin Core Metadata Initiative (DCMI), defines 15 elements (title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, rights) that can describe almost any digital resource. The DataCite Metadata Schema 4.x is the standard used by research data repositories to enable data citation; it goes beyond Dublin Core by making a persistent identifier (DOI) and a resource type mandatory, alongside creator, title, publisher and publication year, and by adding optional properties such as funding information and geolocation. The schema.org Dataset vocabulary enables web-crawlable metadata and is what Google Dataset Search reads. Metadata encoded in these standards makes research data discoverable through search engines, data repositories and aggregators such as OpenAIRE.
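As a rough illustration of how these cross-domain vocabularies relate, the sketch below takes a flat record that uses the 15 Dublin Core element names listed above and re-expresses it as a schema.org Dataset in JSON-LD. The mapping table, the function name `dc_to_jsonld` and the example record are assumptions made for this example, not an official crosswalk; schema.org also prefers structured Person or Organization objects for `creator`, which is simplified to a string here.

```python
import json

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]

# Illustrative (not official) mapping from Dublin Core elements to
# schema.org Dataset properties. Elements without a mapping are skipped.
DC_TO_SCHEMA_ORG = {
    "title": "name",
    "creator": "creator",
    "subject": "keywords",
    "description": "description",
    "publisher": "publisher",
    "date": "datePublished",
    "identifier": "identifier",
    "language": "inLanguage",
    "rights": "license",
}


def dc_to_jsonld(record: dict) -> dict:
    """Convert a flat Dublin Core record into a schema.org Dataset dict."""
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    doc = {"@context": "https://schema.org", "@type": "Dataset"}
    for element, value in record.items():
        prop = DC_TO_SCHEMA_ORG.get(element)
        if prop is not None:
            doc[prop] = value
    return doc


# Hypothetical record; every value is invented for illustration.
example = {
    "title": "Stream temperature readings, 2020-2023",
    "creator": "Example Research Group",
    "date": "2024-01-15",
    "identifier": "https://doi.org/10.5555/example",
    "rights": "https://creativecommons.org/licenses/by/4.0/",
}

print(json.dumps(dc_to_jsonld(example), indent=2))
```

Embedding the printed JSON-LD in a landing page inside a `<script type="application/ld+json">` element is the usual way such metadata is exposed to web crawlers.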

Domain-specific standards

Many scientific disciplines use standards tailored to their data types and community practices. Darwin Core is a community standard for sharing biodiversity observation and specimen data, widely used in natural history and ecology. MIAME (Minimum Information About a Microarray Experiment) defines the minimum metadata required to interpret a microarray study, supporting reproducibility and enabling comparison across datasets. DICOM (Digital Imaging and Communications in Medicine) is the ubiquitous standard for medical imaging data, specifying both the file format and the metadata for radiological images. ABCD (Access to Biological Collection Data) covers natural history collections. ISO 19115 covers geographic information metadata. Using the correct domain standard is often a prerequisite for submitting data to discipline-specific repositories and can be a requirement for regulatory submissions in the life sciences.
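To make the idea of a domain vocabulary concrete, here is a minimal sketch that writes a single biodiversity observation as CSV, using Darwin Core term names as column headers. The column subset, the helper `to_dwc_csv` and the observation itself are assumptions chosen for illustration; a real dataset would follow whatever terms and packaging the target repository asks for.

```python
import csv
import io

# Darwin Core term names used as column headers (a small subset of the standard).
DWC_COLUMNS = [
    "occurrenceID", "basisOfRecord", "scientificName", "eventDate",
    "decimalLatitude", "decimalLongitude", "countryCode",
]

# Hypothetical observation; all values are invented for illustration.
occurrences = [
    {
        "occurrenceID": "urn:example:obs:0001",
        "basisOfRecord": "HumanObservation",
        "scientificName": "Turdus merula",
        "eventDate": "2023-05-14",
        "decimalLatitude": "52.2053",
        "decimalLongitude": "0.1218",
        "countryCode": "GB",
    },
]


def to_dwc_csv(rows: list[dict]) -> str:
    """Serialise rows to CSV with Darwin Core terms as the header row.

    DictWriter raises ValueError if a row contains a key that is not in
    DWC_COLUMNS, which catches misspelled or non-standard column names.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_dwc_csv(occurrences))
```

Because the headers are shared terms rather than local labels, an aggregator can merge this file with records from other collections without guessing what each column means.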

Machine-readable metadata and interoperability

The value of metadata standards increases when metadata is expressed in machine-readable formats. RDF (Resource Description Framework) and JSON-LD allow metadata to be linked across systems, enabling the semantic web vision of connected data. RO-Crate (Research Object Crate) provides a lightweight format for packaging research data, software and workflows with their metadata in a single, shareable unit, using schema.org vocabulary expressed as JSON-LD. The FAIR data principles (particularly the Interoperable principle) require that metadata use a formal, accessible, shared and broadly applicable language for knowledge representation, and vocabularies that themselves follow the FAIR principles, so that machines can combine data from different sources without human intervention. The selection of a metadata standard is therefore an architectural decision with downstream consequences for interoperability, discovery and long-term preservation.
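The packaging idea behind RO-Crate can be sketched in a few lines. The example below assumes RO-Crate 1.1 and builds the content of an `ro-crate-metadata.json` file: a metadata descriptor, a root Dataset and one entry per file. The function name `build_ro_crate_metadata`, its parameters and the example values are hypothetical, and a production crate would normally carry richer descriptions of authors, licences and files.

```python
import json


def build_ro_crate_metadata(name, description, date_published, license_url, files):
    """Return a minimal RO-Crate 1.1 metadata document as a Python dict."""
    graph = [
        {
            # Metadata descriptor: identifies this file and points at the root dataset.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: describes the crate as a whole with schema.org terms.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "datePublished": date_published,
            "license": {"@id": license_url},
            "hasPart": [{"@id": path} for path in files],
        },
    ]
    # One File entity per packaged file, keyed by its path relative to the crate root.
    graph.extend({"@id": path, "@type": "File"} for path in files)
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


# Hypothetical crate contents, invented for illustration.
crate = build_ro_crate_metadata(
    name="Stream temperature study",
    description="Raw readings and the analysis script.",
    date_published="2024-01-15",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    files=["data/readings.csv", "scripts/analyse.py"],
)

print(json.dumps(crate, indent=2))
```

Saving the printed JSON as `ro-crate-metadata.json` at the top of the folder that holds the listed files is what turns an ordinary directory into a crate.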

Key facts

At a glance

  • Dublin Core: 15-element cross-domain standard for resource description (DCMI)
  • DataCite: metadata schema for citable research datasets; mandatory DOI and resource type
  • schema.org Dataset: web-crawlable metadata enabling Google Dataset Search
  • Darwin Core: community standard for biodiversity and natural history observation data
  • MIAME: minimum information for microarray experiments (reproducibility standard)
  • DICOM: universal standard for medical imaging data file format and metadata
  • RO-Crate: lightweight format packaging data, software and metadata together
  • FAIR requirement: Interoperable principle requires formal shared vocabularies and linked metadata

Common misconceptions

What people often get wrong

Often heard: Any descriptive text attached to a file counts as metadata and satisfies data standards requirements.

Actually: Metadata standards require structured, machine-readable descriptions using agreed vocabularies and identifier types. Free-text descriptions are useful for humans but cannot be reliably interpreted, aggregated or validated by systems, so they do not satisfy the interoperability requirements of FAIR or repository submission guidelines.
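The difference is easy to see in code. The sketch below checks a structured record against the mandatory properties of the DataCite Metadata Schema 4.x; a free-text note offers nothing comparable to check. The dictionary keys are simplified, hypothetical names rather than the exact XML or API field names, and `missing_mandatory` is an illustrative helper, not part of any DataCite tooling.

```python
# Mandatory properties of DataCite Metadata Schema 4.x, written here as
# simplified, hypothetical dictionary keys (not the official field names).
MANDATORY = (
    "identifier", "creators", "titles", "publisher",
    "publicationYear", "resourceType",
)


def missing_mandatory(record: dict) -> list[str]:
    """Return the mandatory properties that are absent or empty."""
    return [key for key in MANDATORY if not record.get(key)]


# A free-text note: readable by a person, but it has no fields a program can test.
free_text = "Temperature data collected by our lab in 2023, ask J. for details."

# A structured record (values invented for illustration); resourceType is left out.
structured = {
    "identifier": "10.5555/example",
    "creators": ["Example Research Group"],
    "titles": ["Stream temperature readings"],
    "publisher": "Example Data Repository",
    "publicationYear": 2023,
}

print(missing_mandatory(structured))  # → ['resourceType']
```

A repository can run this kind of check automatically at submission, which is why structured records can be validated and aggregated while free text cannot.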

Often heard: One universal metadata standard exists that covers all research data.

Actually: No single standard covers all disciplines. Cross-domain standards like Dublin Core provide minimal shared structure; domain-specific standards like Darwin Core, MIAME or DICOM capture the richer disciplinary context needed for genuine reuse. Good data management typically uses both layers.

Often heard: Metadata standards are only relevant for data deposited in repositories.

Actually: Metadata standards matter throughout the data lifecycle: applying them from collection ensures that data is documented consistently, improves internal findability and quality checking, and makes later repository deposit easier. Retrofitting metadata at deposit is costly and often incomplete.

Common questions

FAQ

What is the difference between a metadata standard and a metadata schema?

The terms are often used interchangeably, but a schema is the formal specification of fields, data types and permitted values, while a standard is a schema that has been formally endorsed by a standards body or adopted through a community consensus process. The DataCite Metadata Schema is both a schema and a community standard; Dublin Core is formalised as ISO 15836.

Do I need to use a metadata standard for my research data?

If you are depositing data in a repository, the repository will typically require or recommend a specific schema. If your funder requires FAIR data, you should use the community standard for your discipline. Even outside formal requirements, using a standard makes your data more discoverable and easier for others to cite and reuse.
