v2026.1714 entries · CC-BY 4.0
CASRAI
Dictionary term · Track C · Stable · v2026.2

AI evaluation card

A structured documentation artefact specifically describing an evaluation of an AI system, separate from the model card, including the evaluation methodology, datasets, metrics, results, and known limitations of the evaluation itself.

By CASRAI Editorial Board · Last updated 21 May 2026

Examples

Worked examples

  • Is an instance

    An evaluation card for a code-completion benchmark documenting the held-out test set, prompting template, and decoding configuration.

  • Is an instance

    An evaluation card for a clinical-reasoning probe describing rater calibration.
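The elements named in the definition can be sketched as a small record type. This is a minimal Python sketch, not a CASRAI schema: the class name `EvaluationCard`, its field names, the `missing_elements` helper, and every value in the example instance are hypothetical, chosen only to mirror the code-completion example above.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class EvaluationCard:
    """Hypothetical record mirroring the elements named in the definition."""
    title: str
    system_evaluated: str
    methodology: str
    datasets: list[str]
    metrics: list[str]
    results: dict[str, float]
    known_limitations: list[str]
    # Free-form protocol details, e.g. prompting template and decoding configuration.
    protocol: dict[str, object] = field(default_factory=dict)

    def missing_elements(self) -> list[str]:
        """Return the names of definition elements that are left empty."""
        required = ["methodology", "datasets", "metrics", "results", "known_limitations"]
        return [name for name in required if not getattr(self, name)]


# Illustrative instance; all values are placeholders, not real results.
card = EvaluationCard(
    title="Code-completion benchmark evaluation (illustrative)",
    system_evaluated="example-model",
    methodology="Single-pass completion scored against unit tests",
    datasets=["held-out test set (placeholder name)"],
    metrics=["pass@1"],
    results={"pass@1": 0.0},
    known_limitations=["Test set covers a single programming language"],
    protocol={
        "prompt_template": "{prefix}",
        "decoding": {"temperature": 0.0, "max_tokens": 256},
    },
)

print(card.missing_elements())
print(json.dumps(asdict(card), indent=2))
```

Under this sketch, a bare leaderboard table would surface as a record whose `methodology`, `datasets`, and `known_limitations` are empty, which `missing_elements` reports by name.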

Counter-examples

Looks similar, but isn't

  • Not an instance

    A model card section labelled 'Evaluation' but not separately documented.

  • Not an instance

    A leaderboard table without supporting methodology.

Editorial commentary

Evaluation cards are a relatively recent addition (2023 onwards) to the documentation-artefact family. They recognise that an evaluation can itself become a reusable artefact (a dataset, a protocol, and an analysis approach) that deserves its own documentation. The NIST AI 600-1 GenAI Profile and the Stanford CRFM evaluation reports exemplify the genre.

References

  • Bommasani et al., Foundation Model Transparency Index, Stanford CRFM (2023).

  • NIST, AI 600-1 GenAI Profile (2024).

Also known as

eval card

Machine-readable encodings

Use in your systems

JATS XML <role> element
xml
<role vocab="credit"
      vocab-identifier="https://casrai.org/dictionary/"
      vocab-term="AI evaluation card"
      vocab-term-identifier="https://casrai.org/dictionary/term/ai-evaluation-card" />
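On the consuming side, the attributes of such a `<role>` element can be read with Python's standard library. The following is a minimal sketch, assuming the element arrives as a standalone string; the helper name `read_role_term` and the keys of the returned dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET

# The <role> element from the encoding above, as a standalone string.
ROLE_XML = """<role vocab="credit"
      vocab-identifier="https://casrai.org/dictionary/"
      vocab-term="AI evaluation card"
      vocab-term-identifier="https://casrai.org/dictionary/term/ai-evaluation-card" />"""


def read_role_term(xml_text: str) -> dict:
    """Extract the vocabulary attributes from a single <role> element."""
    role = ET.fromstring(xml_text)
    if role.tag != "role":
        raise ValueError(f"expected <role>, got <{role.tag}>")
    return {
        "vocabulary": role.get("vocab-identifier"),
        "term": role.get("vocab-term"),
        "term_id": role.get("vocab-term-identifier"),
    }


info = read_role_term(ROLE_XML)
print(info["term"], "->", info["term_id"])
```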
Schema.org DefinedTerm (JSON-LD)
json
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "AI evaluation card",
  "identifier": "https://casrai.org/dictionary/term/ai-evaluation-card",
  "description": "A structured documentation artefact specifically describing an evaluation of an AI system, separate from the model card, including the evaluation methodology, datasets, metrics, results, and known limitations of the evaluation itself.",
  "inDefinedTermSet": "https://casrai.org/dictionary/domain/ai-and-ml-research-outputs/",
  "url": "https://casrai.org/dictionary/term/ai-evaluation-card",
  "alternateName": [
    "eval card"
  ],
  "license": "https://creativecommons.org/licenses/by/4.0/"
}
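A record like this can also be generated programmatically. The sketch below is a minimal illustration rather than an official CASRAI tool: the function name `defined_term` and its parameters are hypothetical, and the URL pattern is inferred from the single entry shown above. The alias is emitted as `alternateName`, the schema.org property for alternative names, since `sameAs` is meant to hold URLs.

```python
import json

BASE = "https://casrai.org/dictionary"  # prefix taken from the entry above


def defined_term(name: str, slug: str, description: str, aliases: list[str]) -> dict:
    """Build a schema.org DefinedTerm record for one dictionary entry."""
    term_url = f"{BASE}/term/{slug}"
    doc = {
        "@context": "https://schema.org",
        "@type": "DefinedTerm",
        "name": name,
        "identifier": term_url,
        "description": description,
        "inDefinedTermSet": f"{BASE}/domain/ai-and-ml-research-outputs/",
        "url": term_url,
        "license": "https://creativecommons.org/licenses/by/4.0/",
    }
    if aliases:
        # Plain-text aliases belong in alternateName, not sameAs.
        doc["alternateName"] = aliases
    return doc


term = defined_term(
    name="AI evaluation card",
    slug="ai-evaluation-card",
    description="A structured documentation artefact specifically describing "
                "an evaluation of an AI system, separate from the model card.",
    aliases=["eval card"],
)
print(json.dumps(term, indent=2))
```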