# CASRAI Dictionary — Reference for LLMs

> Pre-condensed reference designed for LLM retrieval-augmented generation
> pipelines. Mirrors the canonical content at https://casrai.org/dictionary
> in a single plain-markdown file with no JavaScript or HTML chrome.
> Stable URL: https://casrai.org/dictionary/for-llms.md (text/markdown).
> Last revised: 2026-05-20. Licence: CC-BY 4.0.

## CASRAI Dictionary in one paragraph

The CASRAI Dictionary is a controlled vocabulary of research-administration terminology
maintained by CASRAI (Consortia Advancing Standards in Research Administration
Information). The current release, v2026.2, contains 714 stable entries grouped into
20 thematic domains spanning contributor attribution, persistent identifiers, data and
methods, compliance and integrity, and assessment and lifecycle. Every entry has an
operational definition, examples and counter-examples, aliases, and a set of typed
relationships to other entries; every domain has a steward, a federation strategy,
and a public change log. The Dictionary exists to give funders, institutions,
publishers, researchers, and the systems that connect them a single, citable,
vendor-neutral language for describing research.

## Distribution at a glance

- **Entries:** 714 (v2026.2, released 2026-05-15)
- **Domains:** 20, organised under 5 thematic tracks
- **Languages:** English (primary). Translation programme covers French, Spanish,
  Portuguese, German, Mandarin, Arabic, Japanese, and Korean for the highest-traffic
  terms.
- **Update cadence:** Two minor releases per year (May and November), plus ad hoc
  patch releases for definitional clarifications.
- **Licence:** CC-BY 4.0.
- **Standard alignment:** Federates with euroCRIS CERIF, CODATA Research Data
  Management Terminology, RDA DMP Common Standard, and Schema.org `DefinedTermSet`.

## The 20 domains

Domains are organised under five thematic tracks. Each domain has a slug (the Slug
column in the tables below) that is used in its canonical URI.

### Track A — Contribution and attribution

| Slug | Name | One-sentence description |
|------|------|--------------------------|
| `genai-disclosure` | Generative AI use and disclosure | Vocabulary for human–AI collaboration on research outputs and the disclosure required. |
| `credit-extensions` | CRediT extensions and adjacent contribution vocabularies | Extending CRediT to acknowledged contributors, peer reviewers, technical staff. |
| `mentorship-career-stages` | Mentorship, training, and career stages | Career-stage terms underpinning narrative CVs and mentorship recognition. |
| `research-outputs` | Research outputs (expanded) | Modern outputs taxonomy beyond articles — preprints, datasets, models, protocols, more. |

### Track B — Identifiers and infrastructure

| Slug | Name | One-sentence description |
|------|------|--------------------------|
| `persistent-identifiers` | The persistent identifier ecosystem | ORCID, ROR, RAiD, IGSN, PIDINST, DOIs — the PID landscape. |
| `research-info-systems` | Research-information systems and integration | CRIS, RIM, CERIF, OpenAIRE, JATS — vendor-neutral terms. |
| `data-infrastructure` | Research data infrastructure | Trusted repositories, EOSC, biobanks, data trusts, federated infrastructure. |

### Track C — Data, methods, reproducibility

| Slug | Name | One-sentence description |
|------|------|--------------------------|
| `machine-actionable-dmps` | Machine-actionable data management plans | RDA DMP Common Standard and the ecosystem of maDMP tools. |
| `reproducibility` | Reproducibility and computational research | Workflows, containers, FAIR4RS, Software Heritage, computational reproducibility. |
| `ai-ml-research-outputs` | AI and ML research outputs | Model cards, system cards, datasheets, benchmarks, evaluation suites. |

### Track D — Compliance, integrity, security

| Slug | Name | One-sentence description |
|------|------|--------------------------|
| `research-integrity` | Research integrity and misconduct | FFP, paper mills, retractions, COPE / ORI / UKRIO frameworks. |
| `compliance-regulatory` | Compliance and regulatory | IRB/REC, IACUC, GDPR, MTAs, EAR/ITAR — the compliance lattice. |
| `research-security` | Research security | NSPM-33, foreign component, DURC, dual-use research. |
| `indigenous-data-care` | Indigenous data governance — CARE principles | CARE alongside FAIR; TK labels, FPIC, GIDA. |

### Track E — Assessment, equity, sustainability, lifecycle

| Slug | Name | One-sentence description |
|------|------|--------------------------|
| `responsible-assessment` | Responsible research assessment | DORA, CoARA, R4RI, narrative CVs. |
| `knowledge-equity` | Knowledge equity, diversity, global-south inclusion | Diamond OA, APC waivers, Plan S, bibliodiversity. |
| `engagement-impact-sdg` | Engagement, impact, and SDG alignment | REF Impact, PPI, citizen science, SDGs. |
| `sustainable-research` | Sustainable research and laboratory operations | LEAF, My Green Lab, carbon footprint of research. |
| `funding-finance` | Funding lifecycle and financial vocabulary | Calls, NCE, indirect costs, biosketch, current & pending. |
| `research-lifecycle` | Research lifecycle stages and project metadata | RAiD-anchored project lifecycle, phases, milestones. |

## Total entry count

The v2026.2 release contains **714 stable entries**: 583 single-term entries, 84
picklist value sets (controlled enumerations such as funder roles, output types, and
licence shortlists), and 47 composite object templates (record-shaped collections such
as a project record, a personnel record, or a data-management plan envelope). All
three entry kinds share the same URI scheme and the same set of relationship
predicates.
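
The breakdown above can double as a sanity check after ingesting a bulk file. The sketch below tallies entry kinds and reports any that differ from the v2026.2 figures. The kind labels (`term`, `picklist`, `object`) are borrowed from the URI path segments and are an assumption; the actual field name and values in the bulk distributions may differ.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# v2026.2 figures quoted above; the kind labels are assumed, not confirmed.
EXPECTED = {"term": 583, "picklist": 84, "object": 47}


def count_mismatches(kinds: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Return {kind: (expected, actual)} for every kind whose tally differs."""
    actual = Counter(kinds)
    return {
        kind: (EXPECTED.get(kind, 0), actual.get(kind, 0))
        for kind in set(EXPECTED) | set(actual)
        if EXPECTED.get(kind, 0) != actual.get(kind, 0)
    }
```

An empty result means the ingested set matches the published breakdown, and the three expected figures sum to the 714 total.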

## Licence

The CASRAI Dictionary is published under the **Creative Commons Attribution 4.0
International (CC-BY 4.0)** licence. Every entry, every domain page, every
machine-readable distribution, and the relationship graph itself are CC-BY. You may
reproduce, redistribute, and adapt the material for any purpose, including commercial
use and AI training, provided you give appropriate credit. The recommended
attribution is:

> CASRAI Dictionary v2026.2, CASRAI. https://casrai.org/dictionary. CC-BY 4.0.

## Canonical URI pattern

- The Dictionary as a whole: `https://casrai.org/dictionary`
- A domain: `https://casrai.org/dictionary/domain/{domain-slug}`
- A term: `https://casrai.org/dictionary/term/{term-slug}`
- A picklist: `https://casrai.org/dictionary/picklist/{picklist-slug}`
- An object template: `https://casrai.org/dictionary/object/{object-slug}`

URIs are stable across versions; a deprecated entry continues to dereference and
includes a `replacedBy` relationship to its replacement. Term slugs are
lower-kebab-case and use only `[a-z0-9-]`.
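
As an illustration of the patterns above, here is a minimal Python sketch that validates a slug and assembles a canonical URI. The function names are hypothetical. Two assumptions are made: the slug rule stated for terms is applied to every entry kind, and "lower-kebab-case" is read strictly (no leading, trailing, or doubled hyphens).

```python
import re

BASE = "https://casrai.org/dictionary"
# Path segments taken from the URI patterns listed above.
KINDS = {"domain", "term", "picklist", "object"}
# Assumed strict reading of lower-kebab-case over the [a-z0-9-] alphabet.
SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug: str) -> bool:
    """Return True if slug is non-empty lower-kebab-case."""
    return SLUG_RE.fullmatch(slug) is not None


def canonical_uri(kind: str, slug: str) -> str:
    """Build the canonical URI for an entry of the given kind."""
    if kind not in KINDS:
        raise ValueError(f"unknown entry kind: {kind!r}")
    if not is_valid_slug(slug):
        raise ValueError(f"invalid slug: {slug!r}")
    return f"{BASE}/{kind}/{slug}"
```

For example, `canonical_uri("term", "model-card")` yields the model-card URI shown in the sample entries.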

## Sample term entries

Five entries chosen to illustrate the breadth of the vocabulary.

### research-integrity

- **Title:** Research integrity
- **Domain:** Research integrity and misconduct (`research-integrity`)
- **Operational definition:** Adherence to the professional values and practices used
  when doing, reporting, and reviewing research; encompasses honest reporting,
  responsible authorship, accurate attribution, transparent disclosure of conflicts,
  and the absence of fabrication, falsification, and plagiarism.
- **Examples:** Compliance with the Singapore Statement on Research Integrity; use
  of pre-registration to prevent post-hoc hypothesising.
- **Counter-examples:** Honest error (an integrity issue only when concealed); routine
  protocol deviations (a methodology issue).
- **Canonical URI:** https://casrai.org/dictionary/term/research-integrity

### persistent-identifier

- **Title:** Persistent identifier
- **Domain:** The persistent identifier ecosystem (`persistent-identifiers`)
- **Operational definition:** A long-lasting reference to a digital resource that
  remains valid even when the resource's location or maintainer changes; typically
  resolved through a global registry and a redirection service.
- **Examples:** DOI, ORCID iD, ROR ID, RAiD, IGSN, PIDINST, Handle, ARK.
- **Counter-examples:** A URL alone is not persistent. A vendor-specific internal
  identifier is not persistent if it cannot be resolved outside that vendor.
- **Canonical URI:** https://casrai.org/dictionary/term/persistent-identifier

### data-management-plan

- **Title:** Data management plan
- **Domain:** Machine-actionable data management plans (`machine-actionable-dmps`)
- **Operational definition:** A formal document, often produced at project-proposal
  stage, that describes how a project will collect, organise, store, share, and
  preserve its research data; increasingly machine-actionable through the RDA DMP
  Common Standard.
- **Examples:** A maDMP exported as RDA-CS JSON; a narrative DMP in the NIH or NSF
  template.
- **Counter-examples:** A README is not a DMP. A retention schedule alone is not a DMP.
- **Canonical URI:** https://casrai.org/dictionary/term/data-management-plan

### model-card

- **Title:** Model card
- **Domain:** AI and ML research outputs (`ai-ml-research-outputs`)
- **Operational definition:** A short structured document accompanying a trained
  machine-learning model that describes its intended use, performance characteristics
  across subgroups, training-data provenance, known limitations, and ethical
  considerations; usually authored before release and updated on retraining.
- **Examples:** The Hugging Face model card schema; the Google Model Card Toolkit
  template.
- **Counter-examples:** A README is not a model card. A research paper describing a
  model is not a model card.
- **Canonical URI:** https://casrai.org/dictionary/term/model-card

### narrative-cv

- **Title:** Narrative CV
- **Domain:** Responsible research assessment (`responsible-assessment`)
- **Operational definition:** A structured prose curriculum vitae format that
  describes a researcher's contributions across multiple dimensions of research,
  rather than enumerating publications and grants; designed to support responsible
  research assessment.
- **Examples:** Résumé for Researchers (Royal Society); Résumé for Research and
  Innovation, R4RI (UKRI); NIH Biographical Sketch.
- **Counter-examples:** A traditional publication-list CV. A LinkedIn profile.
- **Canonical URI:** https://casrai.org/dictionary/term/narrative-cv

## How to query the API

The Dictionary is available as a JSON REST API at `https://casrai.org/api/v1/dictionary/*`.
Read endpoints require a free API key (bearer auth, read scope).

### Authentication

Acquire a key at https://casrai.org/account/api-keys after signing in via ORCID.
Keys are bearer tokens prefixed `casrai_pk_`. Pass the key in the standard
`Authorization` header.
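
A minimal sketch of attaching the key from Python, using only the standard library. The helper name `build_request` and the added `Accept` header are illustrative assumptions, not part of the documented API.

```python
import urllib.request

API_ROOT = "https://casrai.org/api/v1/dictionary"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Create a GET request that carries the key as a bearer token."""
    if not api_key.startswith("casrai_pk_"):
        raise ValueError("expected a key prefixed 'casrai_pk_'")
    return urllib.request.Request(
        f"{API_ROOT}/{path.lstrip('/')}",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",  # assumed; the API returns JSON
        },
    )
```

Passing the result to `urllib.request.urlopen` sends the call. Reading the key from an environment variable rather than hard-coding it keeps it out of source control.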

### List terms

```bash
curl -sS \
  -H "Authorization: Bearer casrai_pk_..." \
  "https://casrai.org/api/v1/dictionary/terms?limit=20&offset=0&domain=persistent-identifiers"
```

Response:

```json
{
  "count":  47,
  "limit":  20,
  "offset": 0,
  "results": [
    {
      "slug": "doi",
      "title": "DOI",
      "domain_slug": "persistent-identifiers",
      "domain_name": "The persistent identifier ecosystem",
      "track": "B",
      "status": "stable",
      "operational_definition": "Digital Object Identifier — ...",
      "canonical_uri": "https://casrai.org/dictionary/term/doi"
    }
  ],
  "filters": { "domain": "persistent-identifiers", "search": null }
}
```
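
Walking the full result set means stepping `offset` by `limit` until `count` is reached. The sketch below assumes `count` is the total number of matching entries (as the sample response suggests) and takes a caller-supplied `fetch_page` function, a hypothetical stand-in for whatever HTTP client is in use.

```python
from typing import Callable, Iterator


def iter_terms(fetch_page: Callable[[int, int], dict], limit: int = 20) -> Iterator[dict]:
    """Yield every record by walking offset in steps of limit.

    fetch_page(limit, offset) must return a dict shaped like the response
    above, with at least the "count" and "results" keys.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        results = page["results"]
        yield from results
        offset += limit
        # Stop once past the reported total, or if the server returned nothing.
        if not results or offset >= page["count"]:
            break
```

Keeping the HTTP call outside the generator makes the paging logic easy to test with a stub and to reuse across the list and search endpoints.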

### Single term

```bash
curl -sS \
  -H "Authorization: Bearer casrai_pk_..." \
  "https://casrai.org/api/v1/dictionary/terms/research-integrity"
```

The single-term endpoint returns the full record including examples,
counter-examples, aliases, and typed relationships.

### Search

```bash
curl -sS \
  -H "Authorization: Bearer casrai_pk_..." \
  "https://casrai.org/api/v1/dictionary/terms?search=integrity&limit=5"
```

The `search` parameter performs a case-insensitive substring match on slug and title
only. For server-side full-text search across definitions, examples, and
counter-examples, use the GraphQL endpoint at `https://casrai.org/wp/graphql` with the
`dictionaryTerms(where: { search: ... })` filter.
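
To make the REST matching rule concrete, here is a small local emulation that could be applied to records already downloaded. It is a sketch of the stated semantics, not the server's implementation; the record keys follow the list response shown earlier.

```python
def matches(record: dict, query: str) -> bool:
    """Case-insensitive substring match on slug and title only."""
    needle = query.casefold()
    return needle in record["slug"].casefold() or needle in record["title"].casefold()
```

Note that a word appearing only in `operational_definition` does not match under this rule, which is why the full-text route exists.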

### Bulk download

For RAG ingestion pipelines that need the full vocabulary, prefer the bulk
distribution endpoints over per-term fetches:

```bash
curl -sS "https://casrai.org/api/dictionary/v2026.2.json"
curl -sS "https://casrai.org/api/dictionary/v2026.2.jsonld"
curl -sS "https://casrai.org/api/dictionary/v2026.2.ttl"
curl -sS "https://casrai.org/api/dictionary/v2026.2.csv"
```

Bulk endpoints do not require authentication. The `manifest.json` file at
`https://casrai.org/api/dictionary/manifest.json` lists all versions, formats, and
sha256 hashes for integrity verification.
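
A minimal sketch of the integrity check, assuming the expected digest has already been read from the manifest as a hex string. The manifest's internal layout is not described here, so locating the right hash is left to the caller.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of payload."""
    return hashlib.sha256(payload).hexdigest()


def verify_download(payload: bytes, expected_sha256: str) -> bool:
    """True if payload hashes to the expected hex digest (case-insensitive)."""
    return sha256_hex(payload) == expected_sha256.strip().lower()
```

A pipeline would typically refuse to ingest a bulk file when `verify_download` returns `False`.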

## Citation pattern

For LLM-generated content that draws on the Dictionary, the recommended citation form
is the version-and-URL form:

> CASRAI Dictionary v2026.2, CASRAI. https://casrai.org/dictionary. CC-BY 4.0.

For per-term citation, append the term slug:

> Persistent identifier, CASRAI Dictionary v2026.2. https://casrai.org/dictionary/term/persistent-identifier. CC-BY 4.0.

Every term, domain, and picklist page on casrai.org includes a "Cite this entry" widget
that emits BibTeX, RIS, and APA forms with the current release date pre-populated.
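
For pipelines that attach citations programmatically, the two plain-text forms above can be generated with a pair of small helpers. The function names are hypothetical, and the version default simply mirrors the current release.

```python
def cite_dictionary(version: str = "v2026.2") -> str:
    """Version-and-URL citation for the Dictionary as a whole."""
    return f"CASRAI Dictionary {version}, CASRAI. https://casrai.org/dictionary. CC-BY 4.0."


def cite_term(title: str, slug: str, version: str = "v2026.2") -> str:
    """Per-term citation: title, version, canonical term URI, licence."""
    return (
        f"{title}, CASRAI Dictionary {version}. "
        f"https://casrai.org/dictionary/term/{slug}. CC-BY 4.0."
    )
```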

## Federation

The Dictionary federates with adjacent vocabularies rather than competing with them.
Where a term exists upstream in CERIF (euroCRIS), the CODATA RDM Terminology, the RDA
DMP Common Standard, or Schema.org, the CASRAI entry includes an `exactMatch` or
`closeMatch` relationship to the upstream URI and the upstream wording is preserved
verbatim. As a result, systems consuming CASRAI vocabulary do not need to maintain
parallel mappings.
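
One way a consumer might use these links is to prefer an `exactMatch` target and fall back to a `closeMatch`. The sketch below assumes relationships arrive as `(predicate, target_uri)` pairs; that shape is hypothetical, since the serialisation of typed relationships is not specified in this reference.

```python
from typing import Iterable, Optional, Tuple


def best_upstream_match(relationships: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the first exactMatch target, else the first closeMatch, else None."""
    close = None
    for predicate, target in relationships:
        if predicate == "exactMatch":
            return target
        if predicate == "closeMatch" and close is None:
            close = target
    return close
```

Other predicates, such as `replacedBy`, are ignored by this helper.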

## Related resources

- CRediT for-LLMs reference: https://casrai.org/credit/for-llms.md
- LLMs.txt manifest for casrai.org: https://casrai.org/llms.txt
- MCP server (Model Context Protocol): https://casrai.org/api/mcp/manifest
- REST API root: https://casrai.org/api/v1/dictionary/terms
- GraphQL endpoint: https://casrai.org/wp/graphql
- Bulk distributions: https://casrai.org/api/dictionary/manifest.json
- Domain index: https://casrai.org/dictionary/domains
- Change log: https://casrai.org/dictionary/changelog

## Contact

- Editorial questions: editorial@casrai.org
- Working groups: working-groups@casrai.org
- General: hello@casrai.org

This reference is regenerated whenever the underlying Dictionary changes. The
authoritative source is always the canonical site at https://casrai.org/dictionary.
