Examples
Worked examples
- Is an instance
A model card declaring '70B parameters (dense)'.
- Is an instance
A MoE model card declaring '8x22B = 141B total, ~39B active per token'.
Counter-examples
Looks similar, but isn't
- Not an instance
A tokenizer vocabulary size.
- Not an instance
An embedding-table row count alone.
Editorial commentary
Parameter count is one of the principal axes along which models are compared, alongside training data, training compute, and architecture. For mixture-of-experts models, 'total' and 'active' parameter counts should be reported separately. Parameter count alone does not determine capability, but published practice treats it as headline metadata for comparison.
References
- Kaplan et al., 'Scaling Laws for Neural Language Models' (arXiv 2020); Hoffmann et al., 'Training Compute-Optimal Large Language Models' (NeurIPS 2022).
Also known as
model size (parameter count sense)
Machine-readable encodings
Use in your systems
<role vocab="credit"
vocab-identifier="https://casrai.org/dictionary/"
vocab-term="Parameter count"
vocab-term-identifier="https://casrai.org/dictionary/term/parameter-count" />{
"@context": "https://schema.org",
"@type": "DefinedTerm",
"name": "Parameter count",
"identifier": "https://casrai.org/dictionary/term/parameter-count",
"description": "The total number of learnable scalar weights in a machine-learning model, conventionally reported as a count (e.g., 7B = 7 x 10^9 parameters) and disclosed as a basic model metadata field.",
"inDefinedTermSet": "https://casrai.org/dictionary/domain/ai-and-ml-research-outputs/",
"url": "https://casrai.org/dictionary/term/parameter-count",
"sameAs": [
"model size (parameter count sense)"
],
"license": "https://creativecommons.org/licenses/by/4.0/"
}







