Examples
Worked examples
- Is an instance
Calling the GPT-4 API at temperature=0 to make outputs more reproducible
Counter-examples
Looks similar, but isn't
- Not an instance
Updating the model's weights on new data is not inference; it is training (or fine-tuning)
Editorial commentary
Inference-time parameters that affect reproducibility include temperature, top-p, top-k, maximum tokens, random seed, and any system prompt. Two researchers using the ‘same’ model can get different outputs if these inference settings differ, so disclosure should specify them where they materially affect results.
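One way to make this disclosure concrete is to record the inference settings in a small structured record that can be published alongside results. The sketch below is illustrative only: the `InferenceSettings` class, its field names, and the `differing_fields` helper are hypothetical, not part of any CASRAI encoding or vendor API, and the values shown are invented for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class InferenceSettings:
    """Hypothetical disclosure record; field names are illustrative only."""
    model: str
    temperature: float
    top_p: Optional[float] = None
    top_k: Optional[int] = None
    max_tokens: Optional[int] = None
    seed: Optional[int] = None
    system_prompt: Optional[str] = None

    def to_json(self) -> str:
        # Sorted keys give a stable serialization for identical settings.
        return json.dumps(asdict(self), sort_keys=True)

    def fingerprint(self) -> str:
        # Short hash of the serialized settings, for quick comparison.
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()[:12]


def differing_fields(a: InferenceSettings, b: InferenceSettings) -> dict:
    """Return {field: (a_value, b_value)} for every setting that differs."""
    da, db = asdict(a), asdict(b)
    return {k: (da[k], db[k]) for k in da if da[k] != db[k]}


# Two researchers using the "same" model with different settings.
researcher_a = InferenceSettings(model="gpt-4", temperature=0.0, max_tokens=256)
researcher_b = InferenceSettings(model="gpt-4", temperature=0.7, max_tokens=256)

print(researcher_a.to_json())
print(differing_fields(researcher_a, researcher_b))
# prints {'temperature': (0.0, 0.7)}
```

Comparing the two records shows at a glance which setting explains a divergence in outputs. Settings left as `None` are serialized as `null`, which makes an undisclosed or provider-default value visible rather than silently omitted.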
References
- Vaswani et al. 2017 ‘Attention Is All You Need’ NeurIPS
- Holtzman et al. 2020 ‘The Curious Case of Neural Text Degeneration’ ICLR
Also known as
Model inference · LLM inference
Machine-readable encodings
Use in your systems
<role vocab="credit"
vocab-identifier="https://casrai.org/dictionary/"
vocab-term="Inference"
vocab-term-identifier="https://casrai.org/dictionary/term/inference" />
{
"@context": "https://schema.org",
"@type": "DefinedTerm",
"name": "Inference",
"identifier": "https://casrai.org/dictionary/term/inference",
"description": "The process of generating outputs from a trained AI model in response to inputs at runtime, distinct from training (which updates model parameters); for LLMs, inference is the production of completions from prompts.",
"inDefinedTermSet": "https://casrai.org/dictionary/domain/generative-ai-use-and-disclosure/",
"url": "https://casrai.org/dictionary/term/inference",
"alternateName": [
"Model inference",
"LLM inference"
],
"license": "https://creativecommons.org/licenses/by/4.0/"
}







