Examples
Worked examples
- Is an instance
A model card reporting '~0.5 gCO2e per 1k-token request at the EU-West-1 inference region'.
- Is an instance
An LLM-serving provider's quarterly carbon-impact report.
Counter-examples
Looks similar, but isn't
- Not an instance
A training-only emissions estimate.
- Not an instance
An undated 'one Google search uses X joules' citation.
Editorial commentary
For deployed frontier models, cumulative inference emissions often exceed training emissions because the same model serves billions of requests over its lifetime. Estimating them requires per-request token counts, the model's active parameter count, hardware efficiency, and the grid carbon intensity at the inference site. Reporting practice in this area is still maturing.
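The four inputs named above can be combined into a rough per-request estimate. The sketch below is illustrative only and is not a method prescribed by this dictionary: it assumes the common rule of thumb of about 2 FLOPs per active parameter per token for a forward pass, treats hardware efficiency as an effective FLOPs-per-joule figure, and adds an optional PUE (data-centre overhead) multiplier. The function names, parameters, and sample values are all hypothetical.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ


def request_gco2e(tokens, active_params, flops_per_joule,
                  grid_gco2e_per_kwh, pue=1.0):
    """Rough gCO2e for one inference request (illustrative assumptions only)."""
    if tokens < 0 or active_params <= 0 or flops_per_joule <= 0:
        raise ValueError("tokens must be >= 0; other inputs must be > 0")
    # Assumption: ~2 FLOPs per active parameter per processed token.
    flops = 2.0 * active_params * tokens
    # Assumption: flops_per_joule is the effective (not peak) efficiency.
    energy_kwh = flops / flops_per_joule / JOULES_PER_KWH
    # Scale by facility overhead (PUE) and grid carbon intensity at the site.
    return energy_kwh * pue * grid_gco2e_per_kwh


def monthly_kgco2e(per_request_g, requests_per_month):
    """Aggregate a per-request figure (gCO2e) into kgCO2e per month."""
    return per_request_g * requests_per_month / 1000.0


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the arithmetic.
    per_request = request_gco2e(
        tokens=1000,
        active_params=1e10,
        flops_per_joule=1e11,
        grid_gco2e_per_kwh=300.0,
        pue=1.2,
    )
    print(f"per request: {per_request:.4f} gCO2e")
    print(f"per month:   {monthly_kgco2e(per_request, 50_000_000):.1f} kgCO2e")
```

Real serving stacks differ from this simplification in ways that matter (batching, idle power, memory-bound decoding, embodied emissions), so measured energy per request is preferable wherever it is available.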
References
- Patterson et al., 'Carbon Emissions and Large Neural Network Training' (arXiv 2021).
- Luccioni, Jernite, Strubell, 'Power Hungry Processing: Watts Driving the Cost of AI Deployment?' (FAccT 2024).
Also known as
model inference CO2 · serving emissions
Machine-readable encodings
Use in your systems
<role vocab="credit"
vocab-identifier="https://casrai.org/dictionary/"
vocab-term="Inference carbon footprint"
vocab-term-identifier="https://casrai.org/dictionary/term/inference-carbon-footprint" />
{
"@context": "https://schema.org",
"@type": "DefinedTerm",
"name": "Inference carbon footprint",
"identifier": "https://casrai.org/dictionary/term/inference-carbon-footprint",
"description": "The greenhouse-gas emissions associated with serving inference requests from a deployed model, typically expressed per-request (e.g., gCO2e per query) or in aggregate (kgCO2e per month).",
"inDefinedTermSet": "https://casrai.org/dictionary/domain/ai-and-ml-research-outputs/",
"url": "https://casrai.org/dictionary/term/inference-carbon-footprint",
"sameAs": [
"model inference CO2",
"serving emissions"
],
"license": "https://creativecommons.org/licenses/by/4.0/"
}







