The Beyond the Imitation Game benchmark, a community-contributed collection of more than 200 tasks designed to probe capabilities of large language models that may be missed by narrower benchmarks.

By CASRAI Editorial Board · Last updated 21 May 2026

Examples

Worked examples

Is an instance
A model technical report including BIG-bench Hard average accuracy across 23 tasks.
Is an instance
A new benchmark paper that uses BIG-bench results as a baseline for LLM capability.
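
To make the first worked example concrete: an "average accuracy across 23 tasks" is often read as an unweighted (macro) mean of per-task accuracies, so that each task counts equally regardless of its size. The sketch below assumes that convention. The helper name `macro_average`, the task names, and the scores are hypothetical placeholders, not real benchmark results.

```python
from statistics import fmean


def macro_average(per_task_accuracy: dict[str, float]) -> float:
    """Return the unweighted mean of per-task accuracies.

    Assumption: every task counts equally, whatever its number of examples.
    """
    if not per_task_accuracy:
        raise ValueError("need at least one task")
    return fmean(per_task_accuracy.values())


# Placeholder scores for illustration only; not real benchmark results.
example_scores = {
    "task_a": 0.50,
    "task_b": 0.75,
    "task_c": 1.00,
}

average = macro_average(example_scores)
print(f"{average:.2f}")
```

A report that weights tasks by example count instead would produce a different figure, so it helps to state which averaging convention was used.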

Counter-examples

Looks similar, but isn't

Not an instance
MMLU (a different benchmark).
Not an instance
A single dataset like SQuAD.

Editorial commentary

BIG-bench (Srivastava et al., 2023) emphasises task diversity: arithmetic, logic, multilingual translation, theory of mind, common-sense reasoning, code, and intentionally adversarial probes. The 'BIG-bench Hard' (BBH) subset comprises 23 of the hardest tasks, those on which earlier models fell short of average human-rater performance; it has been heavily reused as a comparison set.

References

Srivastava et al., 'Beyond the Imitation Game' (Transactions on Machine Learning Research, 2023).

Also known as

Beyond the Imitation Game Benchmark · BBH (subset)

Machine-readable encodings

Use in your systems

JATS XML <role> element

xml

<role vocab="credit"
      vocab-identifier="https://casrai.org/dictionary/"
      vocab-term="BIG-bench"
      vocab-term-identifier="https://casrai.org/dictionary/term/big-bench" />

Schema.org DefinedTerm (JSON-LD)

json

{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "@id": "https://casrai.org/dictionary/term/big-bench",
  "name": "BIG-bench",
  "identifier": "https://casrai.org/dictionary/term/big-bench",
  "description": "The Beyond the Imitation Game benchmark, a community-contributed collection of more than 200 tasks designed to probe capabilities of large language models that may be missed by narrower benchmarks.",
  "inDefinedTermSet": "https://casrai.org/dictionary/domain/ai-ml-research-outputs#set",
  "url": "https://casrai.org/dictionary/term/big-bench",
  "alternateName": [
    "Beyond the Imitation Game Benchmark",
    "BBH (subset)"
  ],
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "publisher": {
    "@id": "https://casrai.org/#organization"
  },
  "dateModified": "2026-05-21T02:22:51",
  "inLanguage": "en"
}
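
As a rough sketch of how a consuming system might read the record above, the following parses a trimmed copy of the JSON-LD with Python's standard `json` module and pulls out the identifier, name, and term set. The helper name `summarize_defined_term` and the trimmed record are illustrative assumptions, not part of any CASRAI tooling.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
record_text = """
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "@id": "https://casrai.org/dictionary/term/big-bench",
  "name": "BIG-bench",
  "inDefinedTermSet": "https://casrai.org/dictionary/domain/ai-ml-research-outputs#set"
}
"""


def summarize_defined_term(text: str) -> dict[str, str]:
    """Parse a JSON-LD string and return a few fields of a DefinedTerm.

    Hypothetical helper: it treats the record as plain JSON and does not
    perform full JSON-LD context expansion.
    """
    record = json.loads(text)
    if record.get("@type") != "DefinedTerm":
        raise ValueError("not a schema.org DefinedTerm record")
    return {
        "id": record["@id"],
        "name": record["name"],
        "term_set": record["inDefinedTermSet"],
    }


summary = summarize_defined_term(record_text)
print(summary["name"], summary["id"])
```

Treating the record as plain JSON is enough for simple lookups; a system that needs to merge terms from several vocabularies would more likely use a dedicated JSON-LD processor.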
