Examples
Worked examples
- Is an instance
A frontier-lab internal red team running 6 weeks of structured probes pre-deployment and recording all elicited failures.
- Is an instance
A public red-team event at DEF CON 31 inviting roughly 2,200 participants to probe eight frontier models.
Counter-examples
Looks similar, but isn't
- Not an instance
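Standard benchmark evaluation, which scores a model against a fixed test set rather than having skilled testers deliberately try to elicit failures.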
- Not an instance
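User feedback gathered post-deployment, which arrives passively after release rather than through deliberate adversarial probing before deployment.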
Editorial commentary
AI red-teaming draws on adversarial-testing traditions from information security and the military. It complements automated evaluation by uncovering issues that benchmarks miss. Red-team reports are increasingly disclosed alongside frontier model releases (for example OpenAI GPT-4, Anthropic Claude, Meta Llama). The Generative Red Team event at DEF CON 31 helped normalise public red-team exercises.
References
- Ganguli et al., 'Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned' (arXiv, 2022).
- OpenAI, 'GPT-4 System Card' (2023).
Also known as
AI red team · adversarial AI testing
Machine-readable encodings
Use in your systems
<role vocab="credit"
  vocab-identifier="https://casrai.org/dictionary/"
  vocab-term="Red-teaming"
  vocab-term-identifier="https://casrai.org/dictionary/term/red-teaming" />

{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "Red-teaming",
  "identifier": "https://casrai.org/dictionary/term/red-teaming",
  "description": "The practice of deliberately adversarial testing of an AI system by skilled testers attempting to elicit failures, unsafe outputs, or policy violations, in order to discover weaknesses before deployment.",
  "inDefinedTermSet": "https://casrai.org/dictionary/domain/ai-and-ml-research-outputs/",
  "url": "https://casrai.org/dictionary/term/red-teaming",
  "alternateName": [
    "AI red team",
    "adversarial AI testing"
  ],
  "license": "https://creativecommons.org/licenses/by/4.0/"
}
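A minimal sketch of how a consuming system might read the two encodings and confirm that they identify the same term. The embedded strings are trimmed, illustrative records modelled on this section, not data fetched from the dictionary. The helper names `read_role` and `read_term` are hypothetical, and reading aliases from `alternateName` with a fallback to `sameAs` is an assumption made so the sketch tolerates either key.

```python
import json
import xml.etree.ElementTree as ET

# Illustrative records modelled on the encodings in this section (trimmed).
ROLE_XML = '''<role vocab="credit"
  vocab-identifier="https://casrai.org/dictionary/"
  vocab-term="Red-teaming"
  vocab-term-identifier="https://casrai.org/dictionary/term/red-teaming" />'''

TERM_JSONLD = '''{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "name": "Red-teaming",
  "identifier": "https://casrai.org/dictionary/term/red-teaming",
  "alternateName": ["AI red team", "adversarial AI testing"]
}'''


def read_role(xml_text):
    """Return (term name, term identifier) from a <role> element."""
    role = ET.fromstring(xml_text)
    return role.get("vocab-term"), role.get("vocab-term-identifier")


def read_term(jsonld_text):
    """Return (name, identifier, aliases) from a DefinedTerm record."""
    record = json.loads(jsonld_text)
    if record.get("@type") != "DefinedTerm":
        raise ValueError("expected a schema.org DefinedTerm")
    # Assumption: aliases may sit under either key, so try both.
    aliases = record.get("alternateName") or record.get("sameAs") or []
    if isinstance(aliases, str):
        aliases = [aliases]
    # Keep plain-text aliases only; skip anything that looks like a URL.
    aliases = [a for a in aliases if not a.startswith(("http://", "https://"))]
    return record["name"], record["identifier"], aliases


xml_name, xml_id = read_role(ROLE_XML)
name, identifier, aliases = read_term(TERM_JSONLD)

# The two encodings should agree on the term's name and identifier.
encodings_agree = (xml_name, xml_id) == (name, identifier)
print(name, identifier, aliases, encodings_agree)
```

Comparing the identifier rather than the display name alone is the more robust check, since the name could be localised or restyled while the identifier URL stays fixed.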







