What is generative AI?
Generative AI refers to artificial-intelligence systems that produce new content — such as text, images, audio, video, or code — by learning the patterns of their training data and sampling from a learned distribution.
Generative versus discriminative models
Machine-learning models fall into two broad kinds. A discriminative model learns to tell categories apart — for instance, whether an image shows a cat or a dog. A generative model instead learns the underlying distribution of the data well enough to produce new examples — a plausible new cat image, or a new sentence. Generative AI is the applied use of generative models to create content, sampling from the learned distribution to produce outputs that resemble, but are not copies of, the training data.
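The contrast can be made concrete with a toy sketch. The example below is illustrative only: the weights are made-up numbers, the Gaussian is an assumed stand-in for a learned distribution, and the helper names (`make_threshold_classifier`, `sample_new`) are hypothetical. The discriminative function learns only a boundary between the two classes, while the generative function estimates the distribution of one class and draws new values from it.

```python
import random
import statistics

def make_threshold_classifier(cats, dogs):
    """Discriminative view: learn only a boundary between the two classes."""
    threshold = (statistics.mean(cats) + statistics.mean(dogs)) / 2
    cats_are_lower = statistics.mean(cats) < statistics.mean(dogs)

    def classify(x):
        return "cat" if (x < threshold) == cats_are_lower else "dog"

    return classify

def sample_new(values, n, rng):
    """Generative view: estimate the distribution, then draw new examples."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [rng.gauss(mu, sigma) for _ in range(n)]

# Made-up body weights in kilograms, for illustration only.
cat_weights = [3.9, 4.2, 4.5, 4.1, 3.8]
dog_weights = [18.0, 22.5, 20.1, 19.4, 21.0]

classify = make_threshold_classifier(cat_weights, dog_weights)
new_cats = sample_new(cat_weights, 3, random.Random(0))

print(classify(5.0), classify(20.0))
print(new_cats)
```

The sampled values resemble the cat weights without repeating any of them, which is the sense in which generated outputs resemble, but are not copies of, the training data.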
How generative models work
Different media use different generative architectures. Large language models generate text by predicting tokens in sequence. For images, generative adversarial networks train a generator against a discriminator that tries to tell generated images from real ones, while the more recent diffusion models learn to reverse a gradual noising process.
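Token-by-token generation can be sketched with a model far simpler than a real language model. The example below assumes a bigram counter as a stand-in: it records which word follows which in a tiny made-up corpus, then generates text by repeatedly sampling the next word given the previous one. The names `train_bigram` and `generate` are hypothetical, and the sketch covers only text generation, not image models.

```python
import random
from collections import defaultdict

def train_bigram(tokens):
    """Count which token follows which: a tiny stand-in for a language model."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_tokens, rng):
    """Generate tokens one at a time, each sampled given the previous token."""
    out = [start]
    for _ in range(max_tokens - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # no observed continuation for this token
        choices = list(followers)
        weights = [followers[t] for t in choices]
        out.append(rng.choices(choices, weights=weights, k=1)[0])
    return out

corpus = "the cat sat on the mat and the dog sat on the rug".split()
model = train_bigram(corpus)
text = generate(model, "the", 8, random.Random(0))

print(" ".join(text))
```

Real language models condition on a much longer context and use learned neural weights rather than counts, but the loop has the same shape: predict a distribution over the next token, sample from it, append, and repeat.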
Many systems are foundation models: large models pre-trained on broad data and then adapted to many downstream tasks. This reuse is a defining feature of the current generation of generative AI.
Capabilities and limitations
Generative AI can draft text, synthesise images, write code, and produce audio, often from a natural-language prompt. Its limitations mirror those of the underlying models: outputs are probabilistic and may be inaccurate or fabricated, can reflect biases in training data, and raise questions about provenance and the rights to training content. Outputs are not authoritative and, in research contexts, must be verified before use.
Generative AI in research
In research, generative models are studied for their capabilities and risks and used cautiously as tools — for example to draft text or generate synthetic data. Synthetic data can help where real data is scarce or sensitive, but it must be validated so that it does not introduce artefacts. Reproducibility is challenging because outputs are stochastic and models change; documenting the model, version, prompt, and settings is essential, as is disclosing where generative AI was used.
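One way to document the model, version, prompt, and settings is to store a small structured record alongside each generated output. The sketch below is a minimal illustration, not a standard format: the `GenerationRecord` class, its field names, and all the example values (model name, version string, settings) are hypothetical.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class GenerationRecord:
    """Hypothetical provenance record for one use of a generative model."""
    model: str
    version: str
    prompt: str
    settings: dict = field(default_factory=dict)
    purpose: str = ""

    def to_json(self):
        # Sorted keys keep the serialised record stable across runs.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

# All values below are placeholders for illustration.
record = GenerationRecord(
    model="example-model",
    version="2024-01-01",
    prompt="Summarise the methods section in 100 words.",
    settings={"temperature": 0.7, "seed": 42},
    purpose="first draft of abstract",
)
record_json = record.to_json()

print(record_json)
```

A record like this can be archived with the output it describes and used as the basis for a disclosure statement. It does not guarantee identical output on a rerun, since the model itself may have changed.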
Key facts
At a glance
- Definition: AI that generates new content from learned distributions
- Contrast: generative vs discriminative models
- Text: built on large language models (transformers)
- Images: GANs and diffusion models
- Often built as: large, reusable foundation models
- Key caveat: outputs are probabilistic, not authoritative
Common questions
FAQ
What is the difference between generative and discriminative AI?
A discriminative model distinguishes between categories (for example, cat versus dog). A generative model learns the data distribution well enough to create new examples. Generative AI applies generative models to produce new content.
What kinds of content can generative AI create?
Generative AI can produce text, images, audio, video, and code. Different model types specialise in different media — for example language models for text and diffusion models for images.
What is a foundation model?
A foundation model is a large model pre-trained on broad data that can be adapted to many downstream tasks. Many generative-AI systems are built on foundation models rather than trained from scratch for each use.