v2026.1714 entries · CC-BY 4.0
CASRAI

Definition · Plain-language

ggplot2

ggplot2 is a data visualisation package for the R programming language, constructed around a formal framework called the Grammar of Graphics.

CASRAI research-methods explainer — ggplot2


The grammar of graphics framework

ggplot2 was developed by Hadley Wickham and implements Leland Wilkinson's Grammar of Graphics. Rather than providing a template for each chart type, it treats a visualisation as a structure built from independent components. A plot is defined by mapping variables in a dataset to visual properties (aesthetics) such as position, colour, shape, and size; geometric objects (geoms) then draw those mappings. Because the description is declarative, researchers can switch or combine visual representations without restructuring the code, and can assemble custom charts for multidimensional data that fixed chart templates do not cover.
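As a minimal sketch of this idea, the snippet below defines one set of aesthetic mappings and then draws it with two different geoms. It assumes the built-in mtcars data frame; the object names (mapped, scatter_plot, line_plot) are illustrative only.

```r
library(ggplot2)

# Map variables to aesthetics once:
# car weight -> x position, fuel economy -> y position, cylinders -> colour.
mapped <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl)))

# The same mapping can be drawn by different geoms without changing it.
scatter_plot <- mapped + geom_point()
line_plot    <- mapped + geom_line()

print(scatter_plot)
```

Only the geom changes between the two plots; the data and the mappings are written once.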

The layered approach to plotting

Plots in ggplot2 are built additively, using the plus operator (+) to stack components. The base plot is initialised with the ggplot() function, which specifies the data frame and any global aesthetic mappings. Layers are then added to display data (geom_point() for scatterplots, geom_boxplot() for boxplots) or to draw statistical summaries (geom_smooth() for trend lines), alongside scale functions that adjust axes and colour palettes. Because each layer is independent, researchers can add a confidence band or a density overlay to an existing plot with a single extra line. Building a plot step by step also makes debugging simpler, since a visual component can be changed without touching the underlying data mapping.
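The layering described above might look like the following sketch, which adds one component per line. It assumes the mpg dataset that ships with ggplot2; the axis labels and the choice of a linear trend are illustrative.

```r
library(ggplot2)

# Base plot: data frame plus global aesthetic mappings.
p <- ggplot(mpg, aes(x = displ, y = hwy))

# Add one component at a time with +.
p <- p + geom_point(aes(colour = class), alpha = 0.7)           # raw observations
p <- p + geom_smooth(method = "lm", formula = y ~ x, se = TRUE) # linear trend with confidence band
p <- p + scale_x_continuous(name = "Engine displacement (litres)")
p <- p + scale_y_continuous(name = "Highway miles per gallon")

print(p)
```

The colour mapping is placed inside geom_point() rather than ggplot(), so the trend line is fitted to all observations together instead of once per vehicle class.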

Faceting and theme customisation

Beyond basic plotting, ggplot2 supports multi-panel layouts and detailed styling for academic publishing. The faceting system splits a dataset by one or more categorical variables and displays the subsets as a grid of panels (using facet_wrap() or facet_grid()). The package also separates content from styling through its theme system. Users can apply pre-built themes such as theme_minimal() or set individual theme elements to configure fonts, margins, grid lines, and background colours. This level of control helps figures meet journal formatting guidelines. Plots can be saved in vector formats such as PDF and SVG, for example with ggsave(), which scale for print without loss of resolution.
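A short sketch of faceting, theming, and vector export follows. It assumes the mpg dataset bundled with ggplot2; the file name, font size, and figure dimensions are placeholders, not journal requirements.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ drv) +                        # one panel per drive-train type
  theme_minimal(base_size = 10) +            # pre-built theme
  theme(
    panel.grid.minor = element_blank(),      # drop minor grid lines
    strip.text = element_text(face = "bold") # bold panel titles
  )

# Write a vector PDF to a temporary directory (illustrative path and size).
out_file <- file.path(tempdir(), "figure1.pdf")
ggsave(out_file, plot = p, width = 6, height = 4, units = "in")
```

The theme() call changes only the styling; the data, mappings, and facets are left as they were.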

Key facts

At a glance

  • Definition: an R package for data plotting based on the Grammar of Graphics
  • Creator: Hadley Wickham; now maintained as part of the tidyverse
  • Syntax: uses the plus operator (+) to add independent layers to a plot
  • Aesthetics: maps data columns to visual properties (x, y, colour, shape, size) using aes()
  • Geoms: uses geom functions (e.g., geom_point(), geom_line()) to draw the shapes
  • Faceting: enables easy creation of small multiples (multi-panel charts) based on groups

Common misconceptions

What people often get wrong

Often heard: ggplot2 is just a collection of different templates for standard charts.

Actually: It is a systematic language. Instead of picking a template, you describe the mappings between your data and visual properties, allowing you to create unique, custom charts.

Often heard: You must manipulate your raw data values within the ggplot function to change colours.

Actually: ggplot2 separates data mapping from scale modification. You map a variable to colour inside aes(), and then use scale functions to customise the exact palette.
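As an illustration of that separation, the sketch below maps a variable to colour and then picks the palette in a separate scale function. It assumes the built-in iris data frame; the hex colours are arbitrary choices.

```r
library(ggplot2)

# The mapping: Species determines colour. No colour values are stored in the data.
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, colour = Species)) +
  geom_point()

# The scale: decides which colour each species receives.
species_colours <- c(
  setosa     = "#1b9e77",
  versicolor = "#d95f02",
  virginica  = "#7570b3"
)
p_custom <- p + scale_colour_manual(values = species_colours)

print(p_custom)
```

The iris data frame is never modified; swapping the palette means changing only the scale_colour_manual() call.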

Common questions

FAQ

Why does ggplot2 use the plus symbol (+) instead of the pipe operator (%>%)?

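ggplot2 uses the plus operator because it was written before the pipe operator (%>%) was available in R: ggplot2 was first released in 2007, while the magrittr package that introduced %>% followed in 2014. Hadley Wickham has noted that if he were to rewrite ggplot2 from scratch today, he would likely design it to use the pipe operator. In practice the two are often combined: a pipe passes the data into ggplot(), and + then adds components to the plot.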
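The two operators can appear in the same expression, as in this sketch. It assumes R 4.1 or later for the native pipe (|>), which behaves like %>% here, and uses the built-in mtcars data frame; the filter condition is arbitrary.

```r
library(ggplot2)

# The pipe passes a data frame from one function to the next ...
p <- mtcars |>
  subset(cyl != 6) |>
  ggplot(aes(x = wt, y = mpg)) +   # ... while + adds components to the plot object
  geom_point()

print(p)
```

Using a pipe after ggplot() in place of + is a common mistake, and ggplot2 reports an error when it happens.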

What are aesthetic mappings in ggplot2?

Aesthetic mappings, defined using the aes() function, tell ggplot2 how variables in your data frame should map to visual elements of the plot. For example, you might map a 'weight' column to the x-axis, a 'height' column to the y-axis, and a 'gender' column to the colour of the points.
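The example in this answer could be written roughly as follows. The people data frame is hypothetical, with values invented purely for illustration; only the column names come from the text above.

```r
library(ggplot2)

# Hypothetical toy data; the values are made up for illustration only.
people <- data.frame(
  weight = c(58, 72, 65, 80, 54, 90),
  height = c(160, 175, 168, 182, 158, 188),
  gender = c("F", "M", "F", "M", "F", "M")
)

# weight -> x-axis, height -> y-axis, gender -> point colour.
p <- ggplot(people, aes(x = weight, y = height, colour = gender)) +
  geom_point()

# The mappings are stored on the plot object and can be inspected.
print(p$mapping)
```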
