
ggplot2 Cheat Sheet

Last Updated : 03 Aug, 2023

Welcome to the ultimate ggplot2 cheat sheet! This is your go-to resource for mastering R’s powerful visualization package. With ggplot2, you can create engaging and informative plots effortlessly. Whether you’re a beginner or an experienced programmer, ggplot2’s popularity and versatility make it an essential skill to have in your R toolkit.

If you are new to ggplot2, this cheat sheet will help you get started. It covers the basics of ggplot2, including how to create a basic plot, add layers, and customize the appearance of your plots.

What is ggplot2?

ggplot2 is one of the most widely used data visualization packages for the R programming language. It is based on the idea of the “Grammar of Graphics”, and it is free, open source, and easy to use.

“Grammar of Graphics”

The idea behind the Grammar of Graphics is that you can construct any graph from three key components: a dataset, a coordinate system, and geoms (visual marks that represent data points). ggplot2 puts this concept into practice to make plot creation easier. If you have ever made a graph with ggplot2, the workflow will be familiar: you begin by specifying the data you want to visualize, and then you add layers to the plot, such as points or lines, using straightforward functions.

For example, if you want to create a scatter plot of student grades, you can add a layer for the points using the geom_point() function. You can then customize your plot by adding more layers or modifying the plot’s appearance, like changing the colors or labels. This approach allows you to create visually appealing and informative plots in a straightforward and flexible manner.
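
The student-grades example can be sketched as follows. The `grades` data frame and its `hours` and `grade` columns are made up for illustration; substitute your own data.

```r
library(ggplot2)

# Hypothetical data: hours studied and final grade for six students
grades <- data.frame(
  hours = c(2, 4, 5, 7, 8, 10),
  grade = c(55, 62, 70, 78, 85, 93)
)

# Data -> aesthetic mapping -> point layer -> labels
p <- ggplot(data = grades, aes(x = hours, y = grade)) +
  geom_point(color = "steelblue", size = 3) +
  labs(title = "Grades by study time", x = "Hours studied", y = "Final grade")

print(p)
```

Each `+` adds one more piece to the plot, so further layers or appearance changes can be appended to `p` without rewriting the earlier steps.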

ggplot2 Cheat Sheet: Data Visualization

Setting up a basic plot using ggplot2 involves a systematic process to create engaging visualizations in R. Let’s explore each step briefly:

Set up the basic plot

ggplot2 lets us explore and visualize our data efficiently and convey insights and patterns effectively. Every plot starts with the following function:

| Function | Description |
| --- | --- |
| `ggplot()` | Set up the basic plot. |

Specify the aesthetics

Aesthetics in ggplot2 refer to how variables in our dataset are mapped to the visual properties of the plot. Here are some commonly used aesthetics in ggplot2.

| Function | Description |
| --- | --- |
| `aes()` | Define the aesthetics (such as the x- and y-axis, color, and size). |
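
As a rough illustration of the difference between mapping an aesthetic and setting it, the sketch below uses the built-in `mtcars` data set. The choice of columns is arbitrary.

```r
library(ggplot2)

# Inside aes(): color and size vary with the data, and each gets a legend
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl), size = hp)) +
  geom_point()

# Outside aes(): a fixed value applied to every point, with no legend
p_fixed <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "darkred", size = 2)
```

Anything placed inside `aes()` is treated as a variable to map; anything passed directly to the geom is treated as a constant.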

Select a geometry (plot type)

A geom function determines the plot type, that is, how the mapped data is drawn (as points, lines, bars, and so on). Here are some of the main plot types in ggplot2.

| Function | Description |
| --- | --- |
| `geom_point()` | Create a scatter plot. |
| `geom_line()` | Create a line plot. |
| `geom_bar()` | Create a bar plot. |
| `geom_histogram()` | Create a histogram. |
| `geom_boxplot()` | Create a box plot. |
| `geom_area()` | Create an area plot. |
| `geom_smooth()` | Create a smooth line plot. |
| `geom_violin()` | Create a violin plot. |
| `geom_tile()` | Create a heatmap. |
| `ggpairs()` | Create a scatterplot matrix (provided by the GGally package, not by ggplot2 itself). |

Visualization

1. Scatter plot

A scatter plot is a type of data visualization that displays the relationship between two numerical variables.

R

ggplot(data = <data>) +
  aes(x = <x_variable>, y = <y_variable>) +
  geom_point()


2. Line plot

A line chart is a common type of data visualization used to display the trend or change in a variable over time or any ordered sequence.

R

ggplot(data = <data>) +
  aes(x = <x_variable>, y = <y_variable>) +
  geom_line()


3. Bar plot

A bar plot, also known as a bar chart, is a commonly used data visualization that represents categorical data with rectangular bars.

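R

# geom_bar() counts the rows in each category, so it takes only x.
# For precomputed bar heights, map y as well and use geom_col().
ggplot(data = <data>) +
  aes(x = <x_variable>) +
  geom_bar()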
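
By default `geom_bar()` counts the rows in each category and expects only an x aesthetic, while `geom_col()` takes bar heights from a y column. The sketch below shows both, using the `mpg` data set that ships with ggplot2 and a made-up `totals` data frame.

```r
library(ggplot2)

# geom_bar() counts rows per category, so it only needs x
p_count <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# geom_col() draws bar heights taken directly from a y column
# (totals is hypothetical example data)
totals <- data.frame(
  region = c("North", "South", "West"),
  sales  = c(120, 95, 143)
)
p_value <- ggplot(totals, aes(x = region, y = sales)) +
  geom_col()
```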


4. Histogram

A histogram is a graphical representation of the distribution of a dataset. It displays the frequency or count of data points falling within specified intervals or bins along an axis.

R

ggplot(data = <data>) +
  aes(x = <x_variable>) +
  geom_histogram()


5. Box plot

A box plot is a statistical visualization that provides a concise summary of the distribution of numerical data.

R

ggplot(data = <data>) +
  aes(x = <x_variable>, y = <y_variable>) +
  geom_boxplot()


6. Area plot

An area plot is a type of data visualization that displays the magnitude and proportion of one or more variables over a continuous axis.

R

ggplot(data = <data>) +
  aes(x = <x_variable>, y = <y_variable>) +
  geom_area()


7. Smooth line plot

A smooth line plot is a data visualization that represents the trend or pattern of a variable over a continuous axis by fitting a smoothed curve to the data.

R

ggplot(data = <data>) +
  aes(x = <x_variable>, y = <y_variable>) +
  geom_smooth()


8. Violin plot

A violin plot is a type of data visualization that combines aspects of a box plot and a kernel density plot.

R

ggplot(data = <data>) +
  aes(x = <x_variable>, y = <y_variable>) +
  geom_violin()


9. Heatmap

A heatmap is a graphical representation of data where values are displayed as a color matrix.

R

# Map the value to fill so that each tile is colored by it
ggplot(data = <data>) +
  aes(x = <x_variable>, y = <y_variable>, fill = <value_variable>) +
  geom_tile()
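
For the tiles to form a heatmap, the value being displayed has to be mapped to the `fill` aesthetic. A minimal sketch, assuming a made-up `cells` data frame with one row per grid cell:

```r
library(ggplot2)

# Hypothetical grid: one row per (day, hour) cell with a value to color by
cells <- expand.grid(day = c("Mon", "Tue", "Wed"), hour = 1:4)
cells$value <- seq_len(nrow(cells))

p <- ggplot(cells, aes(x = hour, y = day, fill = value)) +
  geom_tile()
```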


10. Scatterplot Matrix

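A scatterplot matrix is a type of data visualization that lets us explore the pairwise relationships between multiple variables in a dataset. The ggpairs() function is provided by the GGally package, which builds on ggplot2.

R

library(GGally)  # ggpairs() comes from GGally, not ggplot2
ggpairs(data = <data>, columns = <columns>)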


Geometry

ggplot2 provides many more geom functions. Here are some of the main ones.

| Function | Description |
| --- | --- |
| `geom_text()` | Text annotations at specified coordinates. |
| `geom_label()` | Labeled text annotations with a background and optional border. |
| `geom_rect()` | Rectangles defined by their corner coordinates. |
| `geom_segment()` | Straight-line segments defined by their start and end coordinates. |
| `geom_polygon()` | Filled polygons defined by a set of coordinates. |
| `geom_ribbon()` | The area between two lines, commonly used for confidence intervals. |
| `geom_errorbar()` | Error bars that represent uncertainties or standard errors. |
| `geom_crossbar()` | A box spanning a range or confidence interval, with a horizontal line marking the middle value. |
| `geom_abline()` | A straight line with a specified slope and intercept. |
| `geom_curve()` | A curved line segment. |
| `geom_density()` | A density plot that estimates the underlying distribution. |
| `geom_density_2d()` | A 2D density plot drawn as contours. |
| `geom_dotplot()` | A dot plot that displays the distribution of a variable. |
| `geom_freqpoly()` | A frequency polygon. |
| `geom_jitter()` | Points with a small amount of random noise added to their positions. |
| `geom_linerange()` | Vertical line segments representing a range of values. |
| `geom_map()` | Polygons drawn from a reference map. |
| `geom_qq()` | A quantile-quantile plot. |
| `geom_quantile()` | Quantile regression lines. |
| `geom_raster()` | A raster plot (a fast version of `geom_tile()` for equally sized tiles). |
| `geom_rug()` | A rug plot (marginal tick marks) along the axes. |

Add additional plot layers

In ggplot2, we can add further layers and annotations to enhance the visualization. For example, labs() adds a title and axis labels to the plot.

| Function | Description |
| --- | --- |
| `labs()` | Set plot title and axis labels. |
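
A short sketch of `labs()` on the built-in `mtcars` data set. The label text is only an example.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    title    = "Fuel economy vs. weight",
    subtitle = "mtcars data set",
    x        = "Weight (1000 lbs)",
    y        = "Miles per gallon",
    color    = "Cylinders",              # legend title for the color aesthetic
    caption  = "Source: datasets::mtcars"
  )
```

Besides the title and axis labels, `labs()` accepts any mapped aesthetic by name, which is how the legend title is set here.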

Themes

In ggplot2, the theme functions change the overall look of the plot. Here are some of the common built-in themes.

| Function | Description |
| --- | --- |
| `theme_bw()` | Black-and-white theme: white background with a dark panel border and light grid lines. |
| `theme_classic()` | Classic theme: axis lines and no grid lines. |
| `theme_minimal()` | Minimalistic theme: no background or border. |
| `theme_void()` | Blank theme: only the plotted data remains. |
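
Themes are added to a plot like any other component. The sketch below applies a few built-in themes to the same base plot and then adjusts individual elements with `theme()`; the specific settings are arbitrary.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Swap the complete theme by adding a different theme_*() call
p_minimal <- base + theme_minimal()
p_classic <- base + theme_classic()

# theme() adjusts individual elements on top of a complete theme
p_custom <- base +
  theme_bw(base_size = 14) +
  theme(legend.position = "bottom",
        plot.title = element_text(face = "bold"))
```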

Scales

Scales in ggplot2 control the mapping between data values and aesthetic properties. Here are some examples of how we can customize scales in ggplot2.

| Function | Description |
| --- | --- |
| `scale_x_continuous()`, `scale_y_continuous()` | Customize a continuous axis scale. |
| `scale_x_discrete()`, `scale_y_discrete()` | Customize a discrete axis scale. |
| `scale_color_continuous()` | Customize the color scale for continuous data. |
| `scale_color_gradient()` | Customize the color scale using a two-color gradient for continuous data. |
| `scale_color_brewer()` | Customize the color scale using predefined color palettes from RColorBrewer. |
| `scale_fill_gradientn()` | Customize the fill scale using a multi-point gradient for continuous data. |
| `scale_color_viridis_c()` | Customize the color scale using the Viridis color palette for continuous data. |
| `scale_color_hue()` | Customize the color scale using evenly spaced hues around the color wheel. |
| `scale_color_identity()` | Use the raw data values as color values (the data must already contain valid colors). |
| `scale_color_grey()` | Customize the color scale using shades of grey. |
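
Note that position scales are named per axis (`scale_x_continuous()`, `scale_y_discrete()`, and so on). A minimal sketch on `mtcars`, with arbitrary limits, breaks, and colors:

```r
library(ggplot2)

p_continuous <- ggplot(mtcars, aes(x = wt, y = mpg, color = hp)) +
  geom_point() +
  # Position scale: axis name, limits, and tick positions
  scale_x_continuous(name = "Weight (1000 lbs)", limits = c(1, 6), breaks = 1:6) +
  # Color scale: two-color gradient for the continuous hp variable
  scale_color_gradient(low = "lightblue", high = "darkblue", name = "Horsepower")

p_discrete <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  # Color scale: RColorBrewer palette for a discrete variable
  scale_color_brewer(palette = "Set1", name = "Cylinders")
```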

Faceting

Faceting in ggplot2 allows us to create multiple small plots (facets) based on subsets of our data. Each facet represents a different subset of the data and displays a separate plot.

| Function | Description |
| --- | --- |
| `facet_grid()` | Create a grid of panels based on the combination of rows and columns specified by the variables. |
| `facet_wrap()` | Create a wrapped layout of panels based on a single variable. |
| `facet_grid(rows = vars(), cols = vars(), scales = "fixed")` | Create a grid of panels that all share the same x and y scales (the default). Use `scales = "free"` to let the scales vary across rows and columns. |
| `facet_grid(rows = vars(), cols = vars(), space = "free")` | Let panel heights and widths vary in proportion to the range of each scale; this is useful together with `scales = "free"`. |
| `facet_wrap(~ var, drop = TRUE)` | Automatically drop levels with no data for the variable in facet_wrap(). |
| `facet_wrap(~ var, drop = FALSE)` | Keep all levels of the variable in facet_wrap(), even if there is no data. |
| `facet_wrap(~ var, strip.position = "top")` | Position the facet strip at the top of the panel. |
| `facet_wrap(~ var, strip.position = "bottom")` | Position the facet strip at the bottom of the panel. |
| `facet_wrap(~ var, strip.position = "left")` | Position the facet strip on the left side of the panel. |
| `facet_wrap(~ var, strip.position = "right")` | Position the facet strip on the right side of the panel. |
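
The two faceting functions can be sketched on the `mpg` data set that ships with ggplot2. The faceting variables are chosen only for illustration.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

# One variable, panels wrapped into rows; each panel gets its own y scale
p_wrap <- base + facet_wrap(~ class, ncol = 3, scales = "free_y")

# Two variables: drive train in rows, cylinder count in columns
p_grid <- base + facet_grid(rows = vars(drv), cols = vars(cyl))
```

`facet_wrap()` suits a single variable with many levels, while `facet_grid()` suits the cross of two variables.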

Grouping

In ggplot2, the group aesthetic splits the data into groups so that each group is drawn separately.

| Function | Description |
| --- | --- |
| `group` | Aesthetic that groups the data based on a variable or a combination of variables. |
| `aes(group = variable)` | Assign a specific grouping variable within the aes() function to control how observations are grouped. |
| `geom_line()` | Connects points in order of the x variable, drawing one line per group specified in aes(group = variable). |
| `geom_path()` | Connects points in the order in which they appear in the data, also drawing one path per group. |
| `geom_smooth()` | Fits a smooth line or curve to the data, separately for each group specified in aes(group = variable). |
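
A small sketch of the group aesthetic with `geom_line()`. The `growth` data frame is made up for illustration.

```r
library(ggplot2)

# Hypothetical repeated measurements: three subjects, four time points each
growth <- data.frame(
  subject = rep(c("a", "b", "c"), each = 4),
  week    = rep(1:4, times = 3),
  height  = c(10, 12, 15, 19,  8, 11, 13, 14,  9, 13, 18, 24)
)

# Without group, geom_line() would join all 12 points into one zig-zag line;
# group = subject draws one line per subject
p <- ggplot(growth, aes(x = week, y = height, group = subject)) +
  geom_line()
```

Mapping a discrete variable to `color` or `linetype` also creates groups implicitly, so an explicit `group` is mainly needed when the lines should look identical.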

Coordinate System

By using different coordinate systems, ggplot2 can produce visualizations that convey the patterns and relationships in the data more clearly.

| Function | Description |
| --- | --- |
| `coord_cartesian()` (Cartesian, default) | Rectangular coordinate system with x and y axes. |
| `coord_polar()` (polar) | Polar coordinate system with radial and angular axes. |
| `coord_flip()` (transpose) | Flips the x and y axes, switching their roles. |
| `coord_quickmap()` (quick map) | A fast approximate map projection. ggplot2 does not select a coordinate system automatically based on the data. |
| `coord_map()` (map projection) | Projects data onto a 2D map representation. |
| Calendar | ggplot2 has no built-in calendar coordinate system; calendar-style layouts for time-series data are usually built from facets and `geom_tile()`. |
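
In ggplot2 a coordinate system is applied by adding a `coord_*()` function to the plot. A minimal sketch with the same bar chart under three coordinate systems, using the `mpg` data set:

```r
library(ggplot2)

bars <- ggplot(mpg, aes(x = class)) + geom_bar()

p_flip  <- bars + coord_flip()                      # horizontal bars
p_zoom  <- bars + coord_cartesian(ylim = c(0, 50))  # zoom without dropping data
p_polar <- bars + coord_polar()                     # bars wrapped around a circle
```

Zooming with `coord_cartesian()` only changes the visible region; the underlying counts are still computed from all rows.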

Statistical Transformations

Statistical transformations (stats) compute new values from the data before it is drawn, for example by binning, smoothing, or summarizing.

| Function | Description |
| --- | --- |
| `stat_identity()` | Use the raw data values without any transformation. |
| `stat_bin()` | Calculate the count or frequency of observations in each bin. |
| `stat_sum()` | Count the number of observations at each location (useful for overlapping points). |
| `stat_summary(fun = mean)` | Calculate the mean (average) of y at each x value; ggplot2 has no stat_mean(). |
| `stat_summary(fun = median)` | Calculate the median of y at each x value; ggplot2 has no stat_median(). |
| `stat_summary(fun = min)` | Find the minimum of y at each x value; ggplot2 has no stat_min(). |
| `stat_summary(fun = max)` | Find the maximum of y at each x value; ggplot2 has no stat_max(). |
| `stat_count()` | Count the number of observations at each x value. |
| `stat_count()` with `after_stat(prop)` | Calculate the proportion of observations in each group; ggplot2 has no stat_prop(). |
| `stat_summary()` | Apply a user-defined summary function to calculate summary statistics of y at each x value. |
| `stat_smooth()` | Fit a smooth curve or line to the data using a specified method. |
| `stat_quantile()` | Fit quantile regression lines (e.g., for the quartiles). |
| `stat_ecdf()` | Estimate the empirical cumulative distribution function of values in each group. |
| `stat_ellipse()` | Compute and draw ellipses representing multivariate normal distributions. |
| `stat_density()` | Estimate the probability density function of a continuous variable. |
| `stat_function()` | Plot a mathematical function defined by the user. |
| `stat_summary_bin()` | Bin continuous data and calculate summary statistics within each bin. |
| `stat_summary_hex()` | Bin two continuous variables into hexagons and calculate summary statistics within each hexagon. |
| `stat_summary_2d()` | Bin two continuous variables into rectangles and calculate summary statistics within each rectangle. |
| `stat_sf_coordinates()` | Extract the coordinates from a spatial object and use them for plotting. |
| `stat_sf()` | Plot spatial objects using a specified geom and aesthetics. |
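
Per-group summaries such as the mean are computed with `stat_summary()` and a summary function. A minimal sketch on `mtcars`, assuming ggplot2 3.3.0 or later (where the `fun` argument is available):

```r
library(ggplot2)

# Raw points plus the mean of mpg for each cylinder count
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(alpha = 0.4) +
  stat_summary(fun = mean, geom = "point", color = "red", size = 4)
```

Replacing `mean` with `median`, `min`, or `max` gives the corresponding summary.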

Save the plot to a file or display the plot

The ggsave() function allows us to save a plot as an image file in various formats such as PNG, JPEG, PDF, or SVG; the format is chosen from the file extension. Here are the functions for saving and displaying a plot.

| Function | Description |
| --- | --- |
| `ggsave()` | Save the plot to a file. |
| `print()` | Display the plot. |
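
A short sketch of saving and displaying a plot. The file name and dimensions are arbitrary, and a temporary directory is used here only so the example does not write into your project.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Write to a temporary file; replace with a real path such as "scatter.png"
out_file <- file.path(tempdir(), "scatter.png")
ggsave(filename = out_file, plot = p, width = 6, height = 4, dpi = 300)

print(p)  # draws the plot on the active graphics device
```

Passing `plot = p` explicitly avoids relying on the default, which saves the last plot that was displayed.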

Conclusion

In conclusion, the ggplot2 cheat sheet serves as an invaluable tool for data visualization in R. It provides a comprehensive guide to creating static, aesthetic, and complex plots, which are essential in data analysis and interpretation. The cheat sheet covers key aspects such as aesthetics, geoms, stats, scales, and facets, among others, making it a one-stop resource for both beginners and experienced users.

The ggplot2 package, with its layering concept, offers a high degree of flexibility and control over various plot details. This makes it a preferred choice for many data scientists and statisticians. However, mastering ggplot2 requires understanding its syntax and structure, which the cheat sheet simplifies. Remember, the cheat sheet is not a substitute for hands-on practice. It’s a reference guide to help you navigate the ggplot2 package more efficiently. So, keep exploring, experimenting, and visualizing data with ggplot2, and let the cheat sheet be your companion in this journey.

In the world of data visualization, ggplot2 stands out as a powerful tool, and the cheat sheet is your map to harnessing its full potential. Happy plotting!

ggplot2 Cheat Sheet – FAQs

1. How do I make a ggplot look good?

To make your ggplot look good, you can follow these steps:

  1. Choose the right type of plot: Depending on your data and the kind of information you want to convey, choose the right type of plot. For example, use bar plots for categorical data, scatter plots for the relationship between two continuous variables, etc.
  2. Use themes: ggplot2 provides several themes that you can use to make your plots look better. For example, you can use `theme_minimal()`, `theme_classic()`, etc.
  3. Customize your plot: You can customize almost every element of your plot like axes, legend, title, etc. For example, you can use `labs()` function to add or modify the title, labels of axes, and legend.
  4. Use colors wisely: Colors can make your plot more understandable and attractive. You can use color to differentiate between different groups or to represent the value of a variable.

2. What are the three key components required to create a useful ggplot using the ggplot2 package?

The three key components required to create a useful ggplot using the ggplot2 package are:

  • Data: This is the dataset that you want to visualize. It should be in a data frame.
  • Aesthetics: These are mappings from your data to visual properties of the plot like position, color, size, shape, etc.
  • Layers: These are the actual plot types like bar plot, scatter plot, line plot, etc. You can add multiple layers to a plot.

3. How to call ggplot2?

To call ggplot2, you first need to install and load the package in your R environment. You can do this with the following commands:

R

install.packages("ggplot2") # Install the package
library(ggplot2) # Load the package

After that, you can use the `ggplot()` function to create plots. For example:

R

ggplot(data = df, aes(x = var1, y = var2)) + geom_point()


4. What does the aes() function in ggplot2 do?

The `aes()` function in ggplot2 stands for aesthetic mappings. It is used to map variables in your data to visual properties of the plot like position, color, size, shape, etc. For example, in `aes(x = var1, y = var2)`, `var1` is mapped to the x-axis and `var2` is mapped to the y-axis.


