Master Data Visualization With ggplot2

This article covers the main chart types for data visualization with ggplot2 in the R programming language. Data visualization is the graphical representation of a dataset as charts, plots, and similar figures.

The following are the most important graph types for data visualization in R, most of them built with ggplot2:

Bar chart in R

A bar chart represents the values in a dataset as rectangular bars, where the height of each bar is proportional to the value it represents. You can use geom_bar() to create bar charts with ggplot2; with stat = "identity", the bar heights are taken directly from the y variable.

R




library(ggplot2)

data <- data.frame(
  fruit = c("Apple", "Banana", "Orange", "Mango"),
  quantity = c(300, 450, 280, 800),
  color = c("red", "yellow", "orange", "green")
)

# Name each color by its fruit so the manual scales match by name
# rather than by the alphabetical order of the fruit levels
fruit_colors <- setNames(data$color, data$fruit)

bar_chart <- ggplot(data, aes(x = fruit,
                              y = quantity,
                              fill = fruit,
                              color = fruit)) +
    # stat = "identity" uses quantity as the bar height
    geom_bar(stat = "identity") +
    labs(title = "Fruit Quantity Chart",
         x = "Fruit", y = "Quantity (in units)") +
    scale_fill_manual(values = fruit_colors) +
    scale_color_manual(values = fruit_colors) +
    theme_minimal()
bar_chart


Output:      

Bar chart 

Violin Plots in R

Violin plots are similar to box plots, but they also show the estimated probability density of the data at different values. Wider areas of a violin indicate a higher density at that value. The geom_violin() function is used to create violin plots in ggplot2.

R




library(ggplot2)
 
data <- data.frame(
  category = rep(c("Category 1", "Category 2", "Category 3"), each = 100),
  value = c(rnorm(100, mean = 0, sd = 1),
            rnorm(100, mean = 2, sd = 1.5),
            rnorm(100, mean = -1, sd = 0.5))
)
 
violin_plot <- ggplot(data, aes(x = category, y = value)) +
  geom_violin(fill = "lightblue", color = "black") +
  labs(title = "Violin Plot", x = "Category", y = "Value") +
  theme_minimal()
 
print(violin_plot)


Output:      

violin plots in R
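
Because violin plots and box plots are closely related, the two are often drawn together. The following is a minimal sketch of that idea, not part of the original example: the seed, the object names violin_data and combined_plot, and the box width of 0.1 are assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the simulated data is reproducible

violin_data <- data.frame(
  category = rep(c("Category 1", "Category 2", "Category 3"), each = 100),
  value = c(rnorm(100, mean = 0, sd = 1),
            rnorm(100, mean = 2, sd = 1.5),
            rnorm(100, mean = -1, sd = 0.5))
)

combined_plot <- ggplot(violin_data, aes(x = category, y = value)) +
  # The violin shows the estimated density of each category
  geom_violin(fill = "lightblue", color = "black") +
  # A narrow box plot inside the violin adds the median and quartiles;
  # outlier.shape = NA hides the outlier points
  geom_boxplot(width = 0.1, outlier.shape = NA) +
  labs(title = "Violin Plot with Box Plot", x = "Category", y = "Value") +
  theme_minimal()

print(combined_plot)
```

The violin conveys the shape of each distribution, while the embedded box marks its median and interquartile range.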

Density Plots in R

A density plot shows a kernel density estimate of a continuous variable, which can be read as a smoothed version of a histogram. The geom_density() function is used to create density plots in ggplot2.

R




library(ggplot2)

# Load mtcars dataset
data(mtcars)

# Create the density plot of the mpg column,
# passing the data frame to ggplot() and mapping the column in aes()
density_plot <- ggplot(mtcars, aes(x = mpg)) +
  geom_density(fill = "lightblue", color = "black") +
  labs(title = "Density Plot of MPG", x = "MPG", y = "Density") +
  theme_minimal()

print(density_plot)


Output:      

Density Plot in R
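
To see how a density curve relates to a histogram, the two can be drawn on the same axes. This is a minimal sketch under stated assumptions: the number of bins (10), the colors, and the name overlay_plot are illustrative choices, and after_stat() requires ggplot2 3.3.0 or later.

```r
library(ggplot2)

data(mtcars)

overlay_plot <- ggplot(mtcars, aes(x = mpg)) +
  # after_stat(density) rescales the bars so their total area is 1,
  # which puts the histogram on the same scale as the density curve
  geom_histogram(aes(y = after_stat(density)),
                 bins = 10, fill = "gray80", color = "white") +
  geom_density(color = "black") +
  labs(title = "Histogram and Density of MPG", x = "MPG", y = "Density") +
  theme_minimal()

print(overlay_plot)
```

The curve follows the overall shape of the bars but without the jumps at the bin edges.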

Box Plots in R

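A box plot is a visual summary of the spread of a variable: it displays the median, the quartiles, and the range of the values, with outliers drawn as individual points. The geom_boxplot() function is used to create box plots in ggplot2. It expects the data in long format, with one column for the group and one for the value.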

R




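library(ggplot2)

set.seed(20000)
data <- data.frame(A = rpois(900, 3),
                   B = rnorm(900),
                   C = runif(900))

# Reshape to long format with base R: one row per observation,
# with the source column name in `variable` and the number in `value`
data_long <- stack(data)
names(data_long) <- c("value", "variable")

# Creating the boxplot using ggplot2
boxplot_plot <- ggplot(data_long, aes(x = variable, y = value)) +
  geom_boxplot() +
  labs(title = "Boxplot of Data values", x = "Variable", y = "Value") +
  theme_minimal()

print(boxplot_plot)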


Output:

Box plot in R

Pie Chart in R

A pie chart is a circular statistical graphic divided into slices to illustrate numerical proportions; each slice represents a share of the whole. ggplot2 has no dedicated pie chart geom, so a pie chart is built by drawing a single stacked bar with geom_bar() and converting it to polar coordinates with coord_polar(). (Base R also provides a separate pie() function.)

R




library(ggplot2)
 
data <- data.frame(categories = c("Mango", "Apple", "Orange"),
                   values = c(520, 358, 405))
pie_chart <- ggplot(data, aes(x = "", y = values, fill = categories)) +
  geom_bar(width = 1, stat = "identity") +
  coord_polar("y", start = 0) +
  labs(title = "Sybudheen Shop", x = "Fruits", y = "total") +
  theme_void()
pie_chart


Output:

Pie Chart
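
Since each slice stands for a share of the whole, it can help to print that share on the slice. The sketch below is one possible way to do this, assuming the same fruit data as above; the columns share and label and the names pie_data and labelled_pie are introduced here for illustration only.

```r
library(ggplot2)

pie_data <- data.frame(categories = c("Mango", "Apple", "Orange"),
                       values = c(520, 358, 405))

# Each slice's share of the total, and a rounded percentage label
pie_data$share <- pie_data$values / sum(pie_data$values)
pie_data$label <- paste0(round(100 * pie_data$share), "%")

labelled_pie <- ggplot(pie_data, aes(x = "", y = values, fill = categories)) +
  geom_bar(width = 1, stat = "identity") +
  coord_polar("y", start = 0) +
  # position_stack(vjust = 0.5) places each label in the middle of its slice
  geom_text(aes(label = label), position = position_stack(vjust = 0.5)) +
  labs(title = "Share of Fruit Sold") +
  theme_void()

print(labelled_pie)
```

Rounded percentages do not always add up to exactly 100, so the labels are best treated as approximate.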

Stacked Bar Chart in R

A stacked bar chart is a bar chart in which each bar is divided into segments that show how different categories contribute to a total. The height of each segment represents the value of one category, and the segments are stacked on top of each other so that the full height of the bar represents the total.

R




library(ggplot2)

# creating data frame
set.seed(42)  # fix the seed so the sampled quantities are reproducible
fruits <- c(rep("Apple", 3), rep("Mango", 3),
            rep("Banana", 3), rep("Orange", 3))
quantity <- sample.int(50, 12)

Shop <- rep(c("A", "B", "C"), 4)

data <- data.frame(fruits, Shop, quantity)

# plotting graph
ggplot(data, aes(fill = Shop, y = quantity, x = fruits)) +
  geom_bar(position = "stack", stat = "identity") +
  ggtitle("Fruit sales in different shops") +
  theme(plot.title = element_text(hjust = 0.5))


Output:

Stacked bar chart in R

Scatter plots in R

A scatter plot is a graphical representation of two variables on a Cartesian plane. It displays individual data points as dots, where the position of each dot along the x-axis and y-axis gives the values of the two variables for that observation. Scatter plots can be used to visualize the relationship between two variables and to identify trends, patterns, and outliers in the data. To create a scatter plot in R, you can use the base plot() function, as in the example below, or ggplot() with geom_point() from ggplot2.

R




trains <- c(10, 20, 30, 40, 50, 34, 23, 49, 21, 13)
passengers <- c(100, 200, 300, 400, 500,
                229, 346, 432, 198, 235)
 
plot(trains, passengers,
     xlab = "Number of Trains",
     ylab = "Number of Passengers",
     main = "Scatter Plot of Trains vs Passengers")
abline(lm(passengers~trains), col = "red")


Output:

Scatter Plot
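
The same scatter plot can also be drawn with ggplot2. This is a minimal sketch rather than the original author's code: it reuses the trains and passengers values from above, and the names train_data and scatter_plot are assumptions for illustration.

```r
library(ggplot2)

# Same values as the base R example, collected in a data frame
train_data <- data.frame(
  trains = c(10, 20, 30, 40, 50, 34, 23, 49, 21, 13),
  passengers = c(100, 200, 300, 400, 500,
                 229, 346, 432, 198, 235)
)

scatter_plot <- ggplot(train_data, aes(x = trains, y = passengers)) +
  geom_point() +
  # method = "lm" draws a least-squares line, like abline(lm(...)) in base R;
  # se = FALSE hides the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "red") +
  labs(title = "Scatter Plot of Trains vs Passengers",
       x = "Number of Trains",
       y = "Number of Passengers") +
  theme_minimal()

print(scatter_plot)
```

Unlike plot(), ggplot() takes a data frame, so the two vectors are first combined into one.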

Frequency plots in R

A frequency plot, also known as a histogram, is a graphical representation of the distribution of a set of continuous or discrete data. It shows how frequently each value or range of values occurs in the dataset. The plot is created by dividing the range of the data into equal intervals, called bins, and counting the number of data points that fall into each bin. The height of each bar represents the frequency of the corresponding bin. To create a frequency plot in base R, you can use the hist() function, which takes the data as input and draws the histogram.

R




x <- rnorm(2000)
hist(x, main = "Frequency Plot",
     xlab = "Values",
     ylab = "Frequency",
     col = "gray",
     border = "black")


Output:

Frequency Plot
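
A ggplot2 version of the same frequency plot uses geom_histogram(). The sketch below is illustrative only: the seed, the choice of 30 bins, and the names hist_data and hist_plot are assumptions and not part of the original example.

```r
library(ggplot2)

set.seed(123)  # assumption: fixed seed so the random sample is reproducible
hist_data <- data.frame(x = rnorm(2000))

hist_plot <- ggplot(hist_data, aes(x = x)) +
  # bins sets the number of equal-width intervals; each bar's height
  # is the count of values falling into that interval
  geom_histogram(bins = 30, fill = "gray", color = "black") +
  labs(title = "Frequency Plot", x = "Values", y = "Frequency") +
  theme_minimal()

print(hist_plot)
```

Changing bins (or setting binwidth instead) changes how coarse or fine the histogram looks, so it is worth trying a few values.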

Time Series Plots in R

ggplot2 is a powerful data visualization library in R that allows you to create various types of plots, including time series plots. Time series data refers to data that is collected and recorded over time, such as daily stock prices or monthly sales data. When visualizing time series data in R using ggplot2, you can use various techniques to identify and highlight the different components of the time series, such as trends, seasonality, and residuals. These components are important in understanding the behavior of the time series and making predictions.

R




library(ggplot2)

# AirPassengers ships with base R (datasets package)
data(AirPassengers)

# Convert the ts object to a data frame with a numeric time column
ts_data <- data.frame(
  date = as.numeric(time(AirPassengers)),
  passengers = as.numeric(AirPassengers)
)

ggplot(ts_data, aes(x = date, y = passengers)) +
  geom_line() +
  xlab("Year") +
  ylab("Passengers") +
  ggtitle("Air Passengers")

# decompose() is part of the stats package, so no extra library is needed.
# Its result is a list, so the components are copied into a data frame by hand.
ts_decomposed <- decompose(AirPassengers)
ts_decomposed_df <- data.frame(
  date = as.numeric(time(AirPassengers)),
  trend = as.numeric(ts_decomposed$trend),
  seasonal = as.numeric(ts_decomposed$seasonal),
  random = as.numeric(ts_decomposed$random)
)

# Trend: the moving average leaves NA values at both ends of the series
ggplot(ts_decomposed_df, aes(x = date, y = trend)) +
  geom_line(na.rm = TRUE) +
  xlab("Year") +
  ylab("Trend") +
  ggtitle("Trend Component")

# Seasonality
ggplot(ts_decomposed_df, aes(x = date, y = seasonal)) +
  geom_line() +
  xlab("Year") +
  ylab("Seasonality") +
  ggtitle("Seasonality Component")

# Residuals: NA wherever the trend is NA
ggplot(ts_decomposed_df, aes(x = date, y = random)) +
  geom_line(na.rm = TRUE) +
  xlab("Year") +
  ylab("Residuals") +
  ggtitle("Residual Component")


Output:

Time Series Data

  • The AirPassengers dataset from the datasets package is loaded first. The data is then converted into a data frame, and the time() function is used to construct the date column. The date is plotted on the x-axis and the number of passengers on the y-axis using the ggplot() function. The geom_line() layer draws the time series as a line graph. The xlab(), ylab(), and ggtitle() functions set the x-axis label, the y-axis label, and the plot title, respectively.

  • The AirPassengers series is then split into its trend, seasonal, and random components using the decompose() function, which is part of R's built-in stats package. The components are collected in a data frame. The trend component is drawn with ggplot() and geom_line(), with time on the x-axis and the trend values on the y-axis, and the title and axis labels are set as before.

  • The seasonal component is plotted in the same way, with time on the x-axis and the seasonal values on the y-axis.

  • Finally, the residual (random) component is plotted, again with time on the x-axis and the residual values on the y-axis.

Save and export R plots:

Saving plots in R

In the plot viewer on the right side, click Export.

Saving plots in R

From there, the plot can be saved as an image or as a PDF.
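
Plots can also be saved from code instead of through the export menu. The following is a minimal sketch using ggplot2's ggsave() function; the example plot, the file names, and the 6 x 4 inch size are assumptions chosen for illustration, and the files are written to a temporary directory only so that the sketch does not touch the working directory.

```r
library(ggplot2)

# A small example plot to save (any ggplot object works the same way)
plot_to_save <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Weight vs MPG", x = "Weight", y = "MPG")

# Hypothetical output paths inside the session's temporary directory
png_path <- file.path(tempdir(), "scatter.png")
pdf_path <- file.path(tempdir(), "scatter.pdf")

# ggsave() picks the file format from the file extension;
# width and height are in inches by default
ggsave(png_path, plot = plot_to_save, width = 6, height = 4, dpi = 300)
ggsave(pdf_path, plot = plot_to_save, width = 6, height = 4)
```

In a real project, the paths would normally point to a folder inside the project rather than to tempdir().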



Last Updated : 08 Jun, 2023