Scaling Variables Parallel Coordinates chart in R

To analyse and visualise high-dimensional data, one can use Parallel Coordinates. A background is drawn consisting of n parallel lines, often vertical and evenly spaced, to display a set of points in an n-dimensional space. A point in n-dimensional space is represented by a polyline with vertices on parallel axes; the ith coordinate of the point corresponds to the position of the vertex on the ith axis.

This representation is similar to time series visualization, except that it is used with data that has no natural order, because the axes do not correspond to points in time. As a result, different axis orderings may be of interest.
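The mapping from a point to a polyline can be made concrete with a few lines of base R. This is only a minimal sketch: it takes the first row of the built-in iris data as the point and its four numeric columns as the axes, and the names axes, point and polyline are illustrative choices.

```r
# Axes to draw: the four numeric columns of iris (an illustrative choice)
axes <- names(iris)[1:4]

# One observation is one point in 4-dimensional space
point <- unlist(iris[1, axes], use.names = FALSE)

# Its polyline: vertex i sits on axis i at the height of the i-th coordinate
polyline <- data.frame(
  axis_position = seq_along(axes),
  axis_name     = axes,
  value         = point
)
print(polyline)

# The axes have no natural order, so any permutation is an equally valid layout
reordered <- polyline[c(3, 1, 4, 2), ]
reordered$axis_position <- seq_len(nrow(reordered))
print(reordered)
```

Connecting the vertices in the order of axis_position gives the line drawn for that observation; changing the axis order changes the shape of the line but not the values.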

Modules used:

  • GGally: It extends ggplot2 with functions that reduce the complexity of combining geoms with transformed data, including ggparcoord(). It can be installed with the following command:
install.packages("GGally")
  • hrbrthemes: It is a collection of extra ‘ggplot2’ themes for axes and plots.
install.packages("hrbrthemes")
  • viridis: It provides the color palettes loaded in the examples below.
install.packages("viridis")

To plot parallel coordinates, we will use the ggparcoord() function from GGally.

Syntax: ggparcoord(data, columns = 1:ncol(data), groupColumn = NULL, scale = "std", scaleSummary = "mean", centerObsID = 1, missing = "exclude", order = columns, showPoints = FALSE, splineFactor = FALSE, alphaLines = 1, boxplot = FALSE, shadeBox = NULL, mapping = NULL, title = "")

Parameters:

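  • data: The dataset (data frame) to plot
  • columns: Vector of variables (either names or indices) to be axes in the plot
  • groupColumn: Single variable to group (color) by
  • scale: Method used to scale the variables. Options are "std" (subtract the mean and divide by the standard deviation of each variable), "robust" (subtract the median and divide by the median absolute deviation), "uniminmax" (rescale each variable so that its minimum is 0 and its maximum is 1), "globalminmax" (no scaling; the axis range runs from the global minimum to the global maximum), "center" and "centerObs"
  • scaleSummary: if scale == "center", the summary statistic used to center each variable univariately
  • centerObsID: if scale == "centerObs", the row number of the observation on which the plot is centered univariately
  • missing: Method used to handle missing values: "exclude", "mean", "median", "min10" or "random"
  • order: Method used to order the axes: the order given in columns (default), an explicit vector, or a criterion such as "anyClass", "allClass" or "skewness"
  • showPoints: Logical value indicating whether points should be plotted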
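The scale argument decides how each column is transformed before it is drawn. As a rough illustration of what the GGally documentation describes for "std" and "uniminmax", the transformations can be reproduced by hand in base R. The helper names scale_std and scale_uniminmax are hypothetical and not part of GGally, and this sketch is not GGally's internal code.

```r
# Hypothetical helpers mirroring two documented scale options

# "std": subtract the mean, divide by the standard deviation
scale_std <- function(x) (x - mean(x)) / sd(x)

# "uniminmax": rescale so the minimum is 0 and the maximum is 1
scale_uniminmax <- function(x) (x - min(x)) / (max(x) - min(x))

# The first three iris columns, as used in the examples
raw <- iris[1:3]

scaled_std    <- as.data.frame(lapply(raw, scale_std))
scaled_minmax <- as.data.frame(lapply(raw, scale_uniminmax))

# Raw columns live on different ranges
print(sapply(raw, range))

# After min-max scaling every column spans exactly [0, 1]
print(sapply(scaled_minmax, range))

# After standardisation every column has mean 0 (up to rounding) and sd 1
print(round(sapply(scaled_std, mean), 10))
print(sapply(scaled_std, sd))
```

With "globalminmax" no such transformation is applied, so columns with a small range look compressed next to columns with a large range.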

Example 1: Without Scaling

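Here we plot the raw values without any scaling. Leaving out the scale argument does not achieve this, because ggparcoord() then falls back to its default, scale = "std". To keep the original units, we pass scale = "globalminmax", which applies no scaling and draws every axis on the common range from the global minimum to the global maximum.

R

# Libraries
library(GGally)     # provides ggparcoord()
library(ggplot2)    # provides theme() and element_text()
library(viridis)    # provides the color palette
library(hrbrthemes) # provides themes for axis and plot

# built-in iris dataset
data <- iris

# glimpse of the data
head(data)

# plotting the Parallel Coordinates
ggparcoord(data, # data
           columns = 1:3, # plotting first 3 columns
           alphaLines = .4, # transparency of the lines
           groupColumn = 5, # color by Species
           order = "anyClass", # order axes by class separation
           scale = "globalminmax", # no scaling, raw values
           showPoints = TRUE) +
  theme(
    plot.title = element_text(size=10)
  )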


 
 

Output:

 

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
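The examples in this article pass order = "anyClass". According to the GGally documentation, this orders the axes by how well each variable separates any one class from the rest, using the largest class-versus-rest F-statistic per variable. The base-R sketch below approximates that idea for the three columns plotted here. The helper class_vs_rest_f is hypothetical, and the code is an illustration of the documented rule, not GGally's own implementation.

```r
# Hypothetical helper: F-statistic of one variable for "this class vs. the rest"
class_vs_rest_f <- function(x, groups, cls) {
  is_cls <- factor(groups == cls)
  fit <- lm(x ~ is_cls)
  unname(summary(fit)$fstatistic["value"])
}

vars    <- names(iris)[1:3]       # the columns plotted in the examples
classes <- levels(iris$Species)   # setosa, versicolor, virginica

# For every variable, keep the largest of the class-vs-rest F-statistics
max_f <- sapply(vars, function(v) {
  max(sapply(classes, function(cls) class_vs_rest_f(iris[[v]], iris$Species, cls)))
})

# Axes sorted from most to least separating
axis_order <- names(sort(max_f, decreasing = TRUE))
print(max_f)
print(axis_order)
```

A variable that isolates a single species well therefore ends up on the left of the chart, even if it does not separate the other species from each other.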

Example 2: With MinMax Scaling

 

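Here we use min-max scaling with scale = "uniminmax", which rescales each variable separately so that its minimum is 0 and its maximum is 1. Note that scale = "globalminmax", despite its name, applies no scaling at all.

R

# Libraries
library(GGally)     # provides ggparcoord()
library(ggplot2)    # provides theme() and element_text()
library(viridis)    # provides the color palette
library(hrbrthemes) # provides themes for axis and plot

# built-in iris dataset
data <- iris

# glimpse of the data
head(data)

# plotting the Parallel Coordinates
ggparcoord(data, # data
           columns = 1:3, # plotting first 3 columns
           alphaLines = .4, # transparency of the lines
           groupColumn = 5, # color by Species
           order = "anyClass", # order axes by class separation
           scale = "uniminmax", # each variable rescaled to [0, 1]
           showPoints = TRUE) +
  theme(
    plot.title = element_text(size=10)
  )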


Output:

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Example 3: Scaling with Standardisation

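Here we standardise each variable with scale = "std": the mean of the variable is subtracted and the result is divided by its standard deviation. This is also the default when scale is omitted.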

R




# Libraries
library(GGally)     # provides ggparcoord()
library(ggplot2)    # provides theme() and element_text()
library(viridis)    # provides the color palette
library(hrbrthemes) # provides themes for axis and plot
 
# default data in R
data <- iris
 
# glimpse of the data
head(data)
 
# plotting the Parallel Coordinates
ggparcoord(data, # data
           columns = 1:3, # plotting first 3 columns
           alphaLines = .4, # transparency of the color
           groupColumn = 5, order = "anyClass",
           scale = "std",
           showPoints = TRUE) +
  theme(
    plot.title = element_text(size=10)
  )


Output:

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
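To compare scaling methods side by side, the same ggparcoord() call can be repeated over several of the documented scale values. The following is a minimal sketch that assumes GGally is installed. The list name plots and the chosen subset of scale values are illustrative choices.

```r
library(GGally)   # provides ggparcoord()

# A subset of the scale values documented for ggparcoord()
scales <- c("globalminmax", "uniminmax", "std", "center")

# Build one plot per scaling method, keeping every other argument fixed
plots <- lapply(scales, function(s) {
  ggparcoord(iris,
             columns = 1:3,        # first three numeric columns
             groupColumn = 5,      # color by Species
             scale = s,            # the only argument that changes
             alphaLines = 0.4,
             title = paste("scale =", s))
})
names(plots) <- scales
```

Printing a single element, for example plots[["std"]], draws that version of the chart, so the effect of each scaling method can be inspected in turn.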



Last Updated : 23 Feb, 2022