Data Wrangling in R Programming – Working with Tibbles

Last Updated : 08 Dec, 2023

R is a robust language used by Analysts, Data Scientists, and Business users to perform various tasks such as statistical analysis, visualizations, and developing statistical software in multiple fields.

In the R programming language, data wrangling is the process of reshaping raw data into a more structured format, which makes it easier to draw insights and make decisions from the data.

What are Tibbles?

Tibbles are the core data structure of the tidyverse and are used to display and analyze data in a tidy format. A tibble is a modern take on the data frame, which is the most common data structure for storing data sets in R.

Advantages of Tibbles over Data Frames

All Tidyverse packages support Tibbles.
Tibbles print in a much cleaner format than data frames.
Before R 4.0.0, data.frame() converted character strings to factors by default, so analysts often had to override that setting. Tibbles never make this conversion automatically.
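
The string-handling difference in the last point can be checked directly. The sketch below is only an illustration and is not part of the original examples: the column name x and its values are made up, and stringsAsFactors = TRUE is passed explicitly to reproduce the older data.frame() default.

```r
library(tibble)  # tibble is part of the tidyverse

# Illustrative data; stringsAsFactors = TRUE mimics the pre-4.0.0 default
df <- data.frame(x = c("a", "b", "c"), stringsAsFactors = TRUE)
tb <- tibble(x = c("a", "b", "c"))

class(df$x)  # "factor"
class(tb$x)  # "character"
```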

Different ways to create Tibbles

as_tibble(): The first function is as_tibble(), which creates a tibble from an existing object such as a data frame.

Syntax: as_tibble(x, …) where x is a data frame, matrix, or list.

tibble(): The second way is the tibble() function, which creates a tibble from scratch.

Syntax: tibble(…, .rows = NULL) where … represents a set of name-value pairs.

Data import functions: Finally, you can use the tidyverse’s data import packages to create tibbles from external data sources such as databases or CSV files. For example, read_csv() from the readr package reads a CSV file into a tibble.

Syntax: read_csv(file, …)

library(): The library() function loads and attaches a package so that its functions can be called directly.

Syntax: library(package, help, pos = 2, lib.loc = NULL)
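
As a rough illustration of the import route, the sketch below writes a tiny CSV file to a temporary location and reads it back with readr's read_csv(), which returns a tibble. The file contents and the names csv_path and imported are assumptions made for this example, and the show_col_types argument assumes readr 2.0 or later.

```r
library(readr)   # readr is part of the tidyverse
library(tibble)

# Write a small illustrative CSV to a temporary file (made-up data)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,marks_in_Math", "surya,91", "sai,85"), csv_path)

# read_csv() returns a tibble; show_col_types = FALSE hides the column summary
imported <- read_csv(csv_path, show_col_types = FALSE)
is_tibble(imported)  # TRUE
```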

Note: To learn more about a function in R, type ? followed by the function name, e.g. ?tibble.

Let us see some examples of how to use the above functions in the RStudio IDE. We will use the built-in CO2 dataset (Carbon Dioxide Uptake in Grass Plants) to create a tibble.


This dataset consists of five variables: Plant, Type, Treatment, conc (concentration), and uptake. Printed as a plain data frame, every row is written to the console with no column type information, so let us convert it into a tibble. We will create a tibble named sample_tibble from the CO2 dataset using the as_tibble() function.

Example of as_tibble()

Here we convert a data frame (CO2) into a tibble using the as_tibble() function. This requires the tidyverse package to be installed.

# loading tidyverse package    
library(tidyverse)
# creating a tibble named sample_tibble 
sample_tibble <- as_tibble(CO2) 
print(sample_tibble)

Output (first 10 rows):

   Plant Type   Treatment   conc uptake
   <ord> <fct>  <fct>      <dbl>  <dbl>
 1 Qn1   Quebec nonchilled    95   16  
 2 Qn1   Quebec nonchilled   175   30.4
 3 Qn1   Quebec nonchilled   250   34.8
 4 Qn1   Quebec nonchilled   350   37.2
 5 Qn1   Quebec nonchilled   500   35.3
 6 Qn1   Quebec nonchilled   675   39.2
 7 Qn1   Quebec nonchilled  1000   39.7
 8 Qn2   Quebec nonchilled    95   13.6
 9 Qn2   Quebec nonchilled   175   27.3
10 Qn2   Quebec nonchilled   250   37.1
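
A tibble prints only its first rows by default, so the full size of the data is easy to miss. As a small sketch (not part of the original example), the size can be checked with dim(), and the n argument of print() controls how many rows are shown.

```r
library(tibble)  # tibble is part of the tidyverse

sample_tibble <- as_tibble(CO2)

dim(sample_tibble)           # 84 rows and 5 columns
print(sample_tibble, n = 5)  # show only the first 5 rows
```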

The second method is to create a tibble from scratch using the tibble() function. We will create a few vectors (name, marks_in_Math, marks_in_Java, and Fav_color) and pass them to tibble(), which combines them into a tibble.

library(tidyverse) 
name <- c("surya", "sai", "Nihith", "prakash", "vikas", "mayur") 
marks_in_Math <- c(91, 85, 92, 89, 90, 93) 
marks_in_Java <- c(89, 91, 88, 91, 89, 87) 
Fav_color <- c("Pink", "Red", "Yellow", "Green", "White", "Blue") 
students <- tibble(name, marks_in_Math, marks_in_Java, Fav_color) 
print(students)

Output:

  name    marks_in_Math marks_in_Java Fav_color
  <chr>           <dbl>         <dbl> <chr>    
1 surya              91            89 Pink     
2 sai                85            91 Red      
3 Nihith             92            88 Yellow   
4 prakash            89            91 Green    
5 vikas              90            89 White    
6 mayur              93            87 Blue
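
The tibble() function also accepts explicit name = value pairs, and a later column may refer to columns defined before it. The following is a minimal sketch with made-up column names (math, java, total) and a shortened data set, shown only to illustrate that behaviour.

```r
library(tibble)  # tibble is part of the tidyverse

# Hypothetical columns: each argument is a name = value pair, and
# total is computed from the math and java columns defined above it
scores <- tibble(
  name  = c("surya", "sai"),
  math  = c(91, 85),
  java  = c(89, 91),
  total = math + java
)

scores$total  # 180 176
```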

Subsetting tibbles

Data analysts often extract a single variable from a tibble for further use in their analysis, which is called subsetting. Subsetting a tibble in this way returns the variable as a vector. We can do this with two operators:

$ Operator
[[]] Operator

$ Operator

The first way to extract a variable from a tibble is the dollar sign ($) operator. To demonstrate, we will create a tibble from scratch using the tibble() function.

library(tidyverse) 
name <- c("surya", "sai", "Nihith", "prakash", "vikas", "mayur") 
marks_in_Math <- c(91, 90, 91, 85, 90, 92) 
marks_in_Java <- c(91, 91, 92, 91, 89, 93) 
Fav_color <- c("Pink", "Red", "Yellow", "Green", "White", "Blue") 

students <- tibble(name, marks_in_Math, marks_in_Java, Fav_color) 
students$Fav_color 
students$marks_in_Math

Output:

students$Fav_color 
[1] "Pink"   "Red"    "Yellow" "Green"  "White"  "Blue"  
students$marks_in_Math 
[1] 91 90 91 85 90 92
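
Because $ returns an ordinary vector, the result can be passed straight to other functions. The sketch below reuses the marks from the example above in a two-column tibble; the variable name math_marks is an assumption made for illustration.

```r
library(tibble)  # tibble is part of the tidyverse

students <- tibble(
  name = c("surya", "sai", "Nihith", "prakash", "vikas", "mayur"),
  marks_in_Math = c(91, 90, 91, 85, 90, 92)
)

# The extracted column is a plain numeric vector
math_marks <- students$marks_in_Math
is.vector(math_marks)  # TRUE
max(math_marks)        # 92
mean(math_marks)       # about 89.83
```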

[[]] Operator

The second way to access a single variable from a tibble is with double square brackets ([[ ]]). We will use the same tibble created previously.

library(tidyverse) 
name <- c("surya", "sai", "Nihith", "prakash", "vikas", "mayur") 
marks_in_Math <- c(91, 90, 91, 85, 90, 92) 
marks_in_Java <- c(91, 91, 92, 91, 89, 93) 
Fav_color <- c("Pink", "Red", "Yellow", "Green", "White", "Blue") 

students <- tibble(name, marks_in_Math, marks_in_Java, Fav_color) 
students$Fav_color 
students[["name"]] 
students[["marks_in_Math"]]

Output:

students$Fav_color 
[1] "Pink"   "Red"    "Yellow" "Green"  "White"  "Blue"  
students[["name"]] 
[1] "surya"   "sai"     "Nihith"  "prakash" "vikas"   "mayur"  
students[["marks_in_Math"]] 
[1] 91 90 91 85 90 92
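
The [[ ]] operator also accepts a column position or a column name stored in a variable, while single brackets keep the result as a tibble. The sketch below illustrates these cases on a reduced version of the students tibble; the variable col is an assumption made for this example.

```r
library(tibble)  # tibble is part of the tidyverse

students <- tibble(
  name = c("surya", "sai", "Nihith", "prakash", "vikas", "mayur"),
  marks_in_Math = c(91, 90, 91, 85, 90, 92)
)

students[[2]]     # by position: same vector as students[["marks_in_Math"]]

col <- "name"
students[[col]]   # column name held in a variable

students["name"]  # single brackets return a one-column tibble, not a vector
```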

Filtering Tibbles

Filtering reduces the number of rows in a tibble. When filtering, we specify one or more conditions, and only the rows that satisfy them are kept.

Syntax: filter(data, conditions)

Here, data is the name of the tibble, and conditions are one or more expressions that return a logical value. We will use the students tibble created in the example above.

library(tidyverse) 
name <- c("surya", "sai", "Nihith", "prakash", "vikas", "mayur") 
marks_in_Math <- c(91, 90, 91, 85, 90, 92) 
marks_in_Java <- c(91, 91, 92, 91, 89, 93) 
Fav_color <- c("Pink", "Red", "Yellow", "Green", "White", "Blue") 

students <- tibble(name, marks_in_Math, marks_in_Java, Fav_color) 
filter_students <- filter(students, marks_in_Java >= 90)
print(filter_students)

Output:

  name    marks_in_Math marks_in_Java Fav_color
  <chr>           <dbl>         <dbl> <chr>    
1 surya              91            91 Pink     
2 sai                90            91 Red      
3 Nihith             91            92 Yellow   
4 prakash            85            91 Green    
5 mayur              92            93 Blue
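
Several conditions can be passed to filter() separated by commas, in which case a row must satisfy all of them. The following sketch uses the same marks as above; the name top_students and the second threshold are assumptions chosen for illustration.

```r
library(tibble)  # tibble and dplyr are part of the tidyverse
library(dplyr)

students <- tibble(
  name = c("surya", "sai", "Nihith", "prakash", "vikas", "mayur"),
  marks_in_Math = c(91, 90, 91, 85, 90, 92),
  marks_in_Java = c(91, 91, 92, 91, 89, 93)
)

# Keep rows where both conditions hold (conditions are combined with AND)
top_students <- filter(students, marks_in_Java >= 90, marks_in_Math >= 91)
top_students$name  # "surya" "Nihith" "mayur"
```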

Converting to Tibble

If we have a traditional data frame, we can convert it to a tibble with the as_tibble() function.

library(tidyverse) 
name <- c("surya", "sai", "Nihith", "prakash", "vikas", "mayur") 
marks_in_Math <- c(91, 90, 91, 85, 90, 92) 
marks_in_Java <- c(91, 91, 92, 91, 89, 93) 
Fav_color <- c("Pink", "Red", "Yellow", "Green", "White", "Blue") 

data <- data.frame(name, marks_in_Math, marks_in_Java, Fav_color)
data
tibble_data <- as_tibble(data)
tibble_data

Output:

     name marks_in_Math marks_in_Java Fav_color
1   surya            91            91      Pink
2     sai            90            91       Red
3  Nihith            91            92    Yellow
4 prakash            85            91     Green
5   vikas            90            89     White
6   mayur            92            93      Blue

# A tibble: 6 × 4
  name    marks_in_Math marks_in_Java Fav_color
  <chr>           <dbl>         <dbl> <chr>    
1 surya              91            91 Pink     
2 sai                90            91 Red      
3 Nihith             91            92 Yellow   
4 prakash            85            91 Green    
5 vikas              90            89 White    
6 mayur              92            93 Blue
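
To confirm that a conversion took place, the class of the object can be inspected, and a tibble can be turned back into a plain data frame with as.data.frame(). The sketch below uses a shortened version of the data above; the name back is an assumption made for this example.

```r
library(tibble)  # tibble is part of the tidyverse

data <- data.frame(
  name = c("surya", "sai"),
  marks_in_Math = c(91, 90)
)
is_tibble(data)         # FALSE

tibble_data <- as_tibble(data)
is_tibble(tibble_data)  # TRUE
class(tibble_data)      # "tbl_df" "tbl" "data.frame"

# Convert back to a plain data frame when a function requires one
back <- as.data.frame(tibble_data)
is_tibble(back)         # FALSE
```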
