Purrr Package in R Programming

Last Updated : 01 Aug, 2023

Purrr is a popular R package that provides a consistent set of tools for working with functions and vectors. It is part of the tidyverse suite of packages and was developed by Hadley Wickham together with Lionel Henry. Purrr is the tidyverse's main package for functional programming in R.

Purrr provides a set of functions that are designed to work with functional programming concepts, such as mapping, filtering, and reducing. These functions are designed to work with lists, data frames, and other objects, making it easier to work with complex data structures.

The main functions provided by purrr include map(), walk(), reduce(), accumulate(), and compose(). They cover a variety of tasks, such as applying a function to each element of a list, filtering a list based on a condition, and reducing a list to a single value.

Functions in Purrr Package:

| Function | Description |
|---|---|
| pluck() | Extracts a single element from a nested list, given the names or positions that lead to it. |
| map_dbl() | Comparable to map(), but returns a numeric (double) vector instead of a list. It applies the specified function to each element and returns a vector of the same length as the input; the function must return a single number for each element. |
| map_df() | Applies the specified function to each element of the input and combines the results by row into a single data frame. |
| map_chr() | Comparable to map(), but returns a character vector of the same length as the input; the function must return a single string for each element. |
| pmap() | Applies the specified function to the corresponding elements of several input lists in parallel and returns a list with the same length as the input lists. |
| transpose() | Turns a list of lists inside out: the first elements of all sub-lists are collected into the first element of the output list, the second elements into the second element, and so on. This is useful when data has been organised as a list of lists. |
| keep() | Filters a list's elements with a predicate function. Only the elements for which the predicate returns TRUE are included in the new list that is returned. |
| accumulate() | Similar to reduce(), but returns all the intermediate accumulated values rather than a single final value. |
| some() | Checks whether at least one element of a list satisfies a specified predicate function. It returns TRUE if any element matches and FALSE otherwise. |
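
The short sketch below illustrates a few of the helpers from the table on a small nested list. The `people` list and its field names are hypothetical and exist only for this illustration.

```r
library(purrr)

# Hypothetical nested list used only for illustration
people <- list(
  list(name = "Ana", scores = c(90, 85)),
  list(name = "Ben", scores = c(70, 65))
)

# pluck(): take the "name" field of the second element
second_name <- pluck(people, 2, "name")

# map_chr(): one string per element, returned as a character vector
all_names <- map_chr(people, function(p) p$name)

# keep(): retain only the elements whose mean score exceeds 80
high_scorers <- keep(people, function(p) mean(p$scores) > 80)

# some(): does at least one element contain a score below 70?
any_low <- some(people, function(p) any(p$scores < 70))

# pmap(): apply a function to the corresponding elements of several inputs
pair_sums <- pmap(list(a = 1:3, b = 4:6), function(a, b) a + b)
```

Here `second_name` is "Ben", `high_scorers` keeps only the first entry, `any_low` is TRUE, and `pair_sums` is a list holding 5, 7, and 9.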

Installation:

To install the Purrr package, we can use the following code:

R




install.packages("purrr")


Once the package is installed, we can load it into our R environment using the library() function:

R




library(purrr)


Simplifying iteration:

Iteration is a core concept in functional programming. It involves applying a function to a set of inputs, one at a time, and collecting the outputs. The Purrr package provides a more concise and consistent way to iterate than hand-written loops, using the map() and walk() functions.

1. Iteration using map( ) function:

The map() function applies a function to each element of a list or a vector. It takes a list or a vector and a function as inputs and always returns a list of the same length as the input, whatever the input type. Variants such as map_dbl() and map_chr() return atomic vectors instead. The parameters for the map() function are as follows:

Syntax: map(.x, .f, ...)

Parameters:

  • .x: the list or vector to iterate over
  • .f: the function to apply to each element of the list or vector
  • ...: additional arguments to pass to the function. These are optional.

Example 1:

R




# Create a vector of numbers
numbers <- c(1, 2, 3, 4, 5)
 
# Apply function to each element
map(numbers, function(x) x^2)


This will return a new list with the squared values of each element in the original vector.

Output:

> map(numbers, function(x) x^2)
[[1]]
[1] 1

[[2]]
[1] 4

[[3]]
[1] 9

[[4]]
[1] 16

[[5]]
[1] 25
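
When a plain numeric vector is more convenient than a list, the same computation can be written with map_dbl(). The sketch below also shows the formula shorthand, where `~ .x^2` stands for `function(x) x^2`. The variable names are chosen for this illustration.

```r
library(purrr)

numbers <- c(1, 2, 3, 4, 5)

# map_dbl() returns a double vector instead of a list
squares <- map_dbl(numbers, function(x) x^2)

# The formula shorthand ~ .x^2 is equivalent to function(x) x^2
squares_short <- map_dbl(numbers, ~ .x^2)

print(squares)
```

Both calls produce the vector 1, 4, 9, 16, 25.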

Example 2:

R




list_of_dfs <- list(
  data.frame(a = c(1, 2, 3), b = c(4, 5, 6)),
  data.frame(a = c(7, 8, 9), b = c(10, 11, 12)),
  data.frame(a = c(13, 14, 15), b = c(16, 17, 18))
)
 
map(list_of_dfs, function(df) summary(df))


This will return a new list with the summary statistics for each data frame in the original list.

Output:

> map(list_of_dfs, function(df) summary(df))
[[1]]
       a             b      
 Min.   :1.0   Min.   :4.0  
 1st Qu.:1.5   1st Qu.:4.5  
 Median :2.0   Median :5.0  
 Mean   :2.0   Mean   :5.0  
 3rd Qu.:2.5   3rd Qu.:5.5  
 Max.   :3.0   Max.   :6.0  

[[2]]
       a             b       
 Min.   :7.0   Min.   :10.0  
 1st Qu.:7.5   1st Qu.:10.5  
 Median :8.0   Median :11.0  
 Mean   :8.0   Mean   :11.0  
 3rd Qu.:8.5   3rd Qu.:11.5  
 Max.   :9.0   Max.   :12.0  

[[3]]
       a              b       
 Min.   :13.0   Min.   :16.0  
 1st Qu.:13.5   1st Qu.:16.5  
 Median :14.0   Median :17.0  
 Mean   :14.0   Mean   :17.0  
 3rd Qu.:14.5   3rd Qu.:17.5  
 Max.   :15.0   Max.   :18.0  
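
If only one statistic per data frame is needed, a typed variant keeps the result compact. The following sketch, which rebuilds the same list so that it runs on its own, extracts the mean of column a from each data frame. The name `col_a_means` is chosen for this illustration.

```r
library(purrr)

list_of_dfs <- list(
  data.frame(a = c(1, 2, 3), b = c(4, 5, 6)),
  data.frame(a = c(7, 8, 9), b = c(10, 11, 12)),
  data.frame(a = c(13, 14, 15), b = c(16, 17, 18))
)

# One number per data frame: the mean of its column a
col_a_means <- map_dbl(list_of_dfs, function(df) mean(df$a))

print(col_a_means)
```

The values 2, 8, and 14 match the Mean rows for column a in the summaries above.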

2. Iteration using walk( ) function:

The walk() function applies a function to each element of a list or a vector for its side effects, such as printing or saving to a file. It is similar to map(), but instead of building a new list of results it returns its input invisibly, which makes it convenient inside a pipeline. The parameters for the walk() function are as follows:

Syntax: walk(.x, .f, ...)

Parameters:

  • .x: the list or vector to iterate over
  • .f: the function to call on each element for its side effect
  • ...: additional arguments to pass to the function. These are optional.

R




library(purrr)
 
# create a user-defined dataset
df <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(5, 6, 7, 8)
)
 
# use the walk() function to apply a function to each element in the y column
walk(df$y, ~ print(.x * 3))


Output:

> walk(df$y, ~ print(.x * 3))
[1] 15
[1] 18
[1] 21
[1] 24
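
The sketch below illustrates what walk() hands back: the printed values come from the side effect, while the returned object is the original input, unchanged. The variable names are chosen for this illustration.

```r
library(purrr)

values <- c(5, 6, 7, 8)

# walk() prints each tripled value, then returns the original input invisibly
result <- walk(values, function(v) print(v * 3))

# The returned object is the untouched input vector
identical(result, values)
```

Because the input comes back unchanged, a walk() call can sit in the middle of a pipeline without disturbing the data flowing through it.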

Functional Programming Tools:

Purrr provides a set of functional programming tools, such as reduce(), accumulate(), compose(), partial(), etc., that make it easier to work with functions.

1. reduce( ) function:

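The reduce() function from the purrr package successively applies a binary function to the elements of a vector or list and combines them into a single value. The parameters for the reduce() function are as follows:

Syntax: reduce(.x, .f, ..., .init)

Parameters:

  • .x: the list or vector to reduce
  • .f: a binary function. Its first argument is the accumulated value and its second argument is the current element of .x.
  • .init: an optional initial value for the accumulated value. If not specified, the first element of .x is used.

Here's an example using the built-in iris dataset to calculate the sum of the petal lengths for all iris flowers with a sepal length greater than 5: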

R




library(purrr)
library(dplyr)
data("iris")
 
# Filter the iris dataset to include only flowers with a sepal length > 5
long_sepals <- filter(iris, Sepal.Length > 5)
 
# Use reduce() to add up the petal lengths of all long-sepal flowers
petal_sum <- reduce(long_sepals$Petal.Length, `+`)
 
petal_sum


Output:

> petal_sum
[1] 509.2
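
To make the folding order concrete, here is a minimal sketch on a four-element vector, with and without a starting value supplied through `.init`. The variable names are chosen for this illustration.

```r
library(purrr)

# Without .init, reduce() starts from the first element: ((1 + 2) + 3) + 4
total <- reduce(c(1, 2, 3, 4), `+`)

# With .init, the fold starts from the supplied value: (((100 + 1) + 2) + 3) + 4
total_from_100 <- reduce(c(1, 2, 3, 4), `+`, .init = 100)

# .init also gives a defined result when the input is empty
empty_total <- reduce(numeric(0), `+`, .init = 0)
```

The three results are 10, 110, and 0. Supplying `.init` is a reasonable habit whenever the input might be empty.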

2. accumulate( ) function:

The accumulate() function from the purrr package performs iterative calculations on a vector or list. It applies a binary function successively to the elements, like reduce(), but keeps the result of every step instead of only the final one. The parameters for the accumulate() function are as follows:

Syntax: accumulate(.x, .f, ..., .init)

Parameters:

  • .x: the list or vector to accumulate over
  • .f: binary function to be applied successively to the elements of .x. The first argument to the function should be the accumulated value, and the second argument should be the current element of .x.
  • .init: An optional initial value for the accumulated value. If not specified, the first element of .x will be used as the initial accumulated value.

Here's an example of how you could use the accumulate() function from the purrr package to calculate the cumulative sum of the Sepal.Length column in the built-in iris dataset:

R




library(purrr)
library(dplyr)  # provides pull()
 
# Use accumulate() to calculate the cumulative sum of the Sepal.Length column in iris
iris_sum <- iris %>%
  pull(Sepal.Length) %>%
  accumulate(`+`)
 
# Output the resulting cumulative sums
print(iris_sum)


Output:

> print(iris_sum)
  [1]   5.1  10.0  14.7  19.3  24.3  29.7  34.3  39.3  43.7  48.6  54.0  58.8  63.6
 [14]  67.9  73.7  79.4  84.8  89.9  95.6 100.7 106.1 111.2 115.8 120.9 125.7 130.7
 [27] 135.7 140.9 146.1 150.8 155.6 161.0 166.2 171.7 176.6 181.6 187.1 192.0 196.4
 [40] 201.5 206.5 211.0 215.4 220.4 225.5 230.3 235.4 240.0 245.3 250.3 257.3 263.7
 [53] 270.6 276.1 282.6 288.3 294.6 299.5 306.1 311.3 316.3 322.2 328.2 334.3 339.9
 [66] 346.6 352.2 358.0 364.2 369.8 375.7 381.8 388.1 394.2 400.6 407.2 414.0 420.7
 [79] 426.7 432.4 437.9 443.4 449.2 455.2 460.6 466.6 473.3 479.6 485.2 490.7 496.2
 [92] 502.3 508.1 513.1 518.7 524.4 530.1 536.3 541.4 547.1 553.4 559.2 566.3 572.6
[105] 579.1 586.7 591.6 598.9 605.6 612.8 619.3 625.7 632.5 638.2 644.0 650.4 656.9
[118] 664.6 672.3 678.3 685.2 690.8 698.5 704.8 711.5 718.7 724.9 731.0 737.4 744.6
[131] 752.0 759.9 766.3 772.6 778.7 786.4 792.7 799.1 805.1 812.0 818.7 825.6 831.4
[144] 838.2 844.9 851.6 857.9 864.4 870.6 876.5
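
The same behaviour is easier to follow on a short vector. This sketch shows the running totals with and without `.init`. The variable names are chosen for this illustration.

```r
library(purrr)

# Running totals: 1, 1 + 2, 1 + 2 + 3, 1 + 2 + 3 + 4
running <- accumulate(c(1, 2, 3, 4), `+`)

# With .init the starting value is kept as the first result,
# so the output has one more element than the input
running_from_10 <- accumulate(c(1, 2, 3, 4), `+`, .init = 10)

print(running)
print(running_from_10)
```

The first call gives 1, 3, 6, 10 (the same as base R's cumsum()), and the second gives 10, 11, 13, 16, 20.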

3. compose( ) function:

compose() is a function in the purrr package that combines multiple functions into a single composite function. The resulting function applies each of the input functions in turn, with the output of one function being used as the input to the next. By default the functions are applied from right to left, so the last function listed runs first. The parameters for compose() are as follows:

Syntax: compose(..., .dir = c("backward", "forward"))

Parameters:

  • ...: The functions to be composed.
  • .dir: The order of application. "backward" (the default) applies the functions from right to left; "forward" applies them from left to right.

Here's an example of how to use compose() with a built-in dataset in R, specifically the mtcars dataset:

R




library(purrr)
 
# Create a composite function that squares a number and then takes the square root
my_func <- compose(sqrt, function(x) x^2)
 
# Apply the composite function to the mpg column of the mtcars dataset
map_dbl(mtcars$mpg, my_func)


Output:

> map_dbl(mtcars$mpg, my_func)
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4
[17] 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4

In this example, we first create a composite function my_func using compose(). Because compose() applies its functions from right to left by default, my_func first squares its input and then takes the square root of the result. We then use the map_dbl() function from purrr to apply my_func to the mpg column of the mtcars dataset, which returns a vector of the square roots of the squared mpg values. Since every mpg value is positive, the output equals the original column.
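
Order matters when the composed functions do not cancel each other out. The sketch below uses two hypothetical helpers, `add_one` and `times_two`, to contrast the default right-to-left order with `.dir = "forward"`.

```r
library(purrr)

# Hypothetical helper functions used only for illustration
add_one <- function(x) x + 1
times_two <- function(x) x * 2

# Default (.dir = "backward"): the last function runs first,
# so this computes add_one(times_two(x))
backward <- compose(add_one, times_two)

# .dir = "forward": functions run left to right,
# so this computes times_two(add_one(x))
forward <- compose(add_one, times_two, .dir = "forward")

backward(3)  # add_one(times_two(3)) = 7
forward(3)   # times_two(add_one(3)) = 8
```

Choosing `.dir = "forward"` can read more naturally when the functions are listed in the order a pipeline would run them.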

4. partial( ) function:

The partial() function in the purrr package allows you to create a new function by fixing one or more of the arguments of an existing function. The resulting function can then be used with the remaining unfixed arguments. The parameters for partial() are as follows:

Syntax: partial(.f, ...)

Parameters:

  • .f: The function to be partially evaluated.
  • ...: One or more named arguments to fix to specific values. The argument names should correspond to the argument names of .f, and the values should be the desired fixed values.

Example:

R




library(purrr)
 
# Create a toy data frame with weight and horsepower variables
my_cars <- data.frame(
  weight = c(1000, 2000, 3000),
  horsepower = c(100, 200, 300)
)
 
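# Create a partial function with wt fixed to the weight column.
# The ratio wt / hp is stored under the name mpg (a toy value, not real fuel economy).
mpg_func <- partial(function(df, wt, hp) {
  df$mpg <- wt/hp
  df
}, wt = my_cars$weight)
 
# map() iterates over the columns of my_cars (not its rows), passing hp = 150 each time.
# Assigning with $ turns each numeric column into a list, and R warns about the coercion.
map(my_cars, mpg_func, hp = 150)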


Output:

> map(my_cars, mpg_func, hp = 150)
$weight
$weight[[1]]
[1] 1000

$weight[[2]]
[1] 2000

$weight[[3]]
[1] 3000

$weight$mpg
[1]  6.666667 13.333333 20.000000


$horsepower
$horsepower[[1]]
[1] 100

$horsepower[[2]]
[1] 200

$horsepower[[3]]
[1] 300

$horsepower$mpg
[1]  6.666667 13.333333 20.000000
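
A smaller sketch may make the role of partial() easier to see in isolation. It assumes a hypothetical helper `weight_per_hp()` and fixes only its hp argument, leaving wt to be supplied by map_dbl().

```r
library(purrr)

# Hypothetical helper: ratio of weight to horsepower
weight_per_hp <- function(wt, hp) wt / hp

# Fix hp at 150; the resulting function only needs wt
weight_per_150hp <- partial(weight_per_hp, hp = 150)

# Each weight is passed as the remaining (unfixed) wt argument
ratios <- map_dbl(c(1000, 2000, 3000), weight_per_150hp)

print(ratios)
```

The result holds the same three ratios as the mpg entries above, returned as a plain numeric vector and without any coercion warning.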

Conclusion

The Purrr package is a powerful tool for functional programming in R. It provides a consistent way to work with functions and data structures, replacing many hand-written loops with short, readable calls such as map(), reduce(), and partial(). By understanding the concepts and functions in the Purrr package, R programmers can write code that is more concise and easier to maintain.


