
How to randomly shuffle contents of a single column in R dataframe?

Last Updated : 23 Aug, 2021

In this article, we will learn how to randomly shuffle the contents of a single column of a dataframe in the R programming language.

Sample dataframe in use:

c1 c2 c3
a1 w1 1a
b2 x2 2b
c3 y3 3c
d4 z4 4d

Method 1: Using transform() and sample()

In this approach, we use the transform() function to modify the dataframe. We pass it the dataframe, followed by the name of the column to modify and the expression that produces the column's new values.

In the example below, we pass the c2 column to sample(), which returns its values in random order, and assign the result back to c2 by writing c2 = sample(c2).

Syntax: transform( df, column_name = sample(column_name))

Parameters:

df: Dataframe object

column_name: column to be shuffled

sample(): returns the values of the column in random order

The transform() function offers a quick and easy way to modify a data frame. It converts its first argument to a data frame if necessary and returns a modified copy, so the original object is left unchanged.
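To make the copy semantics of transform() concrete, here is a minimal sketch. It reuses the sample dataframe from this article; the variable name df_shuffled is only illustrative. It checks that the original dataframe is untouched and that the shuffled column still holds the same set of values.

```r
# Sample dataframe from the article
df <- data.frame(c1 = c("a1", "b2", "c3", "d4"),
                 c2 = c("w1", "x2", "y3", "z4"),
                 c3 = c("1a", "2b", "3c", "4d"))

# transform() returns a modified copy; df itself is not changed
df_shuffled <- transform(df, c2 = sample(c2))

# The original c2 column is still in its original order
all(as.character(df$c2) == c("w1", "x2", "y3", "z4"))

# The shuffled column contains the same values, possibly in a different order
setequal(as.character(df_shuffled$c2), as.character(df$c2))
```

Both checks should evaluate to TRUE regardless of which random order sample() happens to produce.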

Example: R program to randomly shuffle contents of a column

R




# Create the sample dataframe
df <- data.frame(c1=c("a1", "b2", "c3", "d4"),
                 c2=c("w1", "x2", "y3", "z4"),
                 c3=c("1a", "2b", "3c", "4d"))

# Shuffle only column c2 and print the result
df_shuffled <- transform(df, c2 = sample(c2))
df_shuffled


Output (the order of c2 varies between runs, since the shuffle is random):

  c1 c2 c3
1 a1 y3 1a
2 b2 w1 2b
3 c3 x2 3c
4 d4 z4 4d
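Because sample() is random, the output above changes from run to run. If a repeatable shuffle is needed, one common option is to call set.seed() before sampling. The sketch below assumes an arbitrary seed value of 123, chosen only for illustration, and shows that the same seed yields the same shuffle.

```r
# Sample dataframe from the article
df <- data.frame(c1 = c("a1", "b2", "c3", "d4"),
                 c2 = c("w1", "x2", "y3", "z4"),
                 c3 = c("1a", "2b", "3c", "4d"))

set.seed(123)                            # arbitrary seed, for illustration only
first <- transform(df, c2 = sample(c2))

set.seed(123)                            # reset to the same seed
second <- transform(df, c2 = sample(c2))

# Same seed, same shuffle: the two results are identical
identical(first, second)
```

Resetting the seed immediately before each call is what makes the two shuffles match; without it, consecutive calls to sample() will generally give different orders.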

Method 2: Without using transform()

Here we build a new dataframe by passing the columns of the old dataframe to data.frame(). Column c3 is wrapped in sample(), so the new dataframe contains the values of c3 in shuffled order. The same technique can be used to shuffle several columns at once.

Syntax:

data.frame(c1=df$c1, c2=df$c2, c3=sample(df$c3))

Example: R program to randomly shuffle contents of a column

R

# Create the sample dataframe
df <- data.frame(c1=c("a1", "b2", "c3", "d4"),
                 c2=c("w1", "x2", "y3", "z4"),
                 c3=c("1a", "2b", "3c", "4d"))

# Rebuild the dataframe, shuffling only column c3
df_shuffled <- data.frame(c1=df$c1, c2=df$c2, c3=sample(df$c3))
df_shuffled

Output (one possible result; the order of c3 varies between runs):

  c1 c2 c3
1 a1 w1 1a
2 b2 x2 3c
3 c3 y3 4d
4 d4 z4 2b
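Rebuilding the dataframe column by column becomes tedious when it has many columns. As an alternative sketch (not part of the method above), a single column can also be shuffled by assigning to it with the $ operator on a copy of the dataframe. The name df_shuffled is illustrative.

```r
# Sample dataframe from the article
df <- data.frame(c1 = c("a1", "b2", "c3", "d4"),
                 c2 = c("w1", "x2", "y3", "z4"),
                 c3 = c("1a", "2b", "3c", "4d"))

# Work on a copy so that df keeps its original order
df_shuffled <- df

# Overwrite c3 with a shuffled version of itself; c1 and c2 are untouched
df_shuffled$c3 <- sample(df_shuffled$c3)

df_shuffled
```

This keeps every other column as it is without having to list them, which is the main practical difference from calling data.frame() with each column spelled out.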

Method 3: Randomly shuffling multiple columns

This approach is very similar to the previous one. The only difference is that we call sample() on more than one column. Here we call it on c2 and c3, so both columns are shuffled, each independently of the other.

Syntax:

data.frame(c1=df$c1, c2=sample(df$c2), c3=sample(df$c3))

Example: R program to randomly shuffle contents of multiple columns

R

# Create the sample dataframe
df <- data.frame(c1=c("a1", "b2", "c3", "d4"),
                 c2=c("w1", "x2", "y3", "z4"),
                 c3=c("1a", "2b", "3c", "4d"))

# Rebuild the dataframe, shuffling columns c2 and c3 independently
df_shuffled <- data.frame(c1=df$c1, c2=sample(df$c2), c3=sample(df$c3))
df_shuffled

Output (one possible result; the order of c2 and c3 varies between runs):

  c1 c2 c3
1 a1 w1 2b
2 b2 y3 4d
3 c3 x2 1a
4 d4 z4 3c
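When many columns need shuffling, writing one sample() call per column does not scale well. A possible sketch, assuming a helper vector cols (a hypothetical name) that lists the columns to shuffle, applies sample() to each of them with lapply():

```r
# Sample dataframe from the article
df <- data.frame(c1 = c("a1", "b2", "c3", "d4"),
                 c2 = c("w1", "x2", "y3", "z4"),
                 c3 = c("1a", "2b", "3c", "4d"))

# Names of the columns to shuffle (cols is an illustrative helper variable)
cols <- c("c2", "c3")

# Work on a copy, then replace each listed column with a shuffled version.
# lapply() calls sample() once per column, so each column gets its own order.
df_shuffled <- df
df_shuffled[cols] <- lapply(df_shuffled[cols], sample)

df_shuffled
```

Because each column is shuffled separately, values that shared a row in the original dataframe will usually no longer line up with each other afterwards.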

