Working with CSV files in R Programming

This article discusses how to work with CSV files in the R programming language.

R CSV Files

R CSV files are plain-text files in which the values of each row are separated by a delimiter, typically a comma or a tab. This article uses a small sample file, sample.csv, whose contents are listed later in the article.
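As a quick illustration of the delimiter idea, the sketch below parses the same two records once with a comma and once with a tab. The inline strings and variable names are made up for this example; read.csv() and read.delim() are base R functions whose default separators are a comma and a tab respectively.

```r
# Two tiny in-memory "files" (hypothetical data), one per delimiter.
comma_text <- "id,name\n1,A\n2,B"
tab_text   <- "id\tname\n1\tA\n2\tB"

# read.csv() defaults to sep = ",", read.delim() defaults to sep = "\t".
from_comma <- read.csv(text = comma_text)
from_tab   <- read.delim(text = tab_text)

# Both calls should produce the same data frame.
print(from_comma)
print(identical(from_comma, from_tab))
```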

Getting and Setting the Working Directory with R CSV Files

R




# Get the current working directory.
print(getwd())
 
# Set the working directory (replace this path with a directory that exists on your machine).
setwd("C:/Users/GFG19565/Documents")
 
# Get and print current working directory.
print(getwd())


Output:

[1] "C:/Users/GFG19565/Documents"

[1] "C:/Users/GFG19565/Documents"

The getwd() function returns the current working directory, and the setwd() function sets a new one. The directory passed to setwd() must already exist; otherwise R raises an error.
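A small sketch of a common pattern: remember the current directory, switch to another one, and switch back afterwards. It uses tempdir() as the target only because that directory is guaranteed to exist in any R session; the variable name old_wd is an assumption made for this example.

```r
# Remember the current working directory so it can be restored later.
old_wd <- getwd()

# tempdir() exists for the running R session, so setwd() will not fail here.
setwd(tempdir())
print(getwd())

# Switch back to the original directory when done.
setwd(old_wd)
print(identical(getwd(), old_wd))
```

Restoring the original directory keeps later relative paths in the same script from silently pointing somewhere else.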

Input as R CSV Files

id,name,department,salary,projects
1,A,IT,60754,4
2,B,Tech,59640,2
3,C,Marketing,69040,8
4,D,Marketing,65043,5
5,E,Tech,59943,2
6,F,IT,65000,5
7,G,HR,69000,7

Save this text in a plain-text editor such as Notepad under the name sample.csv so that it can be read into R.
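Instead of using a text editor, the same file can be created from R itself. The following is a minimal sketch that writes the sample rows with writeLines(); the location (the session's temporary directory) and the variable names are assumptions chosen so that the example runs anywhere.

```r
# The sample rows from this article, written without spaces after the commas.
sample_lines <- c(
  "id,name,department,salary,projects",
  "1,A,IT,60754,4",
  "2,B,Tech,59640,2",
  "3,C,Marketing,69040,8",
  "4,D,Marketing,65043,5",
  "5,E,Tech,59943,2",
  "6,F,IT,65000,5",
  "7,G,HR,69000,7"
)

# Hypothetical location: a file inside the session's temporary directory.
sample_path <- file.path(tempdir(), "sample.csv")
writeLines(sample_lines, sample_path)

# Confirm that the file was created.
print(file.exists(sample_path))
```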

Reading an R CSV File

The contents of a CSV file can be read into a data frame in R using the read.csv(…) function. The file must either be in the current working directory (set with setwd(…) if necessary) or be given by its full path, as in the example below. read.csv() can also read a CSV file from a URL.

R




csv_data <- read.csv(file = 'C:\\Users\\GFG19565\\Downloads\\sample.csv')
print(csv_data)
 
# print number of columns
print(ncol(csv_data))
 
# print number of rows
print(nrow(csv_data))


Output:

  id name department salary projects
1  1    A         IT  60754        4
2  2    B       Tech  59640        2
3  3    C  Marketing  69040        8
4  4    D  Marketing  65043        5
5  5    E       Tech  59943        2
6  6    F         IT  65000        5
7  7    G         HR  69000        7
[1] 5
[1] 7


We can load a CSV file by passing its path to read.csv(). The header argument is TRUE by default, so the first line is used for the column names and is not counted as a row. This CSV therefore has 7 rows and 5 columns.
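To see what the header argument changes, the sketch below reads the same text twice, once with the default and once with header = FALSE. The inline text (the first two sample rows) and the variable names are assumptions for illustration; the text argument of read.csv() is used so that no file is needed.

```r
# Header line plus the first two sample rows, kept in memory.
csv_text <- c(
  "id,name,department,salary,projects",
  "1,A,IT,60754,4",
  "2,B,Tech,59640,2"
)

with_header    <- read.csv(text = csv_text)                  # header = TRUE by default
without_header <- read.csv(text = csv_text, header = FALSE)  # first line treated as data

# Rows and columns when the first line supplies the column names.
print(dim(with_header))

# One extra row here, and R generates the column names itself.
print(dim(without_header))
print(names(without_header))
```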

Querying with R CSV Files

SQL-like queries can be run on the CSV content once it is loaded as a data frame, either with the subset(csv_data, condition) function or with bracket indexing. Multiple conditions can be combined in one query using logical operators such as & and |. The result is also a data frame.
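A minimal sketch of subset() with two conditions joined by a logical operator. The data frame is rebuilt inline from the sample rows so that the example is self-contained; the thresholds and the name high_paid_busy are assumptions chosen for illustration.

```r
# Rebuild the sample data in memory (same rows as sample.csv).
csv_data <- read.csv(text = c(
  "id,name,department,salary,projects",
  "1,A,IT,60754,4",
  "2,B,Tech,59640,2",
  "3,C,Marketing,69040,8",
  "4,D,Marketing,65043,5",
  "5,E,Tech,59943,2",
  "6,F,IT,65000,5",
  "7,G,HR,69000,7"
))

# Keep rows where salary exceeds 60000 AND projects is at least 5;
# select = limits the columns that are returned.
high_paid_busy <- subset(csv_data,
                         salary > 60000 & projects >= 5,
                         select = c(name, salary, projects))
print(high_paid_busy)
```

Inside subset(), column names can be written without the csv_data$ prefix, which keeps longer conditions readable.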

R




min_pro <- min(csv_data$projects)
print(min_pro)


Output:

[1] 2

Aggregate functions such as min(), max(), and mean() can be applied to the columns of the CSV data. Here the min() function is applied to the projects column, which is selected with the $ operator. The minimum number of projects, 2, is returned.
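The other aggregates work the same way. The sketch below, which rebuilds the sample data inline so that it runs on its own, applies max() and mean() to single columns and uses table() as one way to count rows per department; the variable names are assumptions for this example.

```r
# Rebuild the sample data in memory (same rows as sample.csv).
csv_data <- read.csv(text = c(
  "id,name,department,salary,projects",
  "1,A,IT,60754,4",
  "2,B,Tech,59640,2",
  "3,C,Marketing,69040,8",
  "4,D,Marketing,65043,5",
  "5,E,Tech,59943,2",
  "6,F,IT,65000,5",
  "7,G,HR,69000,7"
))

max_salary   <- max(csv_data$salary)       # largest salary
avg_projects <- mean(csv_data$projects)    # average number of projects
dept_counts  <- table(csv_data$department) # number of rows per department

print(max_salary)
print(avg_projects)
print(dept_counts)
```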

R




# Selecting 'name' and 'salary' columns for employees with salary greater than 60000
result <- csv_data[csv_data$salary > 60000, c("name", "salary")]
 
# Print the result
print(result)


Output:

  name salary
1    A  60754
3    C  69040
4    D  65043
6    F  65000
7    G  69000

The rows that satisfy the condition inside the brackets are returned as a new data frame. Here, the ‘name’ and ‘salary’ columns are selected for employees whose salary is greater than 60000. The original row numbers (1, 3, 4, 6, 7) are kept as row names.

Writing to an R CSV File

The contents of a data frame can be written to a CSV file with write.csv(data_frame, file), where file is the name of the output CSV. If file is only a file name rather than a full path, the CSV is created in the current working directory.
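A minimal sketch of write.csv(), assuming we want to save the employees who earn more than 60000. The sample data is rebuilt inline, and the output file name high_paid.csv and its location in the session's temporary directory are assumptions made so that the example runs anywhere.

```r
# Rebuild the sample data in memory (same rows as sample.csv).
csv_data <- read.csv(text = c(
  "id,name,department,salary,projects",
  "1,A,IT,60754,4",
  "2,B,Tech,59640,2",
  "3,C,Marketing,69040,8",
  "4,D,Marketing,65043,5",
  "5,E,Tech,59943,2",
  "6,F,IT,65000,5",
  "7,G,HR,69000,7"
))

# Data frame to be saved: name and salary of employees earning more than 60000.
high_paid <- csv_data[csv_data$salary > 60000, c("name", "salary")]

# Hypothetical output location inside the temporary directory.
out_path <- file.path(tempdir(), "high_paid.csv")

# row.names = FALSE stops write.csv() from adding a column of row labels.
write.csv(high_paid, file = out_path, row.names = FALSE)

# Read the file back to check what was written.
round_trip <- read.csv(out_path)
print(round_trip)
```

Without row.names = FALSE, the row labels (1, 3, 4, 6, 7) would be written as an extra first column and would come back as a separate column when the file is read again.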

R




# Calculating the average salary for each department
result <- tapply(csv_data$salary, csv_data$department, mean)
 
# Print the result
print(result)


Output:

       HR        IT Marketing      Tech 
  69000.0   62877.0   67041.5   59791.5 

This example calculates the average salary for each department with tapply(). The result is a named numeric array with one element per department.
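When the per-department averages are needed as a data frame, for example to write them to a CSV file, aggregate() is one alternative to tapply(). The sketch below rebuilds the sample data inline and is not part of the original example; the name avg_salary is an assumption.

```r
# Rebuild the sample data in memory (same rows as sample.csv).
csv_data <- read.csv(text = c(
  "id,name,department,salary,projects",
  "1,A,IT,60754,4",
  "2,B,Tech,59640,2",
  "3,C,Marketing,69040,8",
  "4,D,Marketing,65043,5",
  "5,E,Tech,59943,2",
  "6,F,IT,65000,5",
  "7,G,HR,69000,7"
))

# The formula salary ~ department means "salary grouped by department";
# the result is a data frame with one row per department.
avg_salary <- aggregate(salary ~ department, data = csv_data, FUN = mean)
print(avg_salary)
```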



Last Updated : 11 Mar, 2024