Working with CSV files in R Programming

CSV files are basically the text files wherein the values of each row are separated by a delimiter, as in a comma or a tab. In this article, we will use the following sample CSV file:
sample.csv

id, name, department, salary, projects
1,   A,     IT,        60754,   4
2,   B,     Tech,      59640,   2
3,   C,     Marketing, 69040,   8
4,   D,     Marketing, 65043,   5
5,   E,     Tech,      59943,   2
6,   F,     IT,        65000,   5
7,   G,     HR,        69000,   7

Reading a CSV file

The contents of a CSV file can be read as a data frame in R using the read.csv(…) function. The CSV file to be read should be either present in the current working directory or the directory should be set accordingly using the setwd(…) command in R. The CSV file can also be read from a URL using read.csv() function.

Examples:

filter_none

edit
close

play_arrow

link
brightness_4
code

csv_data <- read.csv(file = 'sample.csv')
print(csv_data)
  
# print number of columns
print (ncol(csv_data)) 
  
# print number of rows
print(nrow(csv_data))  

chevron_right


Output:

       id,   name,    department,    salary,      projects
 1      1      A         HR          60754         14
 2      2      B        Tech         59640         3
 3      3      C       Marketing     69040         8
 4      4      D         HR          65043         5
 5      5      E        Tech         59943         2  
 6      6      F         IT          65000         5
 7      7      G         HR          69000         7
 [1] 4
 [1] 7

The header is by default set to a TRUE value in the function. The head is not included in the count of rows, therefore this CSV has 7 rows and 4 columns.



Querying with CSV files

SQL queries can be performed on the CSV content, and the corresponding result can be retrieved using the subset(csv_data,) function in R. Multiple queries can be applied in the function at a time where each query is separated using a logical operator. The result is stored as a data frame in R.

Examples:

filter_none

edit
close

play_arrow

link
brightness_4
code

csv_data <- read.csv(file ='sample.csv')
min_pro <- min(csv_data$projects)
print (min_pro)

chevron_right


Output:

2

Aggregator functions (min, max, count etc.) can be applied on the CSV data. Here the min() function is applied on projects column using $ symbol. The minimum number of projects which is 2 is returned.

filter_none

edit
close

play_arrow

link
brightness_4
code

csv_data <- read.csv(file ='sample.csv')
new_csv <- subset(csv_data, department == "HR" & projects <10)
print (new_csv)

chevron_right


Output:

      
      id,   name,    department,    salary,      projects
4      4      D         HR          65043         5
7      7      G         HR          69000         7

The subset of the data that is created is stored as a data frame satisfying the conditions specified as the arguments of the function. The employees D and G are HR and have the number of projects<10. The row numbers are retained in the resultant data frame.

Writing into a CSV file

The contents of the data frame can be written into a CSV file. The CSV file is stored in the current working directory with the name specified in the function write.csv(data frame, output CSV name) in R.

Examples:

filter_none

edit
close

play_arrow

link
brightness_4
code

csv_data <- read.csv(file ='sample.csv')
new_csv <- subset(csv_data, department == "HR" & projects <10)
write.csv(new_csv, "new_sample.csv")
new_data <-read.csv(file ='new_sample.csv')
print(new_data)

chevron_right


Output:

     
      X    id,   name,    department,    salary,      projects
1     4      4      D         HR          65043         5
2     7      7      G         HR          69000         7

The column X contains the row numbers of the original CSV file. In order to remove it, we can specify an additional argument in the write.csv() function that set row names to FALSE.

filter_none

edit
close

play_arrow

link
brightness_4
code

csv_data <- read.csv(file ='sample.csv')
new_csv <- subset(csv_data, department == "HR" & projects <10)
write.csv(new_csv, "new_sample.csv", row.names = FALSE)
new_data <-read.csv(file ='new_sample.csv')
print(new_data)

chevron_right


Output:

    
      id,   name,    department,    salary,      projects
1     4      D         HR          65043         5
2     7      G         HR          69000         7

The original row numbers are removed from the new CSV.




My Personal Notes arrow_drop_up

Check out this Author's contributed articles.

If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.


Article Tags :

1


Please write to us at contribute@geeksforgeeks.org to report any issue with the above content.