How to Import a CSV File into R?
A CSV (comma-separated values) file stores tabular data organized in rows and columns. The values in each row are separated by a delimiter, usually a comma. CSV files can be loaded into the R workspace using either built-in functions or external packages.
Method 1: Using read.csv() method
The read.csv() function in base R loads a .csv file into the current session. Its contents can be stored in a variable and manipulated further, and several files can be read into separate variables. The result is returned as a data frame, with rows numbered by integers starting from 1.
Syntax: read.csv(path, header = TRUE, sep = ",")
- path : The path of the file to be imported
- header : Whether the first line of the file contains the column names. TRUE by default.
- sep : The separator between the values in each row. "," by default.
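A minimal sketch of this call follows. The sample file is an assumption made for illustration: it is written to a temporary location with the columns ID, Name, Post, and Age, and the variable names csv_path and data are hypothetical.

```r
# Create a small sample CSV in a temporary location (illustrative data only)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("ID,Name,Post,Age",
             "5,H,CA,67",
             "6,K,SDE,39",
             "7,Z,Admin,28"),
           csv_path)

# Read the file; the first line supplies the column names
data <- read.csv(csv_path, header = TRUE, sep = ",")

# Print the resulting data frame
print(data)
```

In practice, csv_path would be replaced by the path to an existing file on disk.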
Output:
  ID Name  Post Age
1  5    H    CA  67
2  6    K   SDE  39
3  7    Z Admin  28
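If header is set to FALSE, the first line of the file is not used for column names and is read as data instead. The columns then receive default names starting from V1. The output shown here is what a file holding only the three data rows, with no header line, would produce.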
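As a sketch of the header = FALSE case, the example below assumes a file that has no header line at all. The file contents and the names csv_path and data are hypothetical.

```r
# Sample file with data rows only, no header line (illustrative data only)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("5,H,CA,67",
             "6,K,SDE,39",
             "7,Z,Admin,28"),
           csv_path)

# With header = FALSE every line is treated as data,
# and the columns are named V1, V2, ... automatically
data <- read.csv(csv_path, header = FALSE)

print(data)
```

If the file did contain a header line, that line would appear as the first data row instead of being dropped.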
Output:
  V1 V2    V3 V4
1  5  H    CA 67
2  6  K   SDE 39
3  7  Z Admin 28
Method 2: Using read_csv() method
The "readr" package in R is used to read large flat files into the workspace with increased speed and efficiency.
The read_csv() function reads a CSV file and returns its contents as a tibble with the same dimensions as the table stored in the file. When a tibble is printed, only the first ten rows are shown by default, which keeps the output readable for large files. The function also reports the type it assigned to each column, so parsing problems are easier to spot. If the progress argument is enabled, it displays a progress bar showing how much of the file has been read. It is also generally faster than the base R read.csv() function, especially on large files.
Syntax: read_csv(file, col_names, n_max, col_types, progress)
- file : The path of the file to be imported.
- col_names : TRUE by default, meaning the first line supplies the column names. If FALSE, the first line is read as data and the columns are named automatically (X1, X2, and so on).
- n_max : The maximum number of rows to read.
- col_types : The column types. If NULL (the default), the types are guessed from the data. They can also be specified in a compact string format, with one letter per column (for example, "i" for integer and "c" for character).
- progress : Whether to display a progress bar showing how much of the file has been read.
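The arguments above can be combined as in the following sketch. It assumes the readr package is installed, and it uses a hypothetical sample file with the columns ID, Name, Post, and Age. The names csv_path and data are illustrative.

```r
# install.packages("readr")  # run once if the package is not installed
library(readr)

# Create a small sample CSV in a temporary location (illustrative data only)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("ID,Name,Post,Age",
             "5,H,CA,67",
             "6,K,SDE,39",
             "7,Z,Admin,28"),
           csv_path)

# Read at most two data rows, declaring the column types explicitly:
# "icci" = integer, character, character, integer
data <- read_csv(csv_path,
                 col_names = TRUE,
                 n_max = 2,
                 col_types = "icci",
                 progress = FALSE)

# Printing a tibble shows its dimensions and the type of each column
print(data)
```

Supplying col_types explicitly skips type guessing, which can help when the guessed types would be wrong.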