
Read and Write Rectangular Text Data Quickly using R

Last Updated : 18 Oct, 2023

Rectangular text data can be read and written quickly in the R Programming Language with several packages and functions; the right choice depends on your needs and on the data format. Two commonly used packages for this purpose are readr and data.table. Here’s how to use them.

Rectangular Text Data

  1. Rectangular text data is tabular data stored in a plain-text file: each row represents one observation, and each column represents one variable or field.
  2. Values within a row are separated by a delimiter, most often a comma or a tab. Reading and writing such files is a routine part of manipulating and analyzing data in R.

To Read Rectangular Text Data, it’s important to define the delimiter, the data types, and whether the first row contains column names. Stating these up front can save time and memory, because the reader does not have to guess them.

To Write Rectangular Text Data, export your R data structures, such as data frames, to text files. The output file’s location can be specified, along with the delimiter and character string quoting options.
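As a minimal sketch of these choices using only base R (the people data frame and the temporary file are illustrative assumptions, not part of the article's example), the delimiter, header, and column types can all be stated explicitly:

```r
# Illustrative data; the names used here are assumptions for this sketch.
people <- data.frame(
  Name = c("Mina", "Nora"),
  Age = c(25, 30),
  stringsAsFactors = FALSE
)

# A temporary file keeps the sketch from touching the working directory.
path <- tempfile(fileext = ".txt")

# Write: choose the delimiter, drop row names, and do not quote strings.
write.table(people, file = path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read: state the delimiter, that the first row is a header,
# and the type of each column.
people_back <- read.table(
  path,
  sep = "\t",
  header = TRUE,
  colClasses = c("character", "numeric"),
  stringsAsFactors = FALSE
)

str(people_back)
```

The base R functions work without extra packages; the packages described next aim at doing the same job faster on large files.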

Several packages read and write rectangular text data. Let’s look at two major ones that handle CSV files, tab-delimited text files, and other delimited formats.

Different Packages To Read and Write Rectangular Text Data

data.table: The data.table package provides a powerful and fast data manipulation framework in R. It is a reliable option when speed and performance matter because it is well known for processing large datasets efficiently. Its reader and writer handle CSV files, tab-delimited text files, and other regularly delimited files.

Functions to Use

  1. fread – To read data, call the fread function with the file location.
  2. fwrite – To write data, call the fwrite function with the data and the output file path.
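A minimal round trip with these two functions might look like the following sketch. It assumes data.table is already installed; the scores table and the temporary path are illustrative.

```r
library(data.table)

# Illustrative data; the column names are assumptions for this sketch.
scores <- data.table(id = 1:3, score = c(90.5, 72, 88.25))

path <- tempfile(fileext = ".csv")

# fwrite() takes the data first, then the output path.
# The default separator is a comma.
fwrite(scores, path)

# fread() takes the file location and detects the separator
# and the header row on its own.
scores_back <- fread(path)

print(scores_back)
```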

readr: The readr package is a component of the tidyverse ecosystem, a collection of R packages designed to simplify and speed up data manipulation and analysis. readr specializes in reading rectangular text data quickly and accurately. It handles CSV, TSV, other delimited files, and fixed-width files.

Functions to use

  1. write_csv – writes a dataframe to a CSV (Comma-Separated Values) file. Base R offers the similarly named write.csv, which is not part of readr.
  2. read_csv – reads a CSV (Comma-Separated Values) file into a dataframe. Base R offers the similarly named read.csv, which is not part of readr.
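readr's own reader and writer are read_csv() and write_csv(), spelled with an underscore. The following sketch shows a round trip with them, assuming readr is installed; the students data frame and the temporary path are illustrative.

```r
library(readr)

# Illustrative data; the names used here are assumptions for this sketch.
students <- data.frame(
  name = c("Asha", "Ben"),
  marks = c(81, 67)
)

path <- tempfile(fileext = ".csv")

# write_csv() takes the data first, then the output path.
write_csv(students, path)

# read_csv() returns a tibble; col_types states the column types
# explicitly instead of letting readr guess them.
students_back <- read_csv(
  path,
  col_types = cols(name = col_character(), marks = col_double())
)

print(students_back)
```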

Create Rectangular text data

Let’s create rectangular text data with the columns “Name”, “Age”, and “City” and store it in data.txt. This takes two steps: build a dataframe, then save it as a tab-delimited text file.

Step 1: Create a dataframe with the columns “Name”, “Age”, and “City” and store the values in their respective columns.

The data.frame() function creates the dataframe. The values for each column are provided with the c() function, which combines them into one vector per column.

Step 2: Save the dataframe to a tab-delimited text file named “data.txt”.

The write.table() function saves the dataframe to a text file.
file = "data.txt" sets the output file name to "data.txt".
sep = "\t" sets the separator to a tab.
row.names = FALSE omits row names from the output.
quote = FALSE writes character fields without surrounding quotes.

R




# Create a dataframe with columns "Name", "Age", "City" and store its value in respective column
data <- data.frame(
  Name = c("Mina", "Nora", "Tina", "Ram"),
  Age = c(25, 30, 22, 28),
  City = c("India", "Germany", "NY", "America")
)
# Save the data frame to a tab-delimited text file and create a text file
#with name "data.txt"
write.table(data, file = "data.txt", sep = "\t", row.names = FALSE,
            quote = FALSE)


Output (contents of data.txt, tab-separated):

Name	Age	City
Mina	25	India
Nora	30	Germany
Tina	22	NY
Ram	28	America

The file appears in the Files section of the R environment.


This code writes the table to a text file named “data.txt” in the current R working directory.

Efficient Data Manipulation with data.table

Now install the “data.table” package and load it.

install.packages("data.table") downloads and installs the package, which is needed only once. library(data.table) then loads it into the current session so that its functions are available.

R




install.packages("data.table")
 
library(data.table)


Reading Rectangular Text Data (for txt file)

Read data.txt with a tab delimiter and a header row, and declare the column types: Name is character, Age is numeric, and City is character.

  • fread() is a function from the “data.table” package that reads delimited text files quickly and efficiently.
  • “data.txt” is the path of the text file to be read.
  • sep = “\t” states that the fields are separated by tabs.
  • header = TRUE states that the first row in the file contains column names.
  • colClasses = c(“character”, “numeric”, “character”) sets the classes of the columns in the resulting data table, in column order.

R




# Read data.txt with a tab delimiter, a header row, and explicit column types:
# Name is character, Age is numeric, and City is character
 
data <- fread("data.txt", sep = "\t", header = TRUE,
              colClasses = c("character", "numeric", "character"))
 
data


Output:

   Name Age    City
1: Mina  25   India
2: Nora  30 Germany
3: Tina  22      NY
4:  Ram  28 America
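To confirm that colClasses took effect, the class of each column can be inspected after reading. The sketch below is self-contained: it writes a small tab-delimited file to a temporary path (an assumption made so the example does not depend on data.txt being present) and then checks the classes.

```r
library(data.table)

# Build a small tab-delimited file; the temporary path is illustrative.
path <- tempfile(fileext = ".txt")
writeLines(
  c("Name\tAge\tCity", "Mina\t25\tIndia", "Nora\t30\tGermany"),
  path
)

# Same arguments as in the example above.
people <- fread(path, sep = "\t", header = TRUE,
                colClasses = c("character", "numeric", "character"))

# One class per column, in column order.
sapply(people, class)
```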

Writing Rectangular Text Data (for txt file)

Assume ‘data’ holds your rectangular data. fwrite() writes it to the text file “output.txt” with a tab delimiter and no quoting: data is the table to write, “output.txt” is the output file, sep = “\t” sets the separator to a tab, and quote = FALSE writes character fields without surrounding quotes.

R




# Write data to output.txt with tab delimiter, no quoting
 
fwrite(data, "output.txt", sep = "\t", quote = FALSE)


Output (contents of output.txt, tab-separated; fwrite() itself prints nothing to the console):

Name	Age	City
Mina	25	India
Nora	30	Germany
Tina	22	NY
Ram	28	America

Reading and writing Rectangular Text Data in CSV file

Let’s say we have a dataset of employee information that includes salary and department. We want only the rows where GROSS is greater than 50000, and the output file will contain just those rows. Use the dataset Employee_monthly_salary.

Step 1 – Load the “data.table” package. library(data.table) loads the package into the session so that fread() and fwrite() are available.

Step 2 – Read the large CSV file into a data table. fread(“Employee_monthly_salary.csv”) reads the CSV file and returns a data table, which is stored as large_data.

Step 3 – Filter the rows. large_data[GROSS > 50000] keeps only the rows of large_data where the GROSS column (the gross payment) is greater than 50000. This is the “data.table” syntax for conditional subsetting; the column name is written without quotes. The result is stored in a new table called filtered_data.

Step 4 – Write the filtered data to a new CSV file. fwrite(filtered_data, “filtered_sales.csv”) creates a new CSV file called “filtered_sales.csv” from the rows in filtered_data.

R




library(data.table)
 
# Read a large CSV file into a data table
large_data <- fread("Employee_monthly_salary.csv")
 
# Filter rows where the GROSS column is greater than 50000
# (the column name is unquoted so it refers to the column, not to a string)
filtered_data <- large_data[GROSS > 50000]
 
# Write the filtered data to a new CSV file
fwrite(filtered_data, "filtered_sales.csv")


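Because the employee file is not reproduced here, the filtering step can be tried on a small stand-in table. In this sketch only the GROSS column name comes from the text; the EMPLOYEE column, the values, and the temporary output path are hypothetical. The column is referenced by its bare name inside the brackets, since a quoted "GROSS" would be treated as a plain string rather than as the column.

```r
library(data.table)

# Hypothetical stand-in for Employee_monthly_salary.csv.
employees <- data.table(
  EMPLOYEE = c("A", "B", "C"),
  GROSS = c(42000, 58000, 61000)
)

# The bare column name is evaluated inside the table,
# so this keeps the rows whose GROSS exceeds 50000.
high_earners <- employees[GROSS > 50000]

# Write the filtered rows to a temporary CSV file.
out_path <- tempfile(fileext = ".csv")
fwrite(high_earners, out_path)

print(high_earners)
```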

Reading and writing Rectangular Text Data Using readr Package

Suppose we have a dataset of student information such as name, class, and marks. We want only the rows where gender is “female”, and the output file will contain just those students. The readr package is primarily used for reading structured text data into R. It provides functions that efficiently read delimited files such as CSV and TSV as well as fixed-width files. Use the dataset StudentsPerformance.

Step 1: install.packages(“readr”) installs the “readr” package from CRAN (Comprehensive R Archive Network), and library(readr) loads it into the R session.

Step 2: sales_data <- read_csv(“StudentsPerformance.csv”) reads the CSV file “StudentsPerformance.csv” into a dataframe named sales_data. read_csv() is the “readr” function for reading comma-separated files.

Step 3: modified_data <- sales_data[sales_data$gender == ‘female’, ] keeps only the rows of sales_data where the “gender” column is “female” and stores them in a new dataframe called modified_data. The filter uses row indexing with a logical condition.

Step 4: write_csv(modified_data, “femalecategory.csv”) creates a new CSV file called “femalecategory.csv” from the modified_data dataframe. write_csv() is the “readr” function for writing a dataframe to a CSV file.

R




install.packages("readr")
library(readr)
sales_data <- read_csv("StudentsPerformance.csv")
 
print(sales_data)
# Keep only the rows where gender is 'female'
modified_data <- sales_data[sales_data$gender == 'female',]
 
# Write the modified data to a new CSV file
write_csv(modified_data, "femalecategory.csv")


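The same filter-and-write pattern can be tried without the StudentsPerformance file. The sketch below uses a small hypothetical table (the math_score column and its values are assumptions; only the gender column is named in the text) and reads the result back to check what was written.

```r
library(readr)

# Hypothetical stand-in for StudentsPerformance.csv.
students <- data.frame(
  gender = c("female", "male", "female"),
  math_score = c(72, 69, 90)
)

# Row indexing with a logical condition keeps the matching rows.
female_only <- students[students$gender == "female", ]

# Write the filtered rows to a temporary CSV file.
path <- tempfile(fileext = ".csv")
write_csv(female_only, path)

# Read the file back with explicit column types to inspect the result.
check <- read_csv(
  path,
  col_types = cols(gender = col_character(), math_score = col_double())
)

print(check)
```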

Conclusion

Rectangular text data can be read and written efficiently in R with two widely used packages: readr and data.table. Both provide functions optimized for speed and memory efficiency, which makes them suitable for small and large datasets alike.


