Open In App

How to write a SQL query in R?

Last Updated : 05 Feb, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

In this article, we are going to learn how to write SQL queries in the R programming language.

What is SQL Query?

SQL stands for Structured Query Language. SQL queries are used for interacting with a database. Using SQL queries we can access and manipulate data stored in the database. With the help of SQL queries, we can create, read, update and delete data on databases and performs lots more operations on the database.

sqldf package

We are going to use sqldf package to run SQL queries in R. Sqldf is a convenient R tool that allows the execution of SQL operations on R data frames. The databases MySQL, PostgreSQL, H2, and SQLite can all be used with sqldf. We can run SQL queries in R using sqldf package.

Note: Working with SQL becomes very easy in R if we have a good understanding of the SQL commands. Click here to learn the basics of SQL.

Installing and importing sqldf package

Run the below command in R to install sqldf package.

R




# installing sqldf package
install.packages("sqldf")


Now after the installation of sqldf package we have to import it to use its functionalities.

To import sqldf package run the below command in R.

R




# Importing sqldf
library(sqldf)


Performing basic SQL operations using sqldf

We are going to use IRIS data set with ten entries in it on which we are performing various operations using SQL queries.

Reading data using SQL query

Example 1:

In this, we are going to use the SQL select command to view the data of the data frame. For that firstly we import sqldf library and then read the CSV file and store it into a variable “df” as a data frame. After that we are displaying data using select command.

R




# Importing sqldf library
library(sqldf)
# Reading CSV file
df <- read.csv("iris.csv")
# Reading data from data frame
# using SQL select query
sqldf("select * from df")


Output:

 

 Example 2:

In this example, we are going to read a particular column from the data frame by selecting a particular column using the SQL query

R




# Importing sqldf library
library(sqldf)
# Reading CSV file
df <- read.csv("iris.csv")
# Selecting columns from data frame
sqldf("select [sepal_length],
[sepal_width], [species] from df")


Output:

 

The output of the above code shows the data of three columns i.e. sepal_lenght, sepal_width, and species because we select only these three columns in our SQL query.

Sorting the data using SQL query

In this code, we are using order by clause in SQL select statement due to which our data is displayed by sorting the data of the “sepal_length” column as seen in the output.

R




# Importing sqldf library
library(sqldf)
# Reading CSV file
df <- read.csv("iris.csv")
# Reading data from data frame
# by applying sorting on a column
sqldf("select * from df order by sepal_length")


Output:

 

Note: The default order of sort is ascending, to sort the data in descending order we can use the desc keyword at the end of the statement.

Max and Min using SQL query

In this, we are going to use min() and max() functions in the select command using sqldf to get the minimum and maximum value of the column of the data frame.

R




# Importing sqldf library
library(sqldf)
# Reading CSV file
df <- read.csv("iris.csv")
# Applying min() and max() functions
# to get minimum and maximum value.
sqldf("select max(sepal_length) from df")
sqldf("select min(sepal_length) from df")


Output:

 

 

Note: Similar to the functions mentioned above we can also perform any operation on the dataset for a specific data value.

Deleting data using SQL Query

In this, we are going to use the delete command of the SQL query in R to delete the rows of the data frame. As we can see in the output all records of “Iris-virginica” are deleted from the data frame.

R




# Importing sqldf library
library(sqldf)
# Reading CSV file
df<-read.csv("iris.csv")
# Deleting rows where
# species is Iris-virginica
df<-sqldf(c("delete from df where 
        species='Iris-virginica'",
              "select * from df"))
print("After delete rows")
# Displaying data frame
sqldf("select * from df")


Output:

How to write a SQL query in R?

 

Updating data using SQL Query

In this, we are going to use the update command of SQL query in R to update the record in a data frame. As we can see in the output all the species names of “Iris-versicolor” change to “versicolor”.

R




# Importing sqldf library
library(sqldf)
# Reading CSV file
df<-read.csv("iris.csv")
# Update species name from
# Iris-versicolor to versicolor
print("Before update")
sqldf("select * from df")
# Updating values
df <- sqldf(c("update df set species='versicolor'
            where species='Iris-versicolor'",
            "select * from df"))
print("After update")
sqldf("select * from df")


Output:

 



Like Article
Suggest improvement
Previous
Next
Share your thoughts in the comments

Similar Reads