Time series data is sequential data: a series of observations, each associated with a timestamp. Examples include gold prices over a period, temperature ranges, or precipitation during yearly storms. To visualize this kind of data, R offers a handy package called ggplot2. Using ggplot2, we can create all sorts of plots. R also has packages for cleaning up data and transforming or manipulating it to fit our visualization requirements.
This article looks at two datasets: one read from a CSV file and one taken from an R data package.
Dataset 1: EU Covid deaths for March 2020
The dataset gives the daily death counts from Covid-19 for all European countries for March 2020. We will plot the number of deaths (y-axis) against the day (x-axis) for every country.
Data in use can be downloaded from here.
Plot 1: Daily Death Count
The steps for plotting are as follows:
- Open RStudio and create an R Notebook (it offers more options than a plain script).
- Save this file as .Rmd, preferably in the same folder as your data.
- Set the working directory to the folder that contains your data.
- Load the required R libraries.
- Read the data from the CSV.
- The data is spread across columns (wide format). To make plotting easier, reshape it into long format, with one row per observation.
- Plot the data.
- Display the plot.
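The steps above can be sketched roughly as follows. This is a minimal sketch, not the article's exact code: it assumes the CSV has one row per day and one column per country, and it builds a small in-memory data frame with made-up numbers in place of the real file. The column names `day`, `country`, and `deaths` are illustrative assumptions, as is the use of `tidyr` for reshaping.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for the CSV (made-up numbers, not real counts).
# With the real file you would call read.csv() on your downloaded CSV instead.
deaths_wide <- data.frame(
  day         = 1:5,
  `Country A` = c(0, 1, 3, 6, 10),
  `Country B` = c(2, 2, 5, 9, 14),
  `Country C` = c(0, 0, 1, 2, 4),
  check.names = FALSE
)

# Reshape from wide (one column per country) to long (one row per country per day).
deaths_long <- pivot_longer(
  deaths_wide,
  cols      = -day,
  names_to  = "country",
  values_to = "deaths"
)

# One line per country, with day on the x-axis and deaths on the y-axis.
p <- ggplot(deaths_long, aes(x = day, y = deaths, colour = country)) +
  geom_line() +
  labs(x = "Day of March 2020", y = "Daily deaths", colour = "Country")

# Display the plot.
print(p)
```

Long format is what lets ggplot2 map the country name to a single aesthetic (here `colour`), so adding more countries needs no change to the plotting code.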
Plot 2: Covid deaths per capita
We will use the same data as in the previous example, but this time we plot deaths per capita: each country's death count is divided by its population so that countries of different sizes can be compared.
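A per capita calculation could look something like the sketch below. The population table is a hypothetical lookup with made-up values (the source does not say where the population figures come from), and scaling to deaths per million is an assumption chosen for readability.

```r
library(ggplot2)

# Made-up long-format data (not real counts): one row per country per day.
deaths_long <- data.frame(
  day     = rep(1:3, times = 2),
  country = rep(c("Country A", "Country B"), each = 3),
  deaths  = c(10, 20, 40, 5, 10, 15)
)

# Hypothetical population lookup (made-up values).
population <- data.frame(
  country    = c("Country A", "Country B"),
  population = c(10e6, 1e6)
)

# Attach each country's population, then scale deaths to a per-million rate.
per_capita <- merge(deaths_long, population, by = "country")
per_capita$deaths_per_million <-
  per_capita$deaths / per_capita$population * 1e6

p <- ggplot(per_capita,
            aes(x = day, y = deaths_per_million, colour = country)) +
  geom_line() +
  labs(x = "Day", y = "Deaths per million people", colour = "Country")

print(p)
```

In this toy data, Country A has more deaths in absolute terms, but Country B has the higher per capita rate, which is the kind of difference this plot is meant to reveal.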
Dataset 2: Rainfall for US counties during tropical storms.
First, install the hurricaneexposuredata package.
Before installing it, check your R version. In RStudio, go to Tools -> Global Options. In the window that opens, the Basic tab shows the R version.
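The R version can also be checked from the console, which may be quicker than going through the menus. A small sketch using base R only (the variable names are illustrative):

```r
# Print the version of the running R session.
r_version <- getRversion()
print(r_version)

# TRUE if this session runs R 4.0.0 or newer, FALSE otherwise.
is_r4_or_newer <- r_version >= "4.0.0"
print(is_r4_or_newer)
```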
# If the R version is 4.0 or greater
# For R versions lower than 4.0, please install this way
install.packages('hurricaneexposuredata', repos = 'https://geanders.github.io/drat/', type = 'source')