GE Stock Price Analysis Using R Language
Stock analysis is a technique that investors and traders use to decide when to buy and sell. By studying past and current data, they try to gain an edge in the markets through informed decisions.
In this article, we will analyze the ‘GE Stock Price’ dataset using the R Programming Language.
Libraries used:
- dplyr – The dplyr package is a data manipulation framework for R that provides a consistent set of verbs for solving the most common data manipulation problems.
- stringi – The stringi package provides fast, portable, correct, and consistent string/text processing for R in any locale or character encoding.
We start by installing the dplyr and stringi packages.
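A minimal sketch of the installation step follows. It installs a package only if it is missing, and the CRAN mirror URL is an assumption; any mirror you prefer would work equally well.

```r
# Install dplyr and stringi only if they are not already available
pkgs <- c("dplyr", "stringi")
to_install <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(to_install) > 0) {
  # The mirror below is an assumption; substitute your preferred CRAN mirror
  install.packages(to_install, repos = "https://cloud.r-project.org")
}

# Load both packages for the current session
library(dplyr)
library(stringi)
```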
We will now read the CSV file in order to perform the analysis.
We start by defining the path where the CSV file is located on our local machine. We then read the CSV file, store the data in a variable named df, and display the contents of the data frame ‘df’.
The dataset can be downloaded and accessed from the following link: here
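The reading step can be sketched as follows. So that the snippet runs on its own, it first writes a few made-up placeholder rows (not real GE prices) to a temporary file. The file name GEStock.csv and the path are hypothetical; in practice, point `path` at the downloaded CSV file instead.

```r
# Made-up placeholder rows standing in for the real dataset (not actual GE prices)
sample_rows <- data.frame(
  Date = c("1/1/70", "2/1/70", "3/1/70"),
  StockPrice = c(10, 12, 11),
  stringsAsFactors = FALSE
)

# Hypothetical path; replace it with the location of the downloaded CSV file
path <- file.path(tempdir(), "GEStock.csv")
write.csv(sample_rows, path, row.names = FALSE)

# Read the CSV file into a data frame named df
df <- read.csv(path, stringsAsFactors = FALSE)

# Display the first rows of the data frame
head(df)
```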
We will now explore the numerical characteristics of the ‘StockPrice’ column. We obtain a numerical summary of the column using the ‘summary’ function. This displays the minimum, the 1st quartile, the median, the mean, the 3rd quartile, and the maximum stock price.
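As an illustration, the call looks like this on a small data frame of made-up placeholder prices; with the real dataset, `df` would be the data frame read from the CSV file.

```r
# Made-up placeholder prices (not actual GE data)
df <- data.frame(StockPrice = c(10, 12, 11, 15, 14))

# Minimum, 1st quartile, median, mean, 3rd quartile, and maximum
price_summary <- summary(df$StockPrice)
print(price_summary)
```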
We will now produce a histogram of the stock price data using the ‘hist’ function. We pass the ‘StockPrice’ column, add a title and axis labels to the histogram, and set its color using the col parameter.
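A minimal sketch of such a histogram, using made-up placeholder prices. The title, axis labels, and color are illustrative choices rather than the article's exact settings.

```r
# Made-up placeholder prices (not actual GE data)
df <- data.frame(StockPrice = c(10, 12, 11, 15, 14, 13, 12, 16))

# Frequency histogram with a title, axis labels, and a fill color
h <- hist(df$StockPrice,
          main = "Histogram of GE Stock Price",
          xlab = "Stock Price",
          ylab = "Frequency",
          col = "steelblue")
```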
We will now produce a density-based histogram of the stock price data. To produce it, we compute the kernel density estimate of the ‘StockPrice’ column with the ‘density’ function and draw it over the histogram with the ‘lines’ function. We again add a title and axis labels, and set the color using the col parameter.
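One way to sketch this, assuming the histogram is drawn on a density scale (`freq = FALSE`) so that the bars and the curve share the same y-axis. The prices are made-up placeholders, and the colors and labels are illustrative.

```r
# Made-up placeholder prices (not actual GE data)
df <- data.frame(StockPrice = c(10, 12, 11, 15, 14, 13, 12, 16))

# Histogram on a density scale, so that the bar areas sum to 1
h <- hist(df$StockPrice,
          freq = FALSE,
          main = "Density Histogram of GE Stock Price",
          xlab = "Stock Price",
          ylab = "Density",
          col = "lightblue")

# Kernel density estimate of the same column, overlaid as a curve
d <- density(df$StockPrice)
lines(d, col = "red", lwd = 2)
```

The default bandwidth of `density` is used here; a different `bw` value would make the curve smoother or rougher.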
We will now plot a line graph of the stock price data, segregated year-wise. To do this, we extract the year from the Date column, that is, the last 2 digits of each date.
To extract only the year from the Date column, we follow this process:
- Check the class of the Date column
- Convert the Date column’s data to a character data type
- Extract only the year from the Date column using the stri_sub function and apply it to all the data using the mutate function
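The three steps above can be sketched as follows. The rows are made-up placeholders, and the names `mutated_df` and `Year` are assumptions for illustration; the original code may instead overwrite the Date column with the year.

```r
library(dplyr)
library(stringi)

# Made-up placeholder rows (not actual GE data)
df <- data.frame(
  Date = c("1/1/70", "2/1/70", "1/1/71"),
  StockPrice = c(10, 12, 15),
  stringsAsFactors = FALSE
)

# Step 1: check the class of the Date column
class(df$Date)

# Step 2: convert the Date column to character
df$Date <- as.character(df$Date)

# Step 3: keep the last two characters of each date as the year
mutated_df <- df %>%
  mutate(Year = stri_sub(Date, from = -2))

mutated_df
```

A negative `from` in `stri_sub` counts from the end of the string, so `from = -2` selects the last two characters.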
Finally, we plot the graph of the data, passing the mutated df’s Date column as the labels for the x-axis.
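A possible sketch of the line graph, assuming a hypothetical `mutated_df` that already holds the two-digit year in a `Year` column. It uses made-up placeholder values and labels the x-axis once per year, at the first observation of that year; the original plot may label the axis differently.

```r
# Hypothetical mutated data frame with made-up placeholder values
mutated_df <- data.frame(
  Year = c("70", "70", "71", "71", "72"),
  StockPrice = c(10, 12, 15, 14, 16),
  stringsAsFactors = FALSE
)

# Plot the prices in row order, suppressing the default x-axis
idx <- seq_len(nrow(mutated_df))
plot(idx, mutated_df$StockPrice,
     type = "l",
     xaxt = "n",
     main = "GE Stock Price by Year",
     xlab = "Year",
     ylab = "Stock Price",
     col = "blue")

# Label the x-axis with each year at the position of its first observation
year_start <- which(!duplicated(mutated_df$Year))
axis(1, at = year_start, labels = mutated_df$Year[year_start])
```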
We will now group the data by year and plot a bar graph of the maximum stock price in each year.
We first group the mutated df by year, then get the maximum stock price for each year, and finally use the ‘barplot’ function to plot a bar graph showing the maximum stock price in each year.
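The grouping and plotting steps can be sketched as follows, again with made-up placeholder values. The names `mutated_df`, `Year`, `yearly_max`, and `MaxPrice` are assumptions for illustration.

```r
library(dplyr)

# Hypothetical mutated data frame with made-up placeholder values
mutated_df <- data.frame(
  Year = c("70", "70", "71", "71", "72"),
  StockPrice = c(10, 12, 15, 14, 16),
  stringsAsFactors = FALSE
)

# Group by year and take the maximum stock price within each group
yearly_max <- mutated_df %>%
  group_by(Year) %>%
  summarise(MaxPrice = max(StockPrice))

# Bar graph with one bar per year
barplot(yearly_max$MaxPrice,
        names.arg = yearly_max$Year,
        main = "Maximum GE Stock Price per Year",
        xlab = "Year",
        ylab = "Maximum Stock Price",
        col = "orange")
```

Because the year is stored as a two-digit string, the groups are sorted as text. If the data spans a century boundary, "00" would sort before "99", so the bars may need to be reordered.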
The analysis shown above can help in understanding a stock’s short-term and long-term behavior. Depending on the investor’s risk tolerance, it could be developed further into a decision support system that helps the user choose which stock to select from the industry.
Note: In the Date column, the year is given by its last 2 digits only. Hence 1/1/70 means 1st January 1970.