Predictive Analysis in R Programming

Last Updated : 01 Jun, 2020

Predictive analysis in R is a branch of analytics that applies statistical operations to historical data in order to predict future events. The term is common in data mining and machine learning. Methods such as time series analysis and non-linear least squares are used for predictive analysis. Predictive analytics helps businesses by finding relationships in the collected data and using those relationships to predict patterns, which allows them to build predictive intelligence.

In this article, we’ll discuss the process, need and applications of predictive analysis, with example code.

Process of Predictive Analysis

Predictive analysis consists of seven steps:

  • Define project: Define the project's scope, objectives and expected results.
  • Data collection: Data is gathered, for example through data mining, to give a complete view of customer interactions.
  • Data analysis: The data is cleaned, inspected, transformed and modelled.
  • Statistics: The assumptions are validated and the statistical models are tested.
  • Modelling: Predictive models are built using statistics, and the best-performing model is chosen for deployment.
  • Deployment: The predictive model is deployed to automate everyday decision making.
  • Model monitoring: The model's performance is reviewed continuously to make sure it keeps producing the expected results.
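
The data analysis, modelling and monitoring steps above can be sketched in a few lines of base R. This is only an illustrative sketch, not a prescribed workflow: the built-in mtcars data set, the choice of predictors (wt and hp), the hold-out size of 8 rows and the use of RMSE as the monitoring metric are all assumptions made for the example.

```r
# Data collection (assumption): the built-in mtcars data set stands in
# for data gathered from a real project
data(mtcars)

# Data analysis: keep only complete rows and the columns of interest
cars <- mtcars[complete.cases(mtcars), c("mpg", "wt", "hp")]

# Hold out 8 rows so the model can be checked on unseen data
set.seed(42)
test_idx <- sample(nrow(cars), size = 8)
train <- cars[-test_idx, ]
test  <- cars[test_idx, ]

# Modelling: fit a linear model on the training rows
fit <- lm(mpg ~ wt + hp, data = train)

# Statistics: inspect coefficient estimates and their p-values
print(summary(fit)$coefficients)

# Deployment (stand-in): score new observations with the fitted model
pred <- predict(fit, newdata = test)

# Model monitoring: track an error metric such as RMSE on held-out data
rmse <- sqrt(mean((test$mpg - pred)^2))
print(rmse)
```

In practice the monitoring step would recompute such a metric as new data arrives, and a rising error would be a signal to refit the model.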

Need for Predictive Analysis

  • Understanding customer behavior: Predictive analysis uses data mining to extract the attributes and behavior of customers. It also identifies customers' interests, so that a business can present the products that are most likely to be bought.
  • Gain a competitive edge in the market: With predictive analysis, companies can grow faster and stand out from competitors by identifying their strengths and weaknesses.
  • Learn new opportunities to increase revenue: Companies can create new offers or discounts based on customers' buying patterns, which can increase revenue.
  • Find areas of weakening: Companies can win back lost customers by identifying past actions of the company that customers did not like.

Applications of Predictive Analysis

  • Health care: Predictive analysis can be used to examine a patient's history and thereby assess risks.
  • Financial modelling: Predictive analysis plays a major role in identifying trending stocks, which supports business decision making.
  • Customer Relationship Management: Predictive analysis helps firms design marketing campaigns and customer services based on the output of predictive algorithms.
  • Risk Analysis: When forecasting campaigns, predictive analysis can estimate the profit and help evaluate the risks.

Example:

Let us take an example of time series analysis, which is one method of predictive analysis in R programming:




# weekly observations
x <- c(580, 7813, 28266, 59287, 75700,
       87820, 95314, 126214, 218843, 471497,
       936851, 1508725, 2072113)

# library required for decimal_date() function
library(lubridate)

# output to be created as png file
png(file = "predictiveAnalysis.png")

# creating time series object
# from date 22 January, 2020
mts <- ts(x, start = decimal_date(ymd("2020-01-22")),
          frequency = 365.25 / 7)

# plotting the graph
plot(mts, xlab = "Weekly Data of sales",
     ylab = "Total Revenue",
     main = "Sales vs Revenue",
     col.main = "darkgreen")

# saving the file
dev.off()


Output: a line plot of the weekly series, saved as predictiveAnalysis.png.
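
To see how the start and frequency arguments of ts() place the observations on a time axis, the following base R sketch rebuilds the same series without lubridate. It assumes that decimal_date() for 22 January 2020 equals the year plus the days elapsed divided by the 366 days of that leap year. The names start_decimal and approx_dates are introduced here for illustration only.

```r
# Same weekly values as in the example above
x <- c(580, 7813, 28266, 59287, 75700,
       87820, 95314, 126214, 218843, 471497,
       936851, 1508725, 2072113)

# Base R stand-in for decimal_date(ymd("2020-01-22")) (assumption):
# days since 1 January divided by the 366 days of the leap year 2020
start_date <- as.Date("2020-01-22")
year_start <- as.Date("2020-01-01")
start_decimal <- 2020 + as.numeric(start_date - year_start) / 366

# frequency = 365.25 / 7 means roughly 52.18 observations per year,
# so consecutive points are one week apart
mts <- ts(x, start = start_decimal, frequency = 365.25 / 7)

# Observations per unit of time (per year)
print(frequency(mts))

# Decimal-year time stamps of the first three observations
print(as.numeric(time(mts))[1:3])

# Convert the decimal years back to approximate calendar dates
approx_dates <- year_start + round((as.numeric(time(mts)) - 2020) * 366)
print(approx_dates)
```

The recovered dates step forward in seven-day increments from the start date, which is a quick way to confirm that the frequency matches weekly data.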

Forecasting Data:

Now, let us forecast sales and revenue based on the historical data.




# weekly observations
x <- c(580, 7813, 28266, 59287, 75700,
       87820, 95314, 126214, 218843,
       471497, 936851, 1508725, 2072113)

# library required for decimal_date() function
library(lubridate)

# library required for forecasting
library(forecast)

# output to be created as png file
png(file = "forecastSalesRevenue.png")

# creating time series object
# from date 22 January, 2020
mts <- ts(x, start = decimal_date(ymd("2020-01-22")),
          frequency = 365.25 / 7)

# forecasting model using arima model
fit <- auto.arima(mts)

# Next 5 forecasted values
forecast(fit, 5)

# plotting the graph with next
# 5 weekly forecasted values
plot(forecast(fit, 5), xlab = "Weekly Data of Sales",
     ylab = "Total Revenue",
     main = "Sales vs Revenue", col.main = "darkgreen")

# saving the file
dev.off()


Output: a plot of the series with the next five forecast values, saved as forecastSalesRevenue.png.
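
Before relying on any forecast, it helps to check how a forecasting method would have performed on data it has not seen. The base R sketch below holds out the last three observations and compares two simple baselines, a naive forecast and a drift forecast, using mean absolute error. The hold-out length, the two baselines and the helper mae() are assumptions made for illustration and are not part of the ARIMA example above.

```r
# Same weekly values as in the forecasting example
x <- c(580, 7813, 28266, 59287, 75700,
       87820, 95314, 126214, 218843,
       471497, 936851, 1508725, 2072113)

# Hold out the last h observations for testing (h = 3 is an assumption)
h <- 3
train <- head(x, length(x) - h)
test  <- tail(x, h)

# Naive baseline: repeat the last training value h times
naive_fc <- rep(tail(train, 1), h)

# Drift baseline: extend the average step size seen in the training data
step <- (tail(train, 1) - train[1]) / (length(train) - 1)
drift_fc <- tail(train, 1) + step * seq_len(h)

# Hypothetical helper: mean absolute error between actual and predicted
mae <- function(actual, predicted) mean(abs(actual - predicted))

print(mae(test, naive_fc))
print(mae(test, drift_fc))
```

The same comparison could be applied to an ARIMA model by fitting auto.arima() on the training part only and passing as.numeric(forecast(fit, h)$mean) to mae(). A model that cannot beat such simple baselines on held-out data is unlikely to be useful in deployment.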


