Non-Linear Regression in R

Non-linear regression is a form of regression analysis in which observational data are modeled by a function that is a non-linear combination of the model parameters. To perform non-linear regression in R, you can use various functions and packages, including the nls function from the built-in stats package and the nlme and mgcv packages. You provide the model equation and the data, and the function estimates the parameter values that best fit the data. Typical applications include modeling population growth over time or the GDP of a country.

What is Non-Linear Regression?

Non-linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables. Unlike linear regression, where the model is a linear function of its parameters, non-linear regression models the relationship with an equation that is non-linear in at least one parameter. Such a model can capture more complex relationships between the variables, but its parameters usually have to be estimated iteratively, which requires starting values, takes more computation, and may fail to converge.
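As a worked contrast between the two model families used in the examples of this article: what makes a regression non-linear is how the parameters enter the equation, not the shape of the curve. The symbols $\beta_j$, $a$, $b$, and $\varepsilon$ are notation assumed here for illustration.

```latex
% Linear in the parameters: the curve bends in x, but y is a linear
% combination of the coefficients, so ordinary least squares applies.
y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon

% Non-linear in the parameters: b sits inside the exponential, so the
% sensitivity of y to b depends on the values of a and b themselves.
y = a \, e^{b x} + \varepsilon, \qquad
\frac{\partial y}{\partial b} = a \, x \, e^{b x}
```

Taking logarithms of the noise-free exponential gives $\log y = \log a + b x$, which is linear in $\log a$ and $b$. This is one common way to obtain rough starting values for an iterative fit.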

What is R?

R is an open-source programming language widely used for statistical computing and data analysis. It is available on widely used platforms such as Windows, macOS, and Linux, and it generally comes with a command-line interface.

To get started run the following line in the console:
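The examples below rely on base R plus the ggplot2 package for plotting. A minimal setup, assuming ggplot2 is the only add-on package required and may not be installed yet, could look like this:

```r
# Assumption: ggplot2 is the only add-on package the examples need.
# Install it only when it is not already available.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Attach the package so ggplot() and friends can be called directly
library(ggplot2)
```

The nls and lm functions used for fitting ship with R in the stats package, so they need no installation.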


Example 1

This example fits an exponential regression in R using the nls function from the built-in stats package and plots the result with the ggplot2 package. The first step is to load the library and define some data for the independent variable x and the dependent variable y. The exponential model is then fit with nls, which fits a non-linear model by least squares using a formula that defines the relationship between the dependent and independent variables. In this case, the formula is y ~ a * exp(b * x), which represents an exponential function with parameters a and b. The starting values of these parameters are specified in the start argument as the named vector c(a = 4, b = 2).

The exponential formula is given as $y = a \, e^{b x}$.

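# load the plotting library
library(ggplot2)
# define the data (here y = 2^x exactly)
x <- c(0, 1, 2, 3, 4, 5)
y <- c(1, 2, 4, 8, 16, 32)
# fit the model, starting the search from a = 4, b = 2
start_values <- c(a = 4, b = 2)
fit <- nls(y ~ a * exp(b * x),
           start = start_values,
           algorithm = "port",
           control = nls.control(maxiter = 1000))
# print the parameter estimates
summary(fit)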



Formula: y ~ a * exp(b * x)

Parameters:
   Estimate Std. Error   t value Pr(>|t|)    
a 1.000e+00  6.277e-14 1.593e+13   <2e-16 ***
b 6.931e-01  1.331e-14 5.206e+13   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.247e-13 on 4 degrees of freedom

Algorithm "port", convergence message: absolute function convergence (6)
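The starting values in this example were chosen by hand. One common way to pick them, sketched here as an assumption rather than as the article's method, is to take logarithms: for positive y, log(y) = log(a) + b * x is a straight line, so lm on the log scale yields rough values for a and b. The names log_fit and start_guess are made up for this illustration.

```r
# Same data as in Example 1
x <- c(0, 1, 2, 3, 4, 5)
y <- c(1, 2, 4, 8, 16, 32)

# Fit a straight line to log(y); this requires every y to be positive
log_fit <- lm(log(y) ~ x)

# Back-transform: the intercept estimates log(a), the slope estimates b
start_guess <- c(a = exp(unname(coef(log_fit)[1])),
                 b = unname(coef(log_fit)[2]))
print(start_guess)
```

The result can be passed to nls as start = start_guess. Because this data follows y = 2^x exactly, the guess already lands on a = 1 and b = log(2), roughly 0.6931, which agrees with the estimates in the summary.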

Plot the Exponential Regression line with points

# plotting
ggplot(data.frame(x, y), aes(x, y)) +
  geom_point() +
  geom_line(aes(x, predict(fit, newdata = data.frame(x)))) +
  ggtitle("Exponential Regression") +
  xlab("x") +
  ylab("y")



Exponential regression curve fitted to the example data

Example 2

This example fits a polynomial regression of degree 2 in R using the built-in lm (linear model) function and plots the result with ggplot2. Although the fitted curve is non-linear in x, the model is linear in its coefficients, which is why lm can fit it without starting values. The x variable is the sequence of integers from 1 to 10, and the y variable is defined as x^2 + x + 2 plus random noise. The noise is generated using the rnorm function with a mean of 0 and a standard deviation of 10. A data frame with the x and y variables is stored in df. The model is fit with lm, using y as the response and poly(x, 2) as the predictor.

The polynomial equation is given as $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$.

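# x values from 1 to 10
x <- 1:10
# quadratic trend plus random noise, so results differ between runs
y <- x^2 + x + 2 + rnorm(10, 0, 10)
df <- data.frame(x, y)
# fitting the model
fit <- lm(y ~ poly(x, 2), data = df)
# print the coefficient table
summary(fit)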



Call:
lm(formula = y ~ poly(x, 2), data = df)

Residuals:
   Min     1Q Median     3Q    Max 
-9.975 -5.664  2.832  5.523  6.885 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   47.140      2.395  19.686 2.18e-07 ***
poly(x, 2)1   85.981      7.573  11.354 9.21e-06 ***
poly(x, 2)2   18.278      7.573   2.414   0.0465 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.573 on 7 degrees of freedom
Multiple R-squared:  0.9506,    Adjusted R-squared:  0.9365 
F-statistic: 67.37 on 2 and 7 DF,  p-value: 2.676e-05
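The estimates in the coefficient table do not resemble the 2, 1, and 1 used to generate the data because poly(x, 2) builds orthogonal polynomial terms by default. A minimal sketch, assuming noise-free data so the result is easy to check, shows that poly(x, 2, raw = TRUE) reports coefficients on the original scale while producing the same fitted curve. The names fit_orth and fit_raw are introduced here for illustration.

```r
x <- 1:10
y <- x^2 + x + 2            # noise omitted on purpose (assumption)
df <- data.frame(x, y)

fit_orth <- lm(y ~ poly(x, 2), data = df)              # orthogonal basis
fit_raw  <- lm(y ~ poly(x, 2, raw = TRUE), data = df)  # plain x and x^2

# Raw coefficients recover intercept 2, slope 1, and quadratic term 1
print(round(unname(coef(fit_raw)), 6))

# Both parameterizations describe the same fitted curve
print(all.equal(unname(fitted(fit_orth)), unname(fitted(fit_raw))))
```

The orthogonal form is numerically more stable and keeps the terms uncorrelated, while the raw form is easier to read as an equation in x.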

Plot the regression line

#plotting the model
ggplot(df, aes(x, y)) +
  geom_point() +
  geom_line(aes(x, predict(fit))) +
  ggtitle("Polynomial Regression")



Polynomial Regression generated for numbers 1 to 10

Example 3

This example fits a cubic regression in R using the built-in lm (linear model) function and plots the result with ggplot2. The x variable is the sequence of integers from 1 to 10, and the y variable is defined as x^3 - 2x^2 + x + 2 plus random noise. The noise is generated using the rnorm function with a mean of 0 and a standard deviation of 10. A data frame with the x and y variables is stored in df. The cubic model is fit with lm, using poly(x, 3) as the predictor.

The cubic equation is given as $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \varepsilon$.

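# x values from 1 to 10
x <- 1:10
# cubic trend plus random noise, so results differ between runs
y <- x^3 - 2 * x^2 + x + 2 + rnorm(10, 0, 10)
df <- data.frame(x, y)
# fitting the model
fit <- lm(y ~ poly(x, 3), data = df)
summary(fit)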



Call:
lm(formula = y ~ poly(x, 3), data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-11.268  -7.677  -4.150   7.141  16.517 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  234.036      3.826  61.169 1.28e-09 ***
poly(x, 3)1  748.529     12.099  61.867 1.20e-09 ***
poly(x, 3)2  328.759     12.099  27.172 1.64e-07 ***
poly(x, 3)3   61.231     12.099   5.061  0.00231 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.1 on 6 degrees of freedom
Multiple R-squared:  0.9987,    Adjusted R-squared:  0.998 
F-statistic:  1530 on 3 and 6 DF,  p-value: 4.86e-09
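Whether the cubic term is needed can be checked by comparing the nested quadratic and cubic fits. The sketch below assumes a fixed seed (123, chosen arbitrarily) so the run is repeatable; it therefore does not reproduce the numbers in the summary, whose seed is unknown. The names fit2, fit3, comparison, and rss are illustrative.

```r
set.seed(123)  # arbitrary seed, an assumption for repeatability
x <- 1:10
y <- x^3 - 2 * x^2 + x + 2 + rnorm(10, 0, 10)
df <- data.frame(x, y)

fit2 <- lm(y ~ poly(x, 2), data = df)  # quadratic
fit3 <- lm(y ~ poly(x, 3), data = df)  # cubic

# F-test for the extra cubic term; a small p-value favors the cubic model
comparison <- anova(fit2, fit3)
print(comparison)

# Adding a term can never increase the residual sum of squares
rss <- c(quadratic = deviance(fit2), cubic = deviance(fit3))
print(rss)
```

A lower residual sum of squares alone is not evidence for the larger model, since it always decreases or stays equal; the F-test weighs that decrease against the extra parameter.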

Plot the regression line

#plotting the regression line
ggplot(df, aes(x, y)) +
  geom_point() +
  geom_line(aes(x, predict(fit))) +
  ggtitle("Cubic Regression")



Cubic regression x from 1 to 10

Example 4

This example fits a quadratic regression, that is, a polynomial regression of degree 2 as in Example 2, using the built-in lm (linear model) function and plots the result with ggplot2. The x variable is the sequence of integers from 1 to 10, and the y variable is defined as x^2 + 2x + 2 plus random noise. The noise is generated using the rnorm function with a mean of 0 and a standard deviation of 10. A data frame with the x and y variables is stored in df. The quadratic model is fit with lm, using poly(x, 2) as the predictor.

The quadratic equation is given as $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$.

# x values from 1 to 10
x <- 1:10
# quadratic trend plus random noise, so results differ between runs
y <- x^2 + 2 * x + 2 + rnorm(10, 0, 10)
# creating data frame
df <- data.frame(x, y)
# fitting the model
fit <- lm(y ~ poly(x, 2), data = df)
# print the coefficient table
summary(fit)



Call:
lm(formula = y ~ poly(x, 2), data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-13.2189  -7.5274   0.0416   5.6517  16.7008 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   51.901      3.384  15.339 1.21e-06 ***
poly(x, 2)1  102.037     10.700   9.536 2.92e-05 ***
poly(x, 2)2   19.363     10.700   1.810    0.113    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.7 on 7 degrees of freedom
Multiple R-squared:  0.9308,    Adjusted R-squared:  0.9111 
F-statistic: 47.11 on 2 and 7 DF,  p-value: 8.699e-05

Plot the regression line

# import libraries
library(ggplot2)
# plotting the model
ggplot(df, aes(x, y)) +
  geom_point() +
  geom_line(aes(x, predict(fit))) +
  ggtitle("Quadratic Regression")
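With only ten x values, the line drawn from predict(fit) is made of straight segments between the data points. A possible refinement, assuming a fixed arbitrary seed and a hypothetical data frame named grid, is to predict on a denser grid and draw the curve from that instead.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, an assumption for repeatability
x <- 1:10
y <- x^2 + 2 * x + 2 + rnorm(10, 0, 10)
df <- data.frame(x, y)
fit <- lm(y ~ poly(x, 2), data = df)

# Evaluate the fitted curve at 100 evenly spaced x values
grid <- data.frame(x = seq(1, 10, length.out = 100))
grid$y <- predict(fit, newdata = grid)

# Points come from df, the smooth curve comes from grid
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_line(data = grid) +
  ggtitle("Quadratic Regression")
print(p)
```

The same newdata argument also works for extrapolation, for example data.frame(x = 11), although a polynomial fit can behave poorly outside the range of the observed x values.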



Quadratic equation with x equal to 1 through 10

