How to Fix in R: glm.fit: algorithm did not converge

In this article, we will discuss how to fix the "glm.fit: algorithm did not converge" warning in the R programming language.

glm.fit: algorithm did not converge is a warning that R may issue while fitting a logistic regression model. It occurs when a predictor variable perfectly separates the response variable: the likelihood then keeps increasing as the coefficient grows, so the iterative fitting procedure never settles on a finite estimate. To understand this better, let's look at code in which x is the predictor variable and y is the response variable. To reproduce the warning, we create data that is perfectly separable.

Code that produces a warning:

The code below does not raise an error (the program exits with code 0), but it produces warnings, one of which is glm.fit: algorithm did not converge. The cause is perfect separation: in the data created below, the y value is 0 for every negative x value and 1 for every positive x value.

R




# create random data which consists
# of 50 numbers
x <- rnorm(50)

# create data with fifty 1's
y <- rep(1, 50)

# if x value is less than 0 then at that
# index replace 1 with 0 in y
y[x < 0] <- 0

# create dataframe
data <- data.frame(x, y)

# first 6 rows
head(data)

# fitting logistic regression model
glm(y ~ x, data, family = "binomial")


Output

           x y
1  1.3295285 1
2 -0.9738028 0
3  0.6963700 1
4 -1.1586337 0
5 -1.1001865 0
6 -0.6252191 0

Call:  glm(formula = y ~ x, family = "binomial", data = data)

Coefficients:
(Intercept)            x  
     -13.42       273.54  

Degrees of Freedom: 49 Total (i.e. Null);  48 Residual
Null Deviance:    68.03 
Residual Deviance: 1.436e-08 AIC: 4

Warning messages:
1: glm.fit: algorithm did not converge 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 

[Execution complete with exit code 0]
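
Before fitting, it can help to check whether a single numeric predictor separates the two classes. The following is a minimal sketch in base R; the helper name is_separated and the toy vectors are assumptions made for illustration and are not part of the example above.

```r
# Hypothetical helper: TRUE when one numeric predictor perfectly
# separates a 0/1 response, in either direction
is_separated <- function(x, y) {
  x0 <- x[y == 0]
  x1 <- x[y == 1]
  # with only one class present there is nothing to separate
  if (length(x0) == 0 || length(x1) == 0) return(FALSE)
  max(x0) < min(x1) || max(x1) < min(x0)
}

# Toy data (made up): y is 0 for negative x and 1 for positive x
x <- c(-2, -1, -0.5, 0.5, 1, 2)
y <- as.numeric(x > 0)
is_separated(x, y)

# The same x with alternating labels is not separable
y_mixed <- c(0, 1, 0, 1, 0, 1)
is_separated(x, y_mixed)
```

The first call returns TRUE and the second returns FALSE. The check covers only one predictor at a time; separation caused by a combination of several predictors would need a different approach.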

How to fix the warning:

To make the warning go away in this simulated example, the predictor variable must no longer perfectly separate the response variable. Here we achieve that by adding random noise to the predictor. The code below does not produce the algorithm did not converge warning.

R




# create random data which consists of
# 50 numbers
x <- rnorm(50)

# create data with fifty 1's
y <- rep(1, 50)

# if x value is less than 0 then at that
# index replace 1 with 0 in y
y[x < 0] <- 0
 
# create dataframe
data <- data.frame(x, y)
 
# first 6 rows
head(data)  
 
# add noise
data$x <- data$x + rnorm(50)
 
# first 6 rows after data modification
head(data)
 
# fitting logistic regression model
glm(y ~ x, data, family = "binomial")


Output

           x y
1 -0.5787936 0
2  0.1105818 1
3 -0.5324901 0
4  0.6043288 1
5 -0.2479408 0
6  1.2583220 1

           x y
1 0.06909437 0
2 2.01936841 1
3 0.08818184 0
4 0.22230790 1
5 0.19720200 0
6 1.44250592 1

Call:  glm(formula = y ~ x, family = "binomial", data = data)

Coefficients:
(Intercept)            x  
    0.09985      1.97047  

Degrees of Freedom: 49 Total (i.e. Null);  48 Residual
Null Deviance:    69.23 
Residual Deviance: 40.85 AIC: 44.85

[Execution complete with exit code 0]

Here the original predictor values are changed by adding random noise, which breaks the perfect separation in the original data. Whether this approach is appropriate depends entirely on the data. If a predictor is almost perfectly associated with the response, examine the observations or variables responsible, adjust or remove them, and refit the model until the warning no longer appears.
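
To check programmatically whether a fit still raises warnings, the warnings can be collected instead of read from the console. The sketch below uses base R's withCallingHandlers() and the converged component of the fitted glm object; the helper name fit_with_warnings and the two toy data frames are assumptions made for illustration.

```r
# Hypothetical helper: fit a logistic regression and collect any warnings
fit_with_warnings <- function(formula, data) {
  msgs <- character(0)
  fit <- withCallingHandlers(
    glm(formula, data = data, family = "binomial"),
    warning = function(w) {
      # remember the message, then stop it from being printed
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(fit = fit, warnings = msgs, converged = fit$converged)
}

# Toy data (made up): y is 0 for negative x and 1 for positive x
separated <- data.frame(x = c(-3, -2, -1, 1, 2, 3),
                        y = c(0, 0, 0, 1, 1, 1))

# Same x values, but the two classes now overlap
overlapping <- data.frame(x = c(-3, -2, -1, 1, 2, 3),
                          y = c(0, 0, 1, 0, 1, 1))

res_sep <- fit_with_warnings(y ~ x, separated)
res_ovl <- fit_with_warnings(y ~ x, overlapping)

res_sep$warnings   # one or more glm.fit warnings are expected here
res_ovl$warnings   # expected to be empty
res_ovl$converged
```

Which of the two glm.fit warnings appears for the separated data can depend on the data, so the sketch only checks whether any warning was raised.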

Warning Handling

There are two ways to handle the glm.fit: algorithm did not converge warning:

  • Use penalized regression
  • Use the predictor variable to perfectly predict the response variable

Method 1: Use penalized regression:

We can use penalized logistic regression, such as lasso, ridge, or elastic-net regularization, to handle the algorithm did not converge warning. The penalty keeps the coefficient estimates finite even when the data are perfectly separated. To fit a penalized regression, we use the glmnet() function from the glmnet package, which accepts the predictors, the response, the response type, the type of penalty, and so on. Its basic syntax is:

Syntax: glmnet(x, y, family = "binomial", alpha = 1, lambda = NULL)

where

  • x is the matrix of predictor variables (glmnet requires a matrix with at least two columns)
  • y is the response variable
  • family indicates the response type; for a binary response (0, 1) use binomial
  • alpha selects the type of penalty
    • 1 is for lasso regression
    • 0 is for ridge regression
    • values between 0 and 1 give an elastic-net penalty
  • lambda defines the amount of shrinkage; when it is NULL, glmnet computes its own sequence of values

Below is the code that fits a penalized (lasso) logistic regression:

R




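# import necessary libraries
library(glmnet)

# create random data which consists
# of 50 numbers
x <- rnorm(50)

# create data with fifty 1's
y <- rep(1, 50)

# if x value is less than 0 then at that
# index replace 1 with 0 in y
y[x < 0] <- 0

# glmnet needs a predictor matrix with at least
# two columns, so add a second, unrelated noise
# predictor (an assumption made for this example)
x2 <- rnorm(50)
X <- cbind(x, x2)

# fitting lasso regression model
glmnet(X, y, family = "binomial", alpha = 1, lambda = NULL)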
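
To see why a penalty removes the convergence problem, the idea can be sketched in base R without any package. The code below minimizes the logistic negative log-likelihood plus a ridge penalty on the slope using optim(). It only illustrates the principle and is not the algorithm glmnet uses; the function name penalized_nll, the toy data, and the value lambda = 0.1 are assumptions made for this sketch.

```r
# Toy, perfectly separated data (values made up for illustration)
x <- c(-3, -2, -1, 1, 2, 3)
y <- c(0, 0, 0, 1, 1, 1)

# Negative log-likelihood of a logistic regression plus a ridge
# penalty on the slope; beta[1] is the intercept, beta[2] the slope
penalized_nll <- function(beta, x, y, lambda) {
  eta <- beta[1] + beta[2] * x
  # log(1 + exp(eta)) - y * eta is the negative log-likelihood
  # contribution of one observation
  sum(log1p(exp(eta)) - y * eta) + lambda * beta[2]^2
}

# Minimize starting from intercept = 0 and slope = 0
fit <- optim(c(0, 0), penalized_nll,
             x = x, y = y, lambda = 0.1, method = "BFGS")

fit$par          # finite intercept and slope
fit$convergence  # 0 means optim reports successful convergence
```

Without the penalty, the objective for this data keeps decreasing as the slope grows, so no finite minimum exists. With it, very large slopes are penalized and the optimizer stops at a finite value.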


Method 2: Use the predictor variable to perfectly predict the response variable

When the data are perfectly separable, the response variable can be determined directly from the predictor variable. The data considered in this article are clearly separable: for every negative predictor value the response is always 0, and for every positive predictor value the response is 1. So we can perfectly predict the response variable using the predictor variable.

Example:

Below is the code that predicts the response variable from the predictor variable with the help of the predict() function. The returned values are on the log-odds scale: negative values correspond to a predicted response of 0 and positive values to 1.

R




# create random data which consists of
# 5 numbers
x <- rnorm(5)

# create data with five 1's
y <- rep(1, 5)

# if x value is less than 0 then at that index
# replace 1 with 0 in y
y[x < 0] <- 0

# create dataframe
data1 <- data.frame(x, y)

data1

# fit a logistic regression model
model <- glm(y ~ x, data1, family = "binomial")

# predict the response (on the log-odds scale)
# from the predictor values in data1
predict(model, newdata = data1)


Output

           x y
1 -0.4057154 0
2  1.9408241 1
3 -0.2419725 0
4  0.2374463 1
5 -1.6208003 0
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred 
         1          2          3          4          5 
 -39.25575  189.68953  -23.27980   23.49574 -157.80817 

[Execution complete with exit code 0]
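
Because the classes are perfectly separated, the same predictions can be obtained with a simple cut point instead of a fitted model. The following is a minimal sketch; the toy data and the helper name predict_by_threshold are assumptions made for illustration.

```r
# Toy, perfectly separated data (values made up for illustration)
x <- c(-1.6, -0.4, -0.2, 0.2, 1.9)
y <- c(0, 0, 0, 1, 1)

# Any cut point between the largest x with y = 0 and the smallest
# x with y = 1 separates the classes; here we take the midpoint
threshold <- (max(x[y == 0]) + min(x[y == 1])) / 2

# Hypothetical helper: predict 1 above the threshold and 0 otherwise
predict_by_threshold <- function(new_x) as.numeric(new_x > threshold)

predict_by_threshold(x)            # reproduces y on the training data
predict_by_threshold(c(-3, 0.5))   # 0 for the negative value, 1 for the positive one
```

If probabilities are needed from a fitted glm instead, the log-odds returned by predict() can be converted with plogis(), or predict() can be called with type = "response".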


Last Updated : 04 Jul, 2022