XGBoost is an implementation of gradient-boosted decision trees. The library is written in C++ and was designed to improve training speed and model performance. It has been widely used in applied machine learning, and XGBoost models have featured in many winning Kaggle competition entries.

In this algorithm, decision trees are built sequentially. Each new tree is trained to correct the errors of the trees built before it: the gradient of the loss function with respect to the current predictions shows where the ensemble is still wrong, and the next tree is fit to reduce that error. The individual trees, each a weak predictor on its own, are then combined into an ensemble that gives a stronger and more precise model. XGBoost can be used for regression, classification, ranking, and user-defined prediction problems.
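The sequential idea can be illustrated with a small sketch. This is not XGBoost's implementation (XGBoost adds second-order gradient information, regularization, and many optimizations); it is a minimal, pure-Python gradient boosting loop for squared-error regression using depth-1 trees. The helper names `fit_stump`, `boost`, and `mse`, and the toy data, are hypothetical and chosen only for illustration.

```python
def fit_stump(x, residuals):
    """Fit a depth-1 tree: pick the threshold that minimizes squared error."""
    best = None
    # Exclude the largest value so the right-hand side is never empty
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def boost(x, y, n_trees=20, learning_rate=0.3):
    """Build trees one after another, each fit to the current residuals."""
    base = sum(y) / len(y)  # start from the mean prediction
    predictions = [base] * len(y)
    trees = []
    for _ in range(n_trees):
        # For squared error, the negative gradient is the residual
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        trees.append(stump)
        predictions = [pi + learning_rate * stump(xi)
                       for xi, pi in zip(x, predictions)]

    def predict(xi):
        return base + learning_rate * sum(tree(xi) for tree in trees)

    return predict


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Toy data, made up for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]

model = boost(x, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [model(xi) for xi in x])
print(baseline_mse, boosted_mse)
```

The training error of the boosted ensemble should be lower than that of the mean-only baseline, since every added tree is fit to what the previous trees got wrong.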

**XGBoost Features**

The library is focused on computational speed and model performance, so there are few frills.

**Model Features**

Three main forms of gradient boosting are supported:

- Gradient Boosting
- Stochastic Gradient Boosting
- Regularized Gradient Boosting

**System Features**

For use across a range of computing environments, the library provides:

- Parallelization of tree construction
- Distributed computing for training very large models
- Cache optimization of data structures and algorithms

**Steps to Install**

*Windows*

XGBoost uses Git submodules to manage dependencies, so when you clone the repo, remember to specify the `--recursive` option:

```shell
git clone --recursive https://github.com/dmlc/xgboost
```

Windows users who use the GitHub tools can open the Git shell and run the following commands:

```shell
git submodule init
git submodule update
```

*OS X (Mac)*

First, obtain gcc-8 with Homebrew (https://brew.sh/) to enable multi-threading (i.e. using multiple CPU threads for training). The default Apple Clang compiler does not support OpenMP, so using the default compiler would disable multi-threading.

```shell
brew install gcc@8
```

Then install XGBoost with pip:

```shell
pip3 install xgboost
```

You might need to run the command with the `--user` flag if you run into permission errors.

**Code: Python code for XGB Classifier**

```python
# Importing the libraries
import pandas as pd
import xgboost as xgb
from sklearn.compose import ColumnTransformer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Importing the dataset
dataset = pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values

# Encoding categorical data: label-encode columns 1 and 2
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])

# One-hot encode column 1. Recent scikit-learn versions no longer accept
# OneHotEncoder(categorical_features=...), so the column is selected
# with ColumnTransformer instead.
onehotencoder = ColumnTransformer(
    [('onehot', OneHotEncoder(), [1])],
    remainder='passthrough', sparse_threshold=0)
X = onehotencoder.fit_transform(X).astype(float)
# Drop the first dummy column to avoid redundant information
X = X[:, 1:]

# Splitting the dataset into the Training set and Test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Fitting XGBoost to the training data
my_model = xgb.XGBClassifier()
my_model.fit(X_train, y_train)

# Predicting the Test set results
y_pred = my_model.predict(X_test)

# Making the Confusion Matrix and computing the accuracy
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(accuracy_score(y_test, y_pred))
```

**Output**

Accuracy will be about 0.8645
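Accuracy can be read off a confusion matrix as the sum of the diagonal (correct predictions) divided by the total number of samples. The sketch below uses a hypothetical 2x2 matrix with made-up counts, not the result of the model above, laid out as scikit-learn does (rows are true classes, columns are predicted classes).

```python
# Hypothetical confusion matrix with made-up counts
# rows: true class, columns: predicted class
cm = [[1500, 100],
      [200, 200]]

correct = sum(cm[i][i] for i in range(len(cm)))  # diagonal entries
total = sum(sum(row) for row in cm)              # all samples
accuracy = correct / total

print(accuracy)  # 1700 / 2000 = 0.85
```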
