
Sklearn | Model Hyper-parameters Tuning

Last Updated : 16 Oct, 2023

Hyperparameter tuning is the process of finding the optimal values for the hyperparameters of a machine-learning model. Hyperparameters are parameters that control the behaviour of the model but are not learned during training. Hyperparameter tuning is an important step in developing machine learning models because it can significantly improve the model’s performance on new data. However, hyperparameter tuning can be a time-consuming and challenging task. Scikit-learn provides several tools that can help you tune the hyperparameters of your machine-learning models. In this guide, we will provide a comprehensive overview of hyperparameter tuning in Scikit-learn.

What are hyperparameters?

Hyperparameters are parameters that control the behaviour of a machine-learning model but are not learned during training. Some common examples, illustrated in the snippet after this list, include:

  1. Regularization strength: This parameter controls how strongly model complexity is penalized, which helps prevent overfitting.
  2. Number of trees: This parameter controls the number of trees in a random forest model.
  3. Learning rate: This parameter controls how quickly the model learns during training.
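
For instance, the snippet below shows where each of these hyperparameters appears in a few common scikit-learn estimators (the values are arbitrary illustrations, not tuned choices):

Python3

# Where the hyperparameters above appear in scikit-learn estimators
# (the values are arbitrary illustrations, not tuned choices)
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Regularization strength: in LogisticRegression, C is the inverse of the
# regularization strength, so a smaller C means stronger regularization
log_reg = LogisticRegression(C=0.1)

# Number of trees: n_estimators sets how many trees the forest builds
forest = RandomForestClassifier(n_estimators=200)

# Learning rate: learning_rate controls how much each boosting stage contributes
gbm = GradientBoostingClassifier(learning_rate=0.05)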

Why is hyperparameter tuning important?

Tuning hyperparameters is important because it can substantially improve a model's performance on new data. For example, a poorly tuned model may underfit (high bias) or overfit (high variance), and in either case it will not generalize well. A well-tuned model balances bias and variance, so it generalizes well to new data and remains accurate.
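
As a small illustration, the sketch below compares a default SVC with one whose hyperparameters were set by hand on the Iris dataset used throughout this article (the hand-picked values are arbitrary examples, not the result of a search):

Python3

# Compare a default SVC with one whose hyperparameters were set by hand
# (the chosen values are illustrative, not the result of a search)
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

default_score = cross_val_score(SVC(), X, y, cv=5).mean()
tuned_score = cross_val_score(SVC(C=10, gamma=0.1), X, y, cv=5).mean()

print("Default hyperparameters: {:.2f}%".format(default_score * 100))
print("Hand-set hyperparameters: {:.2f}%".format(tuned_score * 100))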

How to tune hyperparameters in Scikit-learn:

Scikit-Learn provides a variety of tools to help you tune the hyperparameters of your machine-learning models. A popular method is to use grid search.

GridSearchCV: Grid search is a brute-force method that evaluates every combination of hyperparameter values in a predefined grid. You can implement grid search in scikit-learn using the GridSearchCV class. GridSearchCV takes a machine learning model and a hyperparameter grid as input; the grid is a dictionary that maps each hyperparameter to the list of values to try. Each combination is evaluated with cross-validation on held-out validation folds, and the combination that achieves the best validation score is selected as the optimal model.

Another popular way to tune hyperparameters is to use random search.

Random Search: Compared to grid search, random search is a cheaper method because it evaluates only a random sample of hyperparameter values. You can implement random search in scikit-learn using the RandomizedSearchCV class. RandomizedSearchCV takes a machine-learning model and a hyperparameter distribution as input; the distribution is a dictionary that maps each hyperparameter to either a list of values or a statistical distribution to sample from. RandomizedSearchCV then draws a fixed number of random candidates from this space and trains the model with each of them.

Each candidate is evaluated with cross-validation on held-out validation folds, and the combination of hyperparameters that achieves the best validation score is selected as the best model.

Advanced hyperparameter tuning techniques

In addition to grid search and random search, there are several more advanced hyperparameter tuning techniques that you can use with Scikit-learn models; a short code sketch follows the list below. These techniques include:

  1. Bayesian optimization: Bayesian optimization is a sequential model-based optimization technique that can be used to search for the optimal hyperparameter values efficiently.
  2. Hyperband: Hyperband is a resource-efficient tuning algorithm that gives many candidate configurations a small training budget and repeatedly keeps only the best-performing ones.
  3. Tree-structured Parzen estimator (TPE): TPE is a sequential model-based optimization technique that models promising and unpromising regions of the search space with Parzen (kernel density) estimators; the "tree" refers to the structure of the search space, not to tree-based models.
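
Bayesian optimization and TPE are provided by third-party libraries such as scikit-optimize, Hyperopt, and Optuna rather than by Scikit-learn itself. A Hyperband-style successive-halving search, however, is available in recent versions of Scikit-learn through the experimental HalvingGridSearchCV class. The sketch below (assuming the same Iris data used elsewhere in this article) shows the general idea:

Python3

# A minimal successive-halving sketch (the idea behind Hyperband) using
# scikit-learn's experimental HalvingGridSearchCV
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingGridSearchCV
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.01, 0.1, 1, 'scale'],
}

# Successive halving starts many candidates on a small budget (here, a small
# number of training samples) and keeps only the best-performing fraction at
# each round, so far fewer full fits are needed than with an exhaustive grid
halving_search = HalvingGridSearchCV(SVC(kernel='rbf'), param_grid, factor=2, cv=5)
halving_search.fit(X, y)

print("Best Hyperparameters:", halving_search.best_params_)
print("Best CV Score: {:.2f}%".format(halving_search.best_score_ * 100))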

Drawbacks of GridSearchCV:

  1. Computationally expensive: GridSearchCV evaluates every combination of hyperparameters in the grid, so it can be very expensive, especially when the search space is large or the dataset has many samples.
  2. Exhaustive search: GridSearchCV performs an exhaustive search over the parameter grid. It evaluates every combination, even those that are unlikely to improve performance, which wastes computation.
  3. Not effective for large search spaces: With a large search space or many hyperparameters, GridSearchCV does not scale well, because the number of combinations grows multiplicatively with every added hyperparameter.
  4. Limited exploration: GridSearchCV only tries the values listed in the grid, so it explores the hyperparameter space less freely than methods such as random search. It introduces no randomness into the search, and good values that fall between the grid points are never tried.
  5. Scalability issues: GridSearchCV may be impractical with some machine learning algorithms and large datasets, since every candidate must be refit for every cross-validation fold.
  6. Does not adapt to results: GridSearchCV does not update its search based on the results of previous evaluations. It does not learn from the performance of earlier hyperparameter combinations and may waste time on similar, unpromising combinations.
  7. Limited parallelization: GridSearchCV can be parallelized to some extent, but not every combination can always be evaluated at the same time, which can limit its efficiency on multi-core processors or distributed computing environments.
  8. Does not address model selection: GridSearchCV focuses only on tuning the hyperparameters of a single estimator; it does not help with choosing between different models or algorithms, which usually requires comparing several estimator types.

SVC Algorithm

GridSearchCV

Python3




# Import necessary libraries
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
 
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
 
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
 
# Define the parameter grid to search over
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly'],
    'gamma': [0.1, 1, 'scale', 'auto'],
}
 
# Create an SVM classifier
svm = SVC()
 
# Create a GridSearchCV object
grid_search = GridSearchCV(
    estimator=svm, param_grid=param_grid, cv=5, n_jobs=-1)
 
# Fit the GridSearchCV object to the training data
grid_search.fit(X_train, y_train)
 
# Print the best hyperparameters and corresponding accuracy score
print("Best Hyperparameters: ", grid_search.best_params_)
print("Best Accuracy Score: {:.2f}%".format(grid_search.best_score_ * 100))
 
# Evaluate the model on the test set
best_svm = grid_search.best_estimator_
test_accuracy = best_svm.score(X_test, y_test)
print("Test Accuracy: {:.2f}%".format(test_accuracy * 100))


Output:

Best Hyperparameters:  {'C': 0.1, 'gamma': 0.1, 'kernel': 'poly'}
Best Accuracy Score: 95.83%
Test Accuracy: 100.00%
  • The output will display the best hyperparameters found during the grid search and the corresponding cross-validation accuracy score.
  • It will also show the accuracy of the best model on the test set.
  • The code is essentially performing hyperparameter optimization to find the best SVM model for the Iris dataset, and it reports the performance of the best model on unseen data.

Random search

Python3




import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from scipy.stats import uniform, expon
 
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
 
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
 
# Define the parameter grid for Grid Search
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly'],
    'gamma': [0.1, 1, 'scale', 'auto'],
}
 
# Define the parameter distributions for Random Search
param_dist = {
    'C': uniform(0.1, 10),
    'kernel': ['linear', 'rbf', 'poly'],
    'gamma': expon(scale=1),
}
 
# Create an SVM classifier
svm = SVC()
 
# Create a GridSearchCV object
grid_search = GridSearchCV(
    estimator=svm, param_grid=param_grid, cv=5, n_jobs=-1)
 
# Create a RandomizedSearchCV object
random_search = RandomizedSearchCV(
    estimator=svm, param_distributions=param_dist, n_iter=50, cv=5, n_jobs=-1)
 
# Fit the GridSearchCV object to the training data
grid_search.fit(X_train, y_train)
 
# Fit the RandomizedSearchCV object to the training data
random_search.fit(X_train, y_train)
 
# Print the best hyperparameters and corresponding accuracy score for Grid Search
print("Grid Search - Best Hyperparameters: ", grid_search.best_params_)
print("Grid Search - Best Accuracy Score: {:.2f}%".format(grid_search.best_score_ * 100))
 
# Print the best hyperparameters and corresponding accuracy score for Random Search
print("Random Search - Best Hyperparameters: ", random_search.best_params_)
print("Random Search - Best Accuracy Score: {:.2f}%".format(random_search.best_score_ * 100))
 
# Evaluate the best models on the test set
best_svm_grid = grid_search.best_estimator_
best_svm_random = random_search.best_estimator_
 
test_accuracy_grid = best_svm_grid.score(X_test, y_test)
test_accuracy_random = best_svm_random.score(X_test, y_test)
 
print("Test Accuracy (Grid Search): {:.2f}%".format(test_accuracy_grid * 100))
print("Test Accuracy (Random Search): {:.2f}%".format(test_accuracy_random * 100))


Output:

Grid Search - Best Hyperparameters:  {'C': 0.1, 'gamma': 0.1, 'kernel': 'poly'}
Grid Search - Best Accuracy Score: 95.83%
Random Search - Best Hyperparameters: {'C': 3.900736564361965, 'gamma': 0.4094567581571069, 'kernel': 'linear'}
Random Search - Best Accuracy Score: 96.67%
Test Accuracy (Grid Search): 100.00%
Test Accuracy (Random Search): 96.67%

The output will display the best hyperparameters found during grid search and random search, along with their corresponding cross-validation accuracy scores.

It will also show the accuracy of the best models found by both methods on the test set.

You can compare the performance of grid search and random search in finding the best hyperparameters for the SVM classifier.

XGBoost algorithm

GridSearchCV

Python3




import xgboost as xgb
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn import datasets
 
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
 
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
 
# Define the hyperparameters and their search ranges
param_grid = {
    'n_estimators': [100, 200, 300],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [3, 4, 5],
    'min_child_weight': [1, 3, 5],
    'subsample': [0.8, 0.9, 1.0],
    'colsample_bytree': [0.8, 0.9, 1.0]
}
 
# Create an XGBoost model
xgb_model = xgb.XGBClassifier()
 
# Perform GridSearchCV
grid_search = GridSearchCV(xgb_model, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
 
# Get the best hyperparameters
best_params = grid_search.best_params_
 
# Fit the model with the best hyperparameters on the entire dataset
best_model = grid_search.best_estimator_
best_model.fit(X_train, y_train)
 
# Evaluate the best model on the test set
accuracy = best_model.score(X_test, y_test)
print(f"Best Hyperparameters: {best_params}")
print(f"Accuracy on test set: {accuracy:.2f}")


Output:

Best Hyperparameters: {'colsample_bytree': 1.0, 'learning_rate': 0.01, 'max_depth': 3, 'min_child_weight': 1, 'n_estimators': 200, 'subsample': 1.0}
Accuracy on test set: 1.00

In this output:

  • The best hyperparameters found by the grid search are listed.
  • The accuracy on the test set is also reported, indicating how well the best model performs on unseen data.
  • The goal of this code is to find the best hyperparameters for an XGBoost classifier and evaluate its performance on the test set.

Random search

Python3




import xgboost as xgb
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn import datasets
 
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
 
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
 
# Define the hyperparameter search space
param_dist = {
    'n_estimators': [100, 200, 300, 400, 500],
    'learning_rate': [0.01, 0.1, 0.2, 0.3, 0.4],
    'max_depth': [3, 4, 5, 6, 7, 8, 9, 10],
    'min_child_weight': [1, 3, 5, 7, 9],
    'subsample': [0.8, 0.9, 1.0],
    'colsample_bytree': [0.6, 0.7, 0.8, 0.9, 1.0],
    'gamma': [0, 0.1, 0.2, 0.3, 0.4],
    'lambda': [0, 0.1, 0.2, 0.3, 0.4]
}
 
# Create an XGBoost model
xgb_model = xgb.XGBClassifier()
 
# Perform RandomizedSearchCV
random_search = RandomizedSearchCV(xgb_model, param_distributions=param_dist, n_iter=100, cv=5, scoring='accuracy', random_state=42)
random_search.fit(X_train, y_train)
 
# Get the best hyperparameters
best_params = random_search.best_params_
 
# Fit the model with the best hyperparameters on the entire dataset
best_model = random_search.best_estimator_
best_model.fit(X_train, y_train)
 
# Evaluate the best model on the test set
accuracy = best_model.score(X_test, y_test)
print(f"Best Hyperparameters: {best_params}")
print(f"Accuracy on test set: {accuracy:.2f}")


Output:

Best Hyperparameters: {'subsample': 0.8, 'n_estimators': 200, 'min_child_weight': 1, 'max_depth': 7, 'learning_rate': 0.01, 'lambda': 0.3, 'gamma': 0.3, 'colsample_bytree': 0.9}
Accuracy on test set: 1.00

In this output:

  • The best hyperparameters found by the random search are listed.
  • The accuracy on the test set is also reported, indicating how well the best model performs on unseen data.
  • Randomized search is a more efficient way to explore hyperparameter space compared to grid search, especially when there are a large number of hyperparameters to consider.

Logistic regression algorithm

GridSearchCV

Python3




from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
import warnings
 
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
 
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
 
# Scale the data using StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
 
# Define the hyperparameters and their search ranges
param_grid = {
    'C': [0.001, 0.01, 0.1, 1, 10, 100],
    'penalty': ['l2'],  # Only 'l2' penalty is compatible with 'lbfgs' solver
    'solver': ['liblinear', 'lbfgs']
}
 
# Create a Logistic Regression model
logistic_regression = LogisticRegression(max_iter=1000)
 
# Perform GridSearchCV with warnings filtered
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=UserWarning)
    grid_search = GridSearchCV(logistic_regression, param_grid, cv=5, scoring='accuracy')
    grid_search.fit(X_train_scaled, y_train)
 
# Get the best hyperparameters
best_params = grid_search.best_params_
 
# Fit the model with the best hyperparameters on the entire dataset
best_model = grid_search.best_estimator_
best_model.fit(X_train_scaled, y_train)
 
# Evaluate the best model on the test set
accuracy = best_model.score(X_test_scaled, y_test)
print(f"Best Hyperparameters: {best_params}")
print(f"Accuracy on test set: {accuracy:.2f}")


Output:

Best Hyperparameters: {'C': 1, 'penalty': 'l2', 'solver': 'lbfgs'}
Accuracy on test set: 1.00

In this code:

  • The best hyperparameters are reported, including ‘C’, ‘penalty’, and ‘solver’.
  • The accuracy on the test set indicates how well the logistic regression model with the best hyperparameters performs on unseen data. In this case, it achieves an accuracy of 1.00 (100%).

Random search

Python3




from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
import numpy as np
 
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
 
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
 
# Scale the data using StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
 
# Define the hyperparameter search space
param_dist = {
    'C': np.logspace(-4, 4, 100),  # Range of C values in logarithmic scale
    'penalty': ['l2'],  # Only 'l2' penalty is compatible with 'lbfgs' solver
    'solver': ['lbfgs']  # Use only the 'lbfgs' solver
}
 
# Create a Logistic Regression model
logistic_regression = LogisticRegression(max_iter=1000)
 
# Perform RandomizedSearchCV with error_score='raise'
random_search = RandomizedSearchCV(logistic_regression, param_distributions=param_dist, n_iter=100, cv=5, scoring='accuracy', random_state=42, error_score='raise')
random_search.fit(X_train_scaled, y_train)
 
# Get the best hyperparameters
best_params = random_search.best_params_
 
# Fit the model with the best hyperparameters on the entire dataset
best_model = random_search.best_estimator_
best_model.fit(X_train_scaled, y_train)
 
# Evaluate the best model on the test set
accuracy = best_model.score(X_test_scaled, y_test)
print(f"Best Hyperparameters: {best_params}")
print(f"Accuracy on test set: {accuracy:.2f}")


Output:

Best Hyperparameters: {'solver': 'lbfgs', 'penalty': 'l2', 'C': 0.6280291441834259}
Accuracy on test set: 1.00

In this code:

  • The best hyperparameters are reported, including ‘C’, ‘penalty’, and ‘solver’.
  • The accuracy on the test set indicates how well the logistic regression model with the best hyperparameters performs on unseen data. In this case, it achieves an accuracy of 1.00 (100%).

Conclusion

Hyperparameter tuning is an essential step in developing machine learning models. Tuning hyperparameters can significantly improve a model's performance on new data, and Scikit-learn provides several tools, such as GridSearchCV and RandomizedSearchCV, to help you tune the hyperparameters of your models.


