# Tune ML Models in No Time with Optuna

This article was published as a part of the Data Science Blogathon

A comprehensive guide for finding the best hyper-parameter for your model efficiently.

Tuning hyperparameter is more efficient with Bayesian optimized algorithms compared to Brute-force algorithms.

You will see how to find the best hyperparameters for XGboost Regressor in this article.

## Introduction

Optimizing ML models is a very challenging job. There are many tools in the market that makes our fine-tuning easier, In order to optimize our model, you need to make sure that our data is meaningful and our ML model fits on it perfectly. In this article, we will see how to fine-tune an Xgboost regressor in the easiest way possible.

In machine learning, a hyperparameter is a parameter whose value is used to control the learning process. ie-LR, regularization, gamma, and so on. Hyperparameters’ values can’t be estimated from the data. These are parameters that control the learning of the model on our data.

You can control the learning parameters in-order to make the model perfectly fit and learn from our data and give skilful predictions. Hyperparameter optimization of ML models gets complicated as our model gets complex. there are some popular ways to find optimized hyperparameters.

**Grid Search:**Grid Search trains the model on all possible hyperparameter combination and then compare using cross-validation.**Randomized Search:**Randomized Search trains a model on a random hyperparameter combination and then compares using cross-validation.**Bayesian approach:**it uses the Bayesian technique to model the search space and to reach an

optimized parameter. There are many handy tools designed for fast

hyperparameter optimization for complex deep learning and ML models like

HyperOpt, Optuna, SMAC, Spearmint, etc.

**Optuna**

Optuna is the SOTA algorithm for fine-tuning ML and deep learning models. It depends on the Bayesian fine-tuning technique. It prunes unpromising trials which don’t further improve our score and try only that combination that improves our score overall.

**Salient Features of Optuna:**

- Efficiently search large spaces and prune unpromising trials for faster results.
- Parallelize hyperparameter searches over multiple threads or processes without modifying code.
- Automated search for optimal hyperparameters using Python conditionals, loops, and syntax.

## Tuning Process using Optuna

**Article Checkpoints:**

- Train Xgboost on the boston_housing dataset.
- Explore hyperparameter tuning in detail.
- Analyze the visualization for better and faster tuning.

## Load the Data and Libraries

Importing necessary modules and loading the data into a data frame.

import sklearn import pandas as pd import numpy as np from sklearn.datasets import load_boston boston = load_boston() df = pd.DataFrame(boston.data , columns = boston.feature_names) df['target'] = boston.target X = df.iloc[:,df.columns != 'target'] y = df.target

**load_boston()** returns the data in matrix form and after framing it into pandas data frame we separated X (features), y (target).

from sklearn.preprocessing import StandardScaler se = StandardScaler() X = se.fit_transform(X)

Scaling the dataset into one scale, helps the model to converge efficiently and split the dataset into train and test.

from sklearn.model_selection import train_test_split X_train, X_test, y_train,y_test = train_test_split(X, y, test_size = 0.2, random_state = 12)

## Before Tuning First Test the Base Model

Building a Vanilla (default) model before gives a scale for comparison,

and we can compare how much our model has improved after parameter

fine-tuning.

import xgboost as xgb from sklearn.model_selection import cross_val_score xg_reg = xgb.XGBRegressor() scores = cross_val_score(xg_reg, X_train,y_train , scoring = 'neg_root_mean_squared_error',n_jobs = -1,cv = 10) print(np.mean(scores), np.std(scores)) print(scores)

for some reason cross_val_score has “neg_root_mean_squared_error”, we can change this into positive by multiplying by -1.

- cross_val_score Evaluate a score by cross-validation.
`KFold`

creates a custom k cross-validation.- here scoring is “root_mean_squared_error” lower is better.
- cv = 10 folds , n_jobs= -1 ( using all available CPUs).

## Define Function that Takes Hyper-Parameters and Return Scores

We often calculate **rmse** in the regressor model and **AUC scores** for the classifier model.

def return_score(param): model = xgb.XGBRegressor(**param) rmse = -np.mean(model_selection.cross_val_score(model,X_train[:1000],y_train[:10000], cv = 4, n_jobs =-1,scoring='neg_root_mean_squared_error')) return rmse

**return_score() **takes parameters as keyword argument and returns the cross_validated value for -(**neg_root_mean_squared_error**), in order to save time, we are taking only 1000 samples.

## Import Optuna

This is the step where actual Optuna comes into the Picture.

import optuna from optuna import Trial, visualization from optuna.samplers import TPESampler

- Samplers class provides us with a class to define our hyper-parameter space.
- Visualization is used to plot the tuning results.
- A trial is a method of determining the effectiveness of an objective function.

#### Creating Hyper-Parameters Space

Time to create our main objective function, we put all the possible

hyper-parameter ranges and the function takes a trial object and should

return a score value.

We have various ways to define the ranges of possible hyper-parameter values:-

**trial.suggest_uniform(‘reg_lambda’,0,2)****trial.suggest_int(‘ParamName’, 2 , 5)****trial.suggest_loguniform(‘learning_rate’,0.05,0.5)**

def objective(trial): param = { "n_estimators" : trial.suggest_int('n_estimators', 0, 500), 'max_depth':trial.suggest_int('max_depth', 3, 5), 'reg_alpha':trial.suggest_uniform('reg_alpha',0,6), 'reg_lambda':trial.suggest_uniform('reg_lambda',0,2), 'min_child_weight':trial.suggest_int('min_child_weight',0,5), 'gamma':trial.suggest_uniform('gamma', 0, 4), 'learning_rate':trial.suggest_loguniform('learning_rate',0.05,0.5), 'colsample_bytree':trial.suggest_uniform('colsample_bytree',0.4,0.9), 'subsample':trial.suggest_uniform('subsample',0.4,0.9), 'nthread' : -1 } return(return_score(param)) # this will return the rmse score

For A detailed explanation on various parameters refer to this link.

#### Start Tuning

In order to start our searching of hyperparameters, we need to first create an **optuna**– study object, which holds all the information of hyper-parameters tried under that study.

The study creates an object which holds all the parameters, history about that particular optimization.

study1 = optuna.create_study(direction='minimize',sampler=TPESampler()) study1.optimize(objective, n_trials= 550,show_progress_bar = True)

**direction = ‘minimize’,***rmse*.**show_progress_bar = True,**gives a nice-looking progress bar.**n_trails = 1050**, is the same as the epoch.**sampler = TPESample()**, Bayesian Sampling Technique

Let’s Run the study !!

After 1050 trials we got our best hyper-parameters as shown below

In the beginning, we had

rmse2.98 and after 500 trials it became 2.54.

**How to get tuned hyper-parameters?**

The study object holds all the information of the particular search.

Calling **study.best_params** returns the optimized parameters in the form of a dictionary.

study1.best_params

## Visualization of Tuning

Visualizing search history gives us a better idea of the effect of

hyper-parameters on our model. we can also further narrow down the

search by using narrower parameters space.

optuna.visualization.plot_slice(study1)

These are the values of various hyper-parameters and their impact on our objective value (**rmse** in this case)

from the above graph, the minimum` `

found when **rmse****max_depth** was 3, **learning_rate** was .054, **n_estimators = 340**, and so on…

We can further deep dive and try to tune using narrower range of hyper-parameters values by taking help of visualizations.

## Compare the Tuned Model and Base Model

Let’s see how much our model has improved.

params = {} print(f"without tuning {return_score(params)}") print(f"with tuning {return_score(study1.best_params)}")

For the default model, we are not specifying any hyperparameters, the model will be using the default hyper-parameters, for the Optimized model we are passing **study1.best_params** which returns the optimized h-parameters in a form of a dictionary.

Perfect !! as you can see after fine-tuning we have got a lower` `

**rmse** by 30 units. It does not end, we can fine-tune further-more by narrowing the hyperparameters ranges.

## Conclusion

In this article we have seen how to optimize ML models, we saw how to find optimized parameters, but this is not the end. you can improve it further by narrowing down the learning parameter ranges.

In the next article, we will learn fine-Tuning of Xgboost Classifier using HyperOpt

Download full Code here.

*Thanks for reading the article, please share if you liked this article.*

**Article From:** Abhishek Jaiswal, Reach out to me on LinkedIn.