Project - Classify Clothes from Fashion MNIST Dataset using Machine Learning Techniques


End to End ML Project - Fashion MNIST - Fine-Tuning the Model - Grid Search - Tuning Hyperparameters

Let us now perform the Grid Search using the dimensionality-reduced training dataset X_train_reduced.

Our best model is a Voting Classifier made up of two models: Logistic Regression and Random Forest. To do the grid search, we have to supply the candidate parameter values for both of the underlying models.

Since grid search is a very compute-intensive process, we are going to try only a handful of parameter combinations; trying more would take a very long time.

NOTE:

In real-world scenarios, you might also like to train XGBoost and, most likely, XGBoost would be the winning model. In that case, you would have to do the hyperparameter tuning of XGBoost as well. A sketch of what that could look like is shown below.
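As a rough illustration only (not part of this exercise): a minimal sketch of tuning XGBoost with GridSearchCV, assuming the xgboost package is installed. The grid values and the names xgb_param_grid, xgb_clf, and xgb_search are illustrative choices, not recommendations.

from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV

# Illustrative grid; a real search would explore wider ranges.
xgb_param_grid = [{
    "n_estimators": [50, 100],
    "max_depth": [4, 6],
    "learning_rate": [0.1, 0.3],
}]

xgb_clf = XGBClassifier(random_state=42)
xgb_search = GridSearchCV(xgb_clf, xgb_param_grid, cv=3)
xgb_search.fit(X_train_reduced, y_train)  # same reduced training set as below
print(xgb_search.best_params_)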

INSTRUCTIONS

Please follow the below steps:

Please import GridSearchCV, VotingClassifier, RandomForestClassifier, and LogisticRegression from scikit-learn:

from sklearn.ensemble import VotingClassifier
from sklearn.ensemble import <<your code goes here>>
from sklearn.linear_model import LogisticRegression
from <<your code comes here>> import GridSearchCV
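For reference, the completed imports should look like the following (in current scikit-learn versions, GridSearchCV lives in the sklearn.model_selection module):

from sklearn.ensemble import VotingClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV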

For logistic regression, we are going to try the following parameters:

multi_class: ["multinomial"], solver: ["lbfgs"], C: [5]

Please note that this gives just one combination.

For Random Forest, we are going to try the following parameters:

n_estimators: [20]
max_depth: [10, 15]

Please note that there are two combinations: one with max_depth 10 and the other with max_depth 15.

In the parameter grid, we need to prefix each parameter name with the name of the model followed by double underscores; for example, lr__C addresses the C parameter of the model named lr.

Please fill in the right values from the parameters mentioned above:

param_grid = [
    {
        "lr__multi_class":["multinomial"],
        "lr__solver":["lbfgs"],
        "lr__C":[<< YOUR CODE GOES HERE>>],
        "rf__n_estimators":[20],
        "rf__max_depth":[<< YOUR CODE GOES HERE>>],
    }]

Please create an instance of LogisticRegression with the parameters multi_class="multinomial", solver="lbfgs", C=10, random_state=42 and assign it to log_clf_ens.

log_clf_ens = << YOUR CODE GOES HERE>>(<< YOUR CODE GOES HERE>>)
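For reference, the filled-in call would look like the line below. (In recent scikit-learn releases the multi_class argument is deprecated; with the lbfgs solver, multinomial behaviour is already the default.)

log_clf_ens = LogisticRegression(multi_class="multinomial", solver="lbfgs",
                                 C=10, random_state=42)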

Please create an instance of RandomForestClassifier with the parameters n_estimators=20, max_depth=10, random_state=42 and assign it to rnd_clf_ens.

rnd_clf_ens = <<YOUR CODE GOES HERE>>(<<YOUR CODE GOES HERE>>)
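Filled in, this reads:

rnd_clf_ens = RandomForestClassifier(n_estimators=20, max_depth=10, random_state=42)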

Please create an instance of VotingClassifier as done earlier and assign it to voting_clf_grid_search:

voting_clf_grid_search = <<YOUR CODE GOES HERE>>(
    estimators=[('lr', log_clf_ens), ('rf', rnd_clf_ens)],
    voting='soft')

NOTE: Please note the name lr given to the Logistic Regression model and rf given to the Random Forest model. These names are exactly what produce the lr__ and rf__ prefixes used in param_grid.
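If you ever forget the exact nested parameter names, you can list every name the grid search can address directly from the ensemble, as in this small sketch:

# Prints every tunable parameter name, including nested ones
# such as lr__C and rf__max_depth.
for name in sorted(voting_clf_grid_search.get_params().keys()):
    print(name)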

We will perform the grid search with 3 folds, i.e. cv=3. Please create an instance of GridSearchCV called grid_search by passing the following parameter values: voting_clf_grid_search, param_grid, cv=3, and scoring='neg_mean_squared_error'.

grid_search = GridSearchCV(<<your code comes here>>)
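Filled in, the call looks like this. Since param_grid has one Logistic Regression combination and two Random Forest combinations, there are 2 candidates in total, so with cv=3 the search trains 6 models:

grid_search = GridSearchCV(voting_clf_grid_search, param_grid,
                           cv=3, scoring='neg_mean_squared_error')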

Run the grid search on the reduced training dataset X_train_reduced:

grid_search.<<your code comes here>>(X_train_reduced, y_train)

Get the best hyperparameter values

grid_search.<<your code comes here>>

Get the best estimator

grid_search.<<your code comes here>>
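Both of these are attributes that GridSearchCV exposes after fitting (best_estimator_ is available because refit defaults to True), so the two blanks above are filled as:

print(grid_search.best_params_)     # best hyperparameter combination found
print(grid_search.best_estimator_)  # voting classifier refitted with those values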

Let's look at the score of each hyperparameter combination tried during the grid search. Because we used scoring='neg_mean_squared_error', each mean_test_score is a negative MSE, so we negate it and take the square root to print an RMSE-style value:

import numpy as np  # np is used below; skip this if NumPy is already imported

cvres = grid_search.cv_results_
for mean_score, params in zip(cvres["mean_test_score"], cvres["params"]):
    print(np.sqrt(-mean_score), params)