GridSearchCV takes the hyperparameters we want to experiment with, along with the candidate values for each, and evaluates every possible combination of them using cross-validation.
Suppose a model has three hyperparameters a, b, and c, and we want to try the following values for them:
a = [8,16,32,64]
b = [True, False]
c = [1,2,3]
If we also set cv to 5, GridSearchCV will try out 4 * 2 * 3 = 24 combinations. Some of these combinations are {a: 8, b: True, c: 1}, {a: 16, b: True, c: 1}, {a: 32, b: True, c: 1}, {a: 8, b: True, c: 2} and {a: 8, b: False, c: 1}.
Because cv is 5, every combination is trained once per fold, so the total number of training rounds is 24 * 5 = 120. In other words, our model will be trained 120 times (plus one final refit of the best combination on the whole training set, since refit is True by default). This can take a long time, which limits how many combinations we can afford to try, so the grid should be chosen carefully.
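The counting above can be checked with a few lines of plain Python. This is only an illustrative sketch: it enumerates the grid with itertools.product rather than calling scikit-learn, and the variable names (combinations, n_fits) are made up for the example.

```python
from itertools import product

# Candidate values from the example above.
param_grid = {
    "a": [8, 16, 32, 64],
    "b": [True, False],
    "c": [1, 2, 3],
}
cv = 5  # number of cross-validation folds

# Every combination picks one value per hyperparameter (a Cartesian product).
names = list(param_grid)
combinations = [
    dict(zip(names, values)) for values in product(*param_grid.values())
]

n_combinations = len(combinations)
n_fits = n_combinations * cv  # one training round per combination per fold

print(n_combinations)  # 24
print(n_fits)          # 120
```

Adding one more value to any of the lists multiplies the number of training rounds, which is why large grids become expensive quickly.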
The syntax of GridSearchCV is:
GridSearchCV(estimator, param_grid, cv = None, scoring = None)
where estimator is our model instance, param_grid is a dictionary whose keys are hyperparameter names and whose values are the lists of values we want to try, cv is the cross-validation parameter (k in k-fold), and scoring is our evaluation metric.
We then run the search by calling the fit() method.
Refer to the GridSearchCV documentation for further details.
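As a rough illustration of this syntax, the sketch below runs a tiny grid search from start to finish. The Ridge model, its alpha grid, and the synthetic data from make_regression are assumptions chosen only to keep the example small and fast; they are not part of this exercise.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only so that the sketch is runnable.
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

model = Ridge()                           # estimator: the model instance
param_grid = {"alpha": [0.1, 1.0, 10.0]}  # hyperparameter name -> values to try

search = GridSearchCV(model, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)  # trains every combination on every fold

print(search.best_params_)  # the combination with the best mean score
print(search.best_score_)   # negative MSE, so values closer to 0 are better
```

Scikit-learn scorers follow a "higher is better" convention, which is why error metrics appear with a neg_ prefix.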
Let's find out the best combination of hyperparameters for our RandomForestRegressor.
Import GridSearchCV from sklearn.model_selection.
Create a new instance of RandomForestRegressor with the name reg_forest. We create a new instance so that training starts from scratch, because the old one, forest_reg, is already trained.
Create a Python dictionary with the name param_grid and specify the following key-value pairs:
{'n_estimators': [3, 10, 30], 'max_features': [2, 4, 6, 8]}
Create an instance of GridSearchCV with the name grid_search. Specify estimator as reg_forest, param_grid as our dictionary param_grid, cv as 5, and scoring as neg_root_mean_squared_error.
Fit grid_search on our training dataset, i.e. (housing_prepared, housing_labels).
Use the best_params_ attribute of grid_search to find the best combination and store it in a variable named best_param. The syntax for accessing an attribute of an object is:
object_name.attribute_name
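Put together, the steps above might look like the sketch below. Note the assumptions: housing_prepared and housing_labels normally come from earlier steps of this project, so random stand-in arrays are generated here only to make the snippet runnable, and random_state=42 is added purely for reproducibility.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Stand-in data (assumption): replace with the real housing_prepared and
# housing_labels built in the earlier steps.
rng = np.random.default_rng(42)
housing_prepared = rng.normal(size=(200, 8))
housing_labels = housing_prepared @ rng.normal(size=8) + rng.normal(scale=0.1, size=200)

# A fresh, untrained model instance.
reg_forest = RandomForestRegressor(random_state=42)

# 3 * 4 = 12 combinations, each trained on 5 folds.
param_grid = {'n_estimators': [3, 10, 30], 'max_features': [2, 4, 6, 8]}

grid_search = GridSearchCV(reg_forest, param_grid, cv=5,
                           scoring='neg_root_mean_squared_error')
grid_search.fit(housing_prepared, housing_labels)

best_param = grid_search.best_params_
print(best_param)
print(-grid_search.best_score_)  # mean cross-validated RMSE of the best combination
```

The values in best_param depend on the data, so the result on the real housing dataset will generally differ from what this stand-in data produces.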