
So, I got the following scores for `DecisionTreeRegressor`:

```
array([70072.56333434, 64669.76437454, 70664.61751498, 68361.78369658,
70788.86300501, 74769.9362526, 69933.57686858, 69833.39043083,
76381.61262044, 68969.41090616])
```

**Note-** You may get different scores, because training a `DecisionTreeRegressor` involves randomness unless its `random_state` is fixed.

Each score is the RMSE of the model on the held-out validation fold of one cross-validation run. Since we set `cv` to 10, there are 10 evaluation scores. Their **mean** and **standard deviation** come out as (70444.55190040627, 3078.3070579465134).
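As a quick check, the two summary values can be recomputed from the ten scores listed above. This is a minimal sketch using only Python's standard library; on the NumPy array returned by `cross_val_score`, `scores.mean()` and `scores.std()` should give the same numbers, since NumPy's `std` also defaults to the population standard deviation.

```python
import statistics

# The 10 cross-validation RMSE scores reported above.
scores = [70072.56333434, 64669.76437454, 70664.61751498, 68361.78369658,
          70788.86300501, 74769.9362526, 69933.57686858, 69833.39043083,
          76381.61262044, 68969.41090616]

mean_rmse = statistics.mean(scores)
# pstdev is the population standard deviation (divides by n, not n - 1).
std_rmse = statistics.pstdev(scores)

print(f"Mean RMSE: {mean_rmse:.2f}")
print(f"Standard deviation: {std_rmse:.2f}")
```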

So, this is the mean RMSE value on the validation data set. The `DecisionTreeRegressor` doesn't look like a good fit: it is overfitting so badly that it performs even worse than the `LinearRegression` model, which had a lower RMSE. (You can cross-validate the `LinearRegression` model in the same way as we did for `DecisionTreeRegressor`. Its mean **RMSE** will most probably be lower than that of the `DecisionTreeRegressor`.)

So, when our Decision Tree model overfits, we can try the Random Forest model. A Random Forest trains several decision trees on random subsets of the training samples and features, then averages their predictions, which usually reduces overfitting considerably.
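The averaging step can be illustrated with a small sketch. It assumes synthetic data from `make_regression` rather than the housing data, and the values chosen for `n_estimators` and `random_state` are arbitrary. It checks that the forest's prediction matches the mean of the predictions of its individual trees, which scikit-learn exposes through the `estimators_` attribute.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption: not the housing data from this exercise).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

forest = RandomForestRegressor(n_estimators=20, random_state=42)
forest.fit(X, y)

# Collect the prediction of every fitted tree in the forest.
tree_predictions = np.array([tree.predict(X) for tree in forest.estimators_])

# Average across trees and compare with the forest's own prediction.
averaged = tree_predictions.mean(axis=0)
matches = np.allclose(averaged, forest.predict(X))
print("Forest prediction equals the mean of its trees:", matches)
```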

Refer to the `RandomForestRegressor` documentation for further details about the estimator.

- Import `RandomForestRegressor` from `sklearn.ensemble`.
- Create an instance of the estimator with the name `forest_reg`.
- Fit the model on our training data, *i.e.* (`housing_prepared`, `housing_labels`).
- Predict the output from the model for our training predictors, *i.e.* (`housing_prepared`), and store the output in a variable named `predictions`.
- Calculate the **RMSE** for our `RandomForestRegressor` model between the actual values (`housing_labels`) and the predicted values (`predictions`), and store its value in a variable named `forest_rmse`.
- Use the `cross_val_score` function with `forest_reg` as the *estimator*, `housing_prepared` and `housing_labels` as the *predictors_data* and *target_variable*, `neg_root_mean_squared_error` as the *scoring* metric, and *cv* as `10`, since we want to perform 10-fold cross-validation. Store the output in a variable named `scores`.
- The scores will be negative. Pass them through the `abs()` function to convert them to positive values: `scores = abs(scores)`

**Note-** It may take some time to cross-validate the Random Forest model.
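The steps above can be sketched as follows. To keep the example self-contained, `housing_prepared` and `housing_labels` are replaced here by synthetic stand-in data from `make_regression`; in the exercise they come from the earlier steps. The `n_estimators=30` and `random_state=42` settings are assumptions made only to keep the run short and repeatable, and the RMSE is computed as the square root of `mean_squared_error`, which is one of several equivalent ways to obtain it.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score

# Stand-in data so the sketch runs on its own (assumption); in the exercise,
# housing_prepared and housing_labels are prepared in the earlier steps.
housing_prepared, housing_labels = make_regression(
    n_samples=300, n_features=8, noise=20.0, random_state=42)

# Create and fit the estimator (n_estimators and random_state are assumptions).
forest_reg = RandomForestRegressor(n_estimators=30, random_state=42)
forest_reg.fit(housing_prepared, housing_labels)

# RMSE on the training data itself.
predictions = forest_reg.predict(housing_prepared)
forest_rmse = np.sqrt(mean_squared_error(housing_labels, predictions))

# 10-fold cross-validation; the scorer returns negated RMSE values.
scores = cross_val_score(forest_reg, housing_prepared, housing_labels,
                         scoring="neg_root_mean_squared_error", cv=10)
scores = abs(scores)

print("Training RMSE:", forest_rmse)
print("Mean CV RMSE:", scores.mean())
print("Std of CV RMSE:", scores.std())
```

Comparing `forest_rmse` with `scores.mean()` gives a rough sense of how much the model still overfits: a training RMSE far below the cross-validated RMSE suggests the forest fits the training set more closely than it generalizes.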
