Observing Random Forests

I got a forest_rmse of 18641.94835072352. Your value may differ because Random Forest training is stochastic. This is much better than our two previous models.

I also got the following cross-validation scores for RandomForestRegressor:

array([49365.63140746, 47726.0382676 , 49823.27396708, 52361.60100788,
       49640.9498092 , 53619.30260892, 49001.1816177 , 47790.01251808,
       53163.0727864 , 49998.12196605])

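Note: Your scores may differ because Random Forest training is stochastic (each tree is built from a random bootstrap sample and random feature subsets). By default, cross_val_score() itself splits the data deterministically when given an integer number of folds.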

The mean and standard deviation of the scores are (50248.91859563662, 1991.9110531037534).
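
As a quick check, the two summary numbers can be reproduced from the ten scores listed earlier. This is a minimal sketch using NumPy; the variable name forest_rmse_scores is an assumption and may not match the name used in your notebook.

```python
import numpy as np

# The ten cross-validation RMSE scores reported in this section.
# `forest_rmse_scores` is an illustrative name, not necessarily the tutorial's.
forest_rmse_scores = np.array([
    49365.63140746, 47726.0382676, 49823.27396708, 52361.60100788,
    49640.9498092, 53619.30260892, 49001.1816177, 47790.01251808,
    53163.0727864, 49998.12196605,
])

# np.std defaults to the population standard deviation (ddof=0).
mean_score = forest_rmse_scores.mean()
std_score = forest_rmse_scores.std()

print(mean_score, std_score)
```

The standard deviation gives a rough sense of how much the validation RMSE varies from fold to fold, which a single train/validation split would not reveal.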

This looks much more promising than our previous two models. However, the model is still overfitting the training set: the training RMSE (about 18,642) is much lower than the mean validation RMSE (about 50,249). To reduce overfitting, you can tune the Random Forest's hyperparameters or constrain (regularize) the model.
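
The comparison between training and validation error can be sketched as follows. This example runs on synthetic data from make_regression (an assumption, since the project's own dataset is not reproduced here), and the helper train_and_validation_rmse and the specific values of max_depth and min_samples_leaf are illustrative choices, not recommended settings.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption: replace with your prepared features and labels).
X, y = make_regression(n_samples=300, n_features=8, noise=20.0, random_state=42)


def train_and_validation_rmse(model):
    """Return (training RMSE, mean cross-validated RMSE) for a model."""
    model.fit(X, y)
    train_rmse = np.sqrt(mean_squared_error(y, model.predict(X)))
    # scikit-learn reports negated MSE so that higher is always better.
    neg_mse = cross_val_score(model, X, y,
                              scoring="neg_mean_squared_error", cv=5)
    return train_rmse, np.sqrt(-neg_mse).mean()


# Unconstrained forest: trees grow until the leaves are (nearly) pure.
default_forest = RandomForestRegressor(n_estimators=50, random_state=42)

# Constrained forest: shallower trees and larger leaves (illustrative values).
constrained_forest = RandomForestRegressor(
    n_estimators=50, max_depth=4, min_samples_leaf=5, random_state=42)

default_train, default_val = train_and_validation_rmse(default_forest)
constrained_train, constrained_val = train_and_validation_rmse(constrained_forest)

print("default:     train %.1f, validation %.1f" % (default_train, default_val))
print("constrained: train %.1f, validation %.1f" % (constrained_train, constrained_val))
```

A large gap between the two numbers for a model is the overfitting signal described above. Constraining the forest usually narrows the gap by raising the training error; whether it also lowers the validation error depends on the data, so the settings need to be checked with cross-validation.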

But before you dig deeper into Random Forests, you should try out many other models from various categories of Machine Learning algorithms (for example, Support Vector Machines with different kernels, or possibly a neural network), without spending too much time tweaking the hyperparameters. The goal is to shortlist a few (two to five) promising models.
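
One way to organize such a shortlist is to score several candidates with the same cross-validation setup and rank them. The sketch below uses synthetic data and an arbitrary set of three candidates with mostly default hyperparameters; the dictionary names, the dataset, and the choice to keep the top two are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption: replace with your prepared features and labels).
X, y = make_regression(n_samples=200, n_features=8, noise=20.0, random_state=42)

# Candidate models from different families, left close to their defaults.
candidates = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=30, random_state=42),
    # SVMs are sensitive to feature scale, so scale inside the pipeline.
    "svr_linear": make_pipeline(StandardScaler(), SVR(kernel="linear")),
}

# Mean cross-validated RMSE per candidate.
mean_rmse = {}
for name, model in candidates.items():
    neg_mse = cross_val_score(model, X, y,
                              scoring="neg_mean_squared_error", cv=5)
    mean_rmse[name] = np.sqrt(-neg_mse).mean()

# Rank from lowest to highest RMSE and keep the best two for further tuning.
ranking = sorted(mean_rmse, key=mean_rmse.get)
shortlist = ranking[:2]

for name in ranking:
    print("%-18s %.2f" % (name, mean_rmse[name]))
print("shortlist:", shortlist)
```

Using the same folds and the same metric for every candidate keeps the comparison fair. The ranking on this toy data says nothing about which model will do best on your dataset.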
