End-to-End ML Project- Beginner friendly


Training a powerful model

If you trained the model and measured the RMSE correctly, its value should be approximately 68628. That is a large error when compared with the range of the target variable. This is an example of model underfitting.
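To judge whether an RMSE is large, it helps to compare it with the spread of the target values. The following is a minimal sketch of that comparison; the `rmse` helper and the sample numbers are made up for illustration and are not the course's housing data.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical target values and predictions, for illustration only.
y_true = np.array([100_000.0, 200_000.0, 300_000.0, 400_000.0])
y_pred = np.array([150_000.0, 150_000.0, 350_000.0, 350_000.0])

error = rmse(y_true, y_pred)
target_range = float(y_true.max() - y_true.min())

print(f"RMSE: {error:.0f}")
print(f"RMSE as a fraction of the target range: {error / target_range:.2f}")
```

The same comparison can be made with the real labels by replacing the sample arrays with the actual targets and the model's predictions.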

Underfitting occurs when a model does not capture the underlying trend of the data. We can say that a model is underfitting when it performs poorly even on the training dataset. This can mean that the features do not provide enough information to make good predictions, or that the model is not powerful enough.
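A small synthetic example may make this concrete. The sketch below fits a straight line and a quadratic curve to data that follows a quadratic trend, and compares their errors on the training data itself. The data, the `training_rmse` helper, and the use of NumPy polynomial fitting are assumptions chosen for illustration, not part of the course project.

```python
import numpy as np

# Synthetic data with a quadratic trend plus a little noise (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = x ** 2 + rng.normal(scale=0.1, size=x.shape)


def training_rmse(degree):
    """Fit a polynomial of the given degree and return its RMSE on the training data."""
    coeffs = np.polyfit(x, y, deg=degree)
    preds = np.polyval(coeffs, x)
    return float(np.sqrt(np.mean((y - preds) ** 2)))


line_rmse = training_rmse(1)  # a straight line cannot follow the curve
quad_rmse = training_rmse(2)  # a quadratic matches the underlying trend

print(f"Degree 1 training RMSE: {line_rmse:.3f}")
print(f"Degree 2 training RMSE: {quad_rmse:.3f}")
```

The straight line has a much higher training error than the quadratic, which is the signature of underfitting: the model is too simple for the trend in the data.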

We can address underfitting in three ways:

  1. Add more features to the dataset.

  2. Reduce the constraints on the model.

  3. Try a more complex model.

Since our current model has no constraints to reduce, we are left with two options. We'll first try a more complex model and, if that doesn't work, we will create more features.

We'll train a DecisionTreeRegressor. It is a powerful model, capable of finding complex non-linear relationships in the data.

Refer to the DecisionTreeRegressor documentation for further details about the model.

  1. Import DecisionTreeRegressor from sklearn.tree.

  2. Create an instance of the estimator with the name tree_reg.

  3. Fit the model on our training data, i.e. (housing_prepared, housing_labels).

  4. Use the model to predict the output for our training predictors (housing_prepared) and store the result in a variable named predictions.

  5. Calculate the RMSE of the DecisionTreeRegressor model between the actual values (housing_labels) and the predicted values (predictions), and store it in a variable named tree_rmse.
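The steps above can be sketched as follows. To keep the example runnable on its own, `housing_prepared` and `housing_labels` are replaced here by randomly generated stand-in arrays; in the project they come from the earlier data preparation steps. The `random_state` argument is also an addition, used only to make the run reproducible.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Stand-in data (assumption): in the project, housing_prepared and
# housing_labels already exist from the earlier preparation steps.
rng = np.random.default_rng(42)
housing_prepared = rng.normal(size=(500, 8))
housing_labels = (
    200_000
    + 50_000 * housing_prepared[:, 0]
    + rng.normal(scale=10_000, size=500)
)

# Step 2: create the estimator (random_state added for reproducibility).
tree_reg = DecisionTreeRegressor(random_state=42)

# Step 3: fit the model on the training data.
tree_reg.fit(housing_prepared, housing_labels)

# Step 4: predict on the same training predictors.
predictions = tree_reg.predict(housing_prepared)

# Step 5: RMSE is the square root of the mean squared error.
tree_mse = mean_squared_error(housing_labels, predictions)
tree_rmse = np.sqrt(tree_mse)

print(tree_rmse)
```

Note that this RMSE is measured on the same data the model was trained on. A decision tree with no depth limit can fit its training data very closely, so a low value here does not by itself show that the model will predict well on new data.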

