End-to-End ML Project- Beginner friendly

Exploring models

Now that we have created the pipeline, we can use it to transform any data automatically. This brings our data preprocessing step to an end. Let's start the fifth step, the one all that preprocessing was for: exploring Machine Learning models.

We'll use models already implemented in sklearn. The steps for using any sklearn model are the same:

  • Create an instance of the estimator, using the syntax:

    model_name = Model(parameters)
    

    where

    Model is the ML model or estimator, which is a Python class (for example, LinearRegression),

    model_name is the name we give to the instance of the class Model, and

    parameters are the constructor arguments (the model's hyperparameters), which generally differ from model to model.

  • Then we call the fit() method on our instance model_name to fit the model to our dataset, so that it learns patterns from the data. This step is also called training the model. We can do it as:

    model_name.fit(X_train,y_train)
    

    where

    model_name is the instance that we created, X_train is the training dataset (attributes or predictors), and y_train is the training labels (target variable).

In our case, X_train is housing_prepared and y_train is housing_labels. Remember this throughout the step.

  • Finally, we predict the output of our model by calling the predict() method on our instance model_name, using the syntax:

    model_name.predict(X)
    

    where X is the dataset of predictors (attributes or features), i.e. the data for which we want to predict the target variable. predict() returns the predictions as an array, which we generally store in a variable.
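Because these three steps are the same for every estimator, swapping one model for another only changes the line that creates the instance. The following is a minimal sketch of that idea, not part of the course code: X_demo and y_demo are hypothetical toy arrays standing in for housing_prepared and housing_labels, and DecisionTreeRegressor with max_depth=3 is just an arbitrary second model chosen to show a constructor that takes parameters.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data (assumption): 100 rows, 3 features, noise-free linear target.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(100, 3))
y_demo = X_demo @ np.array([2.0, -1.0, 0.5]) + 3.0

# Step 1: create the estimator instances, each with its own parameters.
models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(max_depth=3, random_state=42),
}

predictions_by_model = {}
for name, model in models.items():
    # Step 2: train the model on the predictors and labels.
    model.fit(X_demo, y_demo)
    # Step 3: predict (here on the same data the model was trained on).
    predictions_by_model[name] = model.predict(X_demo)
    print(name, predictions_by_model[name][:3])
```

The loop body never needs to know which model it is handling, which is what makes it easy to try several models in this step.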

For example, we can fit the LinearRegression model on our training dataset and then make predictions with it like this:

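from sklearn.linear_model import LinearRegression

lin_reg = LinearRegression()                      # create the estimator
lin_reg.fit(housing_prepared, housing_labels)     # train it on the prepared data
predictions = lin_reg.predict(housing_prepared)   # predict on the training data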

Here, predictions will be a NumPy array containing the predictions made by our model on the training dataset.
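To see what such an array looks like, the sketch below repeats the same fit and predict calls on a small synthetic dataset and inspects the result. X_demo and y_demo are hypothetical stand-ins (an assumption, since housing_prepared and housing_labels come from the earlier steps of the course); with the real data, only those two names would change.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data (assumption): 50 rows, 4 features, slightly noisy target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
y_demo = X_demo.sum(axis=1) + rng.normal(scale=0.1, size=50)

lin_reg = LinearRegression()
lin_reg.fit(X_demo, y_demo)
predictions = lin_reg.predict(X_demo)

# predict() returns a NumPy array with one prediction per input row.
print(type(predictions))
print(predictions.shape)

# Compare the first few predictions with the actual labels.
print("Predicted:", predictions[:5].round(2))
print("Actual:   ", y_demo[:5].round(2))
```

Putting a few predictions next to the actual labels like this is a quick sanity check before measuring the error more formally.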

