End-to-End ML Project- Beginner friendly


Creating Custom Transformers

Note- Make sure you have a basic knowledge of object-oriented programming (OOP) in Python before proceeding. If not, refer to Python classes and objects.

Step 1-

First, import the classes BaseEstimator and TransformerMixin from sklearn.base. Also import the numpy library with the alias np.

Step 2-

Create a class named CombinedAttributesAdder that inherits from BaseEstimator and TransformerMixin. A base class is also known as a parent class.
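Steps 1 and 2 can be sketched as follows. This is only a skeleton under the assumption that scikit-learn and numpy are installed; the constructor and methods are added in the later steps, and the variable parents is introduced here purely for illustration.

```python
from sklearn.base import BaseEstimator, TransformerMixin
import numpy as np


class CombinedAttributesAdder(BaseEstimator, TransformerMixin):
    """Skeleton only: the constructor, fit() and transform() come later."""


# Illustrative check: both parent classes are listed as base classes.
parents = CombinedAttributesAdder.__bases__
print(parents)
```

Inheriting from BaseEstimator provides get_params() and set_params(), and inheriting from TransformerMixin provides fit_transform() once fit() and transform() are defined.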

Step 3-

Add a constructor to the class with a parameter add_bedrooms_per_room whose default value is True. This parameter controls whether the attribute bedrooms_per_room is created.

Inside the constructor, initialize an instance variable named add_bedrooms_per_room and set it equal to the parameter add_bedrooms_per_room. Note that the instance variable and the constructor parameter have the same name.
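A minimal sketch of the constructor from Step 3 is shown below. The reason for keeping the two names identical is that get_params(), inherited from BaseEstimator, reads the constructor's parameter names and looks up an attribute of the same name on the object. The variables adder, params_default and params_off are illustrative names, not part of the exercise.

```python
from sklearn.base import BaseEstimator, TransformerMixin


class CombinedAttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self, add_bedrooms_per_room=True):
        # Same name as the parameter so that get_params() can find it.
        self.add_bedrooms_per_room = add_bedrooms_per_room


adder = CombinedAttributesAdder()
params_default = adder.get_params()
params_off = CombinedAttributesAdder(add_bedrooms_per_room=False).get_params()

print(params_default)  # {'add_bedrooms_per_room': True}
print(params_off)      # {'add_bedrooms_per_room': False}
```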

Step 4-

Add the fit() method to the class. Give it two parameters, X and y, where X holds the attributes (features) and y holds the target variable or labels. Set the default value of y to None.

Then add the following code inside the fit() method-

return self

The fit() method does nothing here because this class only creates new attributes, and that requires learning nothing from the dataset. Even so, a class must implement fit() to be compatible with sklearn.
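The fit() method from Step 4 might look like the sketch below. Returning self means the call hands back the same object, which is what allows calls to be chained. The 2 x 7 array of zeros and the variable same_object are placeholders for illustration only.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class CombinedAttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self, add_bedrooms_per_room=True):
        self.add_bedrooms_per_room = add_bedrooms_per_room

    def fit(self, X, y=None):
        # Nothing is learned from the data, so just return the object itself.
        return self


adder = CombinedAttributesAdder()
# fit() returns the very same object it was called on.
same_object = adder.fit(np.zeros((2, 7))) is adder
print(same_object)  # True
```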

Step 5-

Add the following code in the class CombinedAttributesAdder to create a transform method and add the attributes to the dataset.

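def transform(self, X, y=None):
    # Build the new attributes as ratios of existing columns, selected by position.
    rooms_per_household = X[:, 3] / X[:, 6]
    population_per_household = X[:, 5] / X[:, 6]

    if self.add_bedrooms_per_room:
        bedrooms_per_room = X[:, 4] / X[:, 3]
        # Append all three new columns to the original data.
        return np.c_[X, rooms_per_household, population_per_household, bedrooms_per_room]
    else:
        # Append only the two per-household columns.
        return np.c_[X, rooms_per_household, population_per_household]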
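The transform() method relies on np.c_ to append the new attributes as extra columns. The sketch below shows that behaviour on a tiny made-up array; the names X, ratio and combined and the values are illustrative and are not taken from the housing dataset.

```python
import numpy as np

# A made-up 2 x 2 array standing in for a dataset.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Element-wise ratio of the first column to the second, one value per row.
ratio = X[:, 0] / X[:, 1]

# np.c_ treats the 1-D array as a column and places it to the right of X.
combined = np.c_[X, ratio]

print(combined.shape)  # (2, 3)
print(combined)
```

The row count stays the same and the column count grows by one for each 1-D array passed to np.c_.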

Step 6-

Create an object of the class CombinedAttributesAdder with the name attr_adder and specify the parameter add_bedrooms_per_room as False.

Step 7-

Call the transform() method on the attr_adder object, passing train_data.values as the dataset. Store the output in a variable named housing_extra_attribs.
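Steps 6 and 7 can be tried out end to end with the sketch below. Because train_data comes from earlier in the project, a small made-up array named toy_data stands in for train_data.values here; it is an assumption for illustration, with seven columns so that positions 3 to 6 exist. The variable with_bedrooms is also illustrative.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class CombinedAttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self, add_bedrooms_per_room=True):
        self.add_bedrooms_per_room = add_bedrooms_per_room

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X, y=None):
        rooms_per_household = X[:, 3] / X[:, 6]
        population_per_household = X[:, 5] / X[:, 6]
        if self.add_bedrooms_per_room:
            bedrooms_per_room = X[:, 4] / X[:, 3]
            return np.c_[X, rooms_per_household, population_per_household,
                         bedrooms_per_room]
        return np.c_[X, rooms_per_household, population_per_household]


# Hypothetical stand-in for train_data.values (2 rows, 7 columns).
toy_data = np.array([
    [0.0, 0.0, 0.0, 100.0, 20.0, 300.0, 50.0],
    [0.0, 0.0, 0.0, 200.0, 50.0, 400.0, 100.0],
])

attr_adder = CombinedAttributesAdder(add_bedrooms_per_room=False)
housing_extra_attribs = attr_adder.transform(toy_data)
print(housing_extra_attribs.shape)  # (2, 9): two new columns

# With the default setting, a third column is appended as well.
with_bedrooms = CombinedAttributesAdder().fit_transform(toy_data)
print(with_bedrooms.shape)  # (2, 10)
```

The second call uses fit_transform(), which TransformerMixin supplies by calling fit() followed by transform().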


Perform the steps described above.
