End-to-End ML Project- Beginner friendly

Generating test set - Splitting

Now that we have created a StratifiedShuffleSplit object, let's start generating the test set.

First, we use the split() method of the StratifiedShuffleSplit class. Its syntax is split(X, y), where X is the dataset and y is an array-like on which the stratification is based.

  1. Call the split() method on the object split_object. Specify the dataset and the stratification array as arguments. Remember, we have to stratify according to the variable income_cat.
  2. The split() method returns a generator object, so store it in a variable gen_obj. This generator yields the indices of the instances belonging to the training and test sets.
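
The two steps above can be sketched as follows. This is a minimal illustration, not the course's exact solution: the array X and the income_cat values are hypothetical stand-in data, and the n_splits, test_size and random_state settings are assumptions, since split_object was created in an earlier step that is not shown here.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical stand-in data: 20 rows with 2 features each,
# plus one income_cat label per row (3 categories).
X = np.arange(40).reshape(20, 2)
income_cat = np.array([1] * 10 + [2] * 5 + [3] * 5)

# Assumed settings; the course's split_object may have been built differently.
split_object = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)

# Step 1 and 2: call split() with the dataset and the stratification array,
# and store the returned generator. Nothing is computed until we iterate.
gen_obj = split_object.split(X, income_cat)

# Each item the generator yields is a (train_indices, test_indices) pair.
train_index, test_index = next(gen_obj)

print(len(train_index), len(test_index))
```

With 20 rows and test_size=0.2, the test set should receive 4 indices and the training set 16, with the income_cat proportions preserved in both.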
