Decision Trees

While making a prediction in a Decision Tree, each node only requires checking the value of one feature.

8 Comments

Won't each node check all the features, select the one feature that minimises the impurity, and then proceed further?

Hi,

Each node checks only one feature at a time. Please go back to the video where this has been explained in detail.
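If you want to verify this yourself, here is a minimal sketch (assuming scikit-learn and its bundled iris dataset) that fits a tree and prints the single feature each internal node tests:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

t = clf.tree_
for node in range(t.node_count):
    if t.children_left[node] != -1:  # internal node; leaves have children set to -1
        name = iris.feature_names[t.feature[node]]
        print(f"node {node}: test {name} <= {t.threshold[node]:.2f}")
```

Every internal node prints exactly one feature and one threshold, which is what the question is asking about.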

Thanks.

'One feature at a time' and not 'one feature only'. Am I correct?

Hi,

By one feature at a time I meant one node checks only one feature, the next node another, and so on. Watch the video for a better understanding.
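To make the same point in code, a prediction simply walks the tree, comparing one feature to one threshold at each node. A rough sketch of that traversal over a fitted sklearn tree's clf.tree_ object (predict_one is a hypothetical helper written for this comment, not part of the library):

```python
def predict_one(t, x, node=0):
    """Follow a single sample x down a fitted sklearn tree object `t` (clf.tree_).
    Exactly one feature is inspected at each node along the path."""
    if t.children_left[node] == -1:               # reached a leaf
        return t.value[node].argmax()             # majority class at the leaf
    if x[t.feature[node]] <= t.threshold[node]:   # one feature, one threshold
        return predict_one(t, x, t.children_left[node])
    return predict_one(t, x, t.children_right[node])
```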

Thanks.

Hi Team,

We have the hyperparameter max_features. What I understood is that at each node the split happens based on a number of features equal to or less than the value set for max_features. For example, in the iris dataset, if petal_length alone is not able to separate 2 classes, I think it can consider petal_width as well for the same level of split (provided we have max_features=2).
So why is the answer to this question True? It should be False, right? One node can check one or more features depending upon which gives less impurity. In short, a combination of features in a node split is possible.

Regards,

Birendra Singh

Hi,

Each node checks only one feature; the remaining features are left for the nodes further down the tree to consider.

max_features is the number of features to consider each time a split decision is made. Let us say the dimension of your data is 50 and max_features is 10: each time you need to find a split, you randomly select 10 features and use them to decide which one of the 10 is the best feature to split on. When you go to the next node, you randomly select another 10, and so on. This mechanism is used to control overfitting. In fact, it is similar to the technique used in random forests, except that in a random forest we also sample from the data and generate multiple trees. So even if you set the number to 10, if the tree grows deep you will likely end up using most of the features, but each split limits the candidate set to 10. However, this question does not mention max_features.
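A toy sketch of that mechanism (choose_split and gini are illustrative helpers written for this comment, not library functions):

```python
import numpy as np

def gini(y):
    """Gini impurity of a label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def choose_split(X, y, max_features, rng):
    """Pick the best (feature, threshold) pair, considering only a random
    subset of max_features features, redrawn independently at every node."""
    candidates = rng.choice(X.shape[1], size=max_features, replace=False)
    best = None
    for f in candidates:
        for thr in np.unique(X[:, f])[:-1]:        # candidate thresholds
            left, right = y[X[:, f] <= thr], y[X[:, f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, thr)
    return best                                    # (impurity, feature, threshold)
```

Note that even with the reduced candidate set, the split that is finally chosen still tests a single feature against a single threshold.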

Thanks.

Hi Rajtilak,

Thanks for the explanation. So my doubt now is: every time it will pick 10 random features out of 50, but going forward, will it avoid picking the previously selected features? Am I right? Because this seems like the only way we will end up using all the features; otherwise, the random selection may pick features that have already been chosen, which does not guarantee using all the features.

Regards.

Birendra Singh

Hi,

So in a regular DT, every split considers all the features. With the max_features option, it looks at a reduced number of features and then follows the same pattern as a regular DT, except this time it is computationally cheaper since it does not have to look at all the features. The random subset is drawn from all the features afresh at every split, and previously used features are not excluded; otherwise a number of features would never be considered at all. By choosing a reduced number of features we can increase the stability of the tree and reduce variance and over-fitting.
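In scikit-learn this is just the max_features argument of DecisionTreeClassifier; a minimal usage sketch:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Regular tree: every split considers all 4 features.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Restricted tree: every split considers a fresh random subset of 2 features.
reduced = DecisionTreeClassifier(max_features=2, random_state=0).fit(X, y)
```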

Thanks.
