Ensemble Learning and XGBoost


Ensemble Learning Part - 2



16 Comments

Hi,
For AdaBoost, do we need to first perform standardization on the training set, or will it work on any type of data?


Hi,

For AdaBoost, the default base estimator is a Decision Tree, which does not strictly require feature scaling; but if you want to scale, you can normalize the data to a range of -1 to 1.
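For reference, a minimal sketch of that scaling step, assuming scikit-learn's MinMaxScaler (the toy data is illustrative):

from sklearn.preprocessing import MinMaxScaler
import numpy as np

X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])  # toy data

# Scale every feature to the range [-1, 1], fitting on the training set only
scaler = MinMaxScaler(feature_range=(-1, 1))
X_train_scaled = scaler.fit_transform(X_train)
# For new data, reuse the training-set statistics:
# X_test_scaled = scaler.transform(X_test)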

Thanks.


Thank you,

One more doubt: if AdaBoost works for regression as well, how does it determine the weights? There is a high chance that the predicted value will not exactly match the expected value.

Thanks in advance :)


Hi,

AdaBoost is basically an ensemble of Decision Trees, and for regression it does not need an exact match. The regression variant (the AdaBoost.R2 algorithm) measures each instance's prediction error, normalizes it by the largest error in the training set, and uses these relative losses to update the instance weights, so the next predictor pays more attention to the instances with the largest errors. You can go over the Decision Tree topic for a detailed explanation of the base estimator itself.
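For reference, a minimal sketch assuming scikit-learn's AdaBoostRegressor, which implements AdaBoost.R2 (hyperparameter values are illustrative; the base estimator is passed positionally because its keyword name changed from base_estimator to estimator in scikit-learn 1.2):

from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

# Each new tree concentrates on the instances the previous trees
# predicted with the largest relative error (AdaBoost.R2)
ada_reg = AdaBoostRegressor(
    DecisionTreeRegressor(max_depth=4),
    n_estimators=100,
    loss="linear",  # how per-instance errors are mapped to instance weights
    random_state=42)
ada_reg.fit(X, y)
y_pred = ada_reg.predict(X[:5])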

Thanks.



Hi Team,

I got confused with the terms predictor and estimator. Please correct me if I am wrong anywhere:

  • predictor - basically the classifier that we are using, like SVM, Random Forest, etc.
  • estimator (n_estimators) - the number of times each predictor is used to predict an instance

Please explain the use of n_estimators; I got confused about it after the Random Forest topic.

Regards,

Birendra Singh


Hi,

A predictor (or estimator) is an individual model in the ensemble, such as a single Decision Tree; n_estimators is the number of such base estimators that are trained, not the number of times each one is used. For a Random Forest, n_estimators is simply the number of trees in the forest.
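For reference, a minimal sketch assuming scikit-learn (the numbers are illustrative):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=42)

# n_estimators = how many trees the forest trains; each fitted tree
# is one estimator/predictor of the ensemble
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X, y)
print(len(forest.estimators_))  # prints 100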

Thanks.


Hi
I am unable to download the MNIST dataset using either fetch_mldata or fetch_openml. Please help.
Thanks


Hi, Prachi.

You can download it as shown below.
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)
Kindly refer to this link for more information: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_openml.html
All the best!

-- Satyajit Das


Hi
I have already tried fetch_openml; it is not working.


Hi
Is stacking effective for increasing accuracy on unseen data?
Thanks
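For reference, a minimal stacking sketch, assuming scikit-learn's StackingClassifier (available since scikit-learn 0.22; the base estimators and data are illustrative). Stacking can improve accuracy on unseen data because the final estimator is trained on out-of-fold predictions, but the gain is not guaranteed and should be verified on a held-out set:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Base predictors are trained with cross-validation; a final "blender"
# learns to combine their out-of-fold predictions
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=42)),
                ("lr", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression())
stack.fit(X_train, y_train)
print(stack.score(X_test, y_test))  # accuracy on held-out (unseen) data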


Hi
Can we use gradient boosting for classification in the same way we are using GBRT here?
Thanks
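For reference, a minimal sketch assuming scikit-learn's GradientBoostingClassifier, which follows the same fit/predict API as the GBRT regressor (hyperparameters are illustrative):

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, random_state=42)

# Same workflow as GradientBoostingRegressor (GBRT), except that the
# successive trees are fit to the gradient of the log loss
gb_clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                    max_depth=3, random_state=42)
gb_clf.fit(X, y)
print(gb_clf.predict(X[:5]))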


So in AdaBoost, can we use different classifiers and regressors as in ensembling, or can we only use a single type of regressor or classifier?


AdaBoost clones a single type of base estimator (a Decision Tree by default) rather than mixing different model types the way voting or stacking ensembles do.

It can be used for both classification and regression problems, but with slightly different implementations, as sketched below:

1) Multi-class AdaBoosted Decision Trees
2) Decision Tree Regression with AdaBoost
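A minimal sketch of the classification variant, assuming scikit-learn (the regression variant mirrors it with AdaBoostRegressor; hyperparameters are illustrative, and the base estimator is passed positionally because its keyword name changed from base_estimator to estimator in scikit-learn 1.2):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_classes=3, n_informative=5,
                           random_state=42)

# Multi-class AdaBoosted Decision Trees: stumps as the base estimator
ada_tree = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                              n_estimators=200, random_state=42)
ada_tree.fit(X, y)

# A different base estimator can be swapped in as long as it supports
# sample weights -- but each AdaBoost model clones only one type
ada_lr = AdaBoostClassifier(LogisticRegression(max_iter=1000),
                            n_estimators=50, random_state=42)
ada_lr.fit(X, y)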

Kindly refer to these articles for more details: https://scikit-learn.org/st... https://scikit-learn.org/st...
All the best!

-- Satyajit Das


In simple words, the difference between AdaBoost and gradient boosting is that AdaBoost passes the whole training set (both the instances its predecessors got wrong and the ones they got right) to the next classifier, just with updated instance weights, whereas in gradient boosting only the errors (residuals) left by the predecessor are passed on to the successor for fitting.
Isn't it?
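To make the gradient-boosting side concrete, here is a minimal sketch (assuming scikit-learn; depths and sizes are illustrative) in which each tree is fit to the residual errors left by the previous one:

from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=1, noise=10.0, random_state=42)

# The first tree fits the raw targets
tree1 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree1.fit(X, y)

# The second tree fits only the residual errors left by the first
y2 = y - tree1.predict(X)
tree2 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree2.fit(X, y2)

# The ensemble prediction is the sum of the trees' predictions
y_pred = tree1.predict(X) + tree2.predict(X)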


Here,

You are adding the JSON SerDe .jar file, along with its dependencies, to your current session.

ADD JAR hdfs:///data/hive/json-serde-1.1.9.9-Hive13-jar-with-dependencies.jar;
Here, you are pointing the table at the HDFS location mentioned below, which contains the "tweets_raw" file with the sentiment data.

LOCATION '/user/YOUR_USER_NAME/SentimentFiles/SentimentFiles/upload/data/tweets_raw';
All the best!

-- Satyajit Das
