Suppose we have a dataset with, let's say, 6 features.
x1, x2, and x3 are related to the target with degree 3,
x4 and x5 with degree 2,
and y is proportional to sin(x6).
How will we be able to find this type of relationship in the data just by using polynomial regression?
Hi,
It will be difficult to find an exact relationship in this case; however, you can always normalize your data, although you do not need to. Simply feed the data to PolynomialFeatures and specify a degree; it will take care of the rest.
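For illustration, a minimal sketch of what that looks like (the dataset, degree, and feature mix here are assumptions based on the question above, not from the course material):

```python
# Sketch: fit a polynomial model on multi-feature data by letting
# PolynomialFeatures expand the inputs and LinearRegression fit the weights.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(42)
X = rng.uniform(-3, 3, size=(200, 6))               # 6 features, as in the question
y = X[:, 0] ** 3 + X[:, 3] ** 2 + np.sin(X[:, 5])   # mixed-degree relationship

model = make_pipeline(PolynomialFeatures(degree=3, include_bias=False),
                      LinearRegression())
model.fit(X, y)
print(model.score(X, y))  # R^2; degree-3 terms approximate sin(x6) only roughly
```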
Thanks.
How can we decide how many variables to pass to the linear regression class? Though you explained it, I did not find it straightforward: what is the upper limit on the number of coefficients, how do we even assume the degree, and then wait for some of them to become zero?
Hi,
The variables are the features of your dataset. I would suggest you go through the lecture videos once again to fill in the gaps. Regarding coefficients becoming zero, see the sketch below.
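That zeroing behaviour comes from L1 regularization (Lasso), not plain linear regression. A minimal sketch, with a deliberately over-large degree and illustrative data (both are assumptions):

```python
# Sketch: choose too high a degree, then let Lasso's L1 penalty push
# the coefficients of the unneeded polynomial terms to (or near) zero.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
x = rng.uniform(-2, 2, size=(100, 1))
y = 1.5 * x[:, 0] ** 2 + x[:, 0] + rng.normal(scale=0.1, size=100)  # truly degree 2

X_poly = PolynomialFeatures(degree=6, include_bias=False).fit_transform(x)
lasso = Lasso(alpha=0.1, max_iter=10000).fit(X_poly, y)
print(lasso.coef_)  # coefficients beyond the first two end up at or near zero
```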
Thanks.
Can you please suggest some books on preprocessing steps in machine learning, like data cleaning and finding correlations in datasets?
Hi,
Here are 8 books on data cleaning and feature engineering:
https://machinelearningmastery.com/books-on-data-cleaning-data-preparation-and-feature-engineering/
Thanks.
Dear CloudXLab Team,
Kindly keep us informed in case the course contents, videos, and PPTs are revised again. There are many things to learn and take away from the revised content.
Dear CloudXLab Team,
Machine Learning Training Models Part-3 is refreshing to listen to and an eye-opener regarding our Mentor's learning methodologies. It wasn't there in the earlier videos, which have recently been revamped and modified. Anyway, I made a point of going through this recording once again, even though it is showing as completed. I also need to look into Machine Learning Training Models Part-4, which is showing as unmarked even though I had completed this module 100% earlier. I'll go through that session shortly.
Also, if the Mentor could share the details of books for Statistics, ML & DL via this forum, that would definitely be helpful.
Thanks in advance!!!
Hi,
Could you please post your request for names of books in our discussion forum?
Thanks.
1. With the statement poly_features = PolynomialFeatures(degree=2, include_bias=False), are we adding more features to the existing data? How does adding some features make the problem a polynomial one?
2. Suppose we are given a regression problem. In what case would we add polynomial features, and how would we select the degree?
3. Do linear, lasso, and ridge regression use the Normal Equation to find the weights?
Hi,
1. PolynomialFeatures generates a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two-dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2] (a quick demo follows the links below).
2. You need to check how the model is performing, and decide based on the data. For details, I would suggest you go through the lecture video once again.
3. It is explained in detail in the lecture video itself how linear, lasso, and ridge regression function. I would suggest you go through the same. You can also follow the links below if you want to know more:
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html
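Here is the quick check of the expansion described in point 1 (this snippet is only an illustration, not from the course material):

```python
# For a two-dimensional sample [a, b], the degree-2 polynomial features
# are [1, a, b, a^2, ab, b^2]; with a=2, b=3 that is [1, 2, 3, 4, 6, 9].
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2)            # include_bias=True keeps the leading 1
print(poly.fit_transform(np.array([[2, 3]])))  # [[1. 2. 3. 4. 6. 9.]]
print(poly.get_feature_names_out())            # ['1' 'x0' 'x1' 'x0^2' 'x0 x1' 'x1^2']
```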
Thanks.
For implementing Ridge Regression, we can use Gradient Descent as another technique, apart from computing the closed-form equation.
Do we have any examples of this in the CloudXLab files shared with us? I tried to go through this combo-example but didn't come across any such example. Could you guide/help me regarding the same?
Hi,
Have you checked out the Regularized Models section of our Training Linear Models notebook?
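In the meantime, a minimal sketch of the two routes (the data and hyperparameters are assumptions for illustration):

```python
# Ridge Regression fit two ways: a closed-form solver, and stochastic
# gradient descent with an L2 penalty, which is Ridge in SGD form.
import numpy as np
from sklearn.linear_model import Ridge, SGDRegressor

rng = np.random.RandomState(42)
X = 2 * rng.rand(100, 1)
y = 4 + 3 * X[:, 0] + rng.randn(100)

ridge_closed = Ridge(alpha=1.0, solver="cholesky").fit(X, y)  # closed form
ridge_sgd = SGDRegressor(penalty="l2", alpha=0.1, max_iter=1000,
                         random_state=42).fit(X, y)           # gradient descent
print(ridge_closed.intercept_, ridge_closed.coef_)
print(ridge_sgd.intercept_, ridge_sgd.coef_)
```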
Thanks.
Dear CloudXLab Team,
In your video regarding SGD, it is mentioned that:
Randomness in SGD:
# Is good, as it helps escape from local minima.
# But bad, as the algorithm can never settle at the minimum.
One solution given for overcoming this deficiency is:
a) Simulated Annealing (gradually reducing the learning rate according to a learning schedule).
Which other techniques, apart from Simulated Annealing, can be used to reach the minimum, or a value close to it?
Hi,
Please check the link below:
https://en.wikipedia.org/wiki/Simulated_annealing#Related_methods
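For context, the simulated-annealing behaviour comes from the learning schedule, the function that decays the learning rate over time. A minimal from-scratch sketch (t0, t1, and the data are assumed values for illustration):

```python
# SGD with a decaying learning rate: early steps are large (helping the
# algorithm escape poor regions), later steps shrink so it can settle.
import numpy as np

rng = np.random.RandomState(42)
m = 100
X = 2 * rng.rand(m, 1)
y = 4 + 3 * X[:, 0] + rng.randn(m)
X_b = np.c_[np.ones((m, 1)), X]      # prepend a bias column

t0, t1 = 5, 50                       # assumed schedule hyperparameters
def learning_schedule(t):
    return t0 / (t + t1)

theta = rng.randn(2)
for epoch in range(50):
    for i in range(m):
        idx = rng.randint(m)                          # pick one random sample
        xi, yi = X_b[idx], y[idx]
        gradient = 2 * xi * (xi.dot(theta) - yi)      # single-sample gradient
        theta = theta - learning_schedule(epoch * m + i) * gradient
print(theta)  # should land near [4, 3]
```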
Thanks.
Noted the link. Thanks Rajtilak for sharing it.
@disqus_XTh3bUKOBh,
I have a few queries related to this session:
1. Ridge Regression: On slide # 220, the regularization term was defined as alpha * sum(theta_i^2) for i = 1..n; it seems like the 1/2 is missing in that. Further, on slide # 226, the regularization term (1/2) * sum(theta_i^2) looks to miss out the alpha. I believe the correct formula of the regularization term is (1/2) * alpha * sum(theta_i^2) for i = 1..n. Please confirm the understanding.
2. Ridge Regression: At 2:07:10 in the video session, it was said that the detailed step-wise working would be provided for calculating the partial derivative of the closed-form equation with the regularization term added to it (ridge regression). Kindly share the same.
3. Early Stopping: In the video session at 2:39:40 and on slide # 260, it was mentioned that the working is done on the basis of Batch Gradient Descent, while the classifier used for the same is Stochastic Gradient Descent. Please clarify.
I would request your inputs for clarifying the above queries.
Thanks!
Hi,
1. That is the correct regularization term for Ridge Regression. It is also known as Tikhonov regularization. You can find more details in the link below:
https://en.wikipedia.org/wi...
The formula on slide# 226 is also correct. However, I am unable to view the 3rd screenshot you posted, so I would not be able to comment on it.
2. You can find more details at the link below; a sketch of the derivation is also included after point 3:
https://stats.stackexchange...
3. Slide# 260 explains early stopping with Batch Gradient Descent instead of Mini-Batch. You can find more details at the link below, and a sketch of early stopping follows as well:
https://en.wikipedia.org/wi...
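On point 2, here is a sketch of the derivation, under the simplifying convention that the bias term is also regularized and the 1/m and 1/2 factors are dropped (with those factors included, the constants change but the steps are identical):

```latex
% Ridge cost function (matrix form):
J(\theta) = \lVert X\theta - y \rVert^2 + \alpha \lVert \theta \rVert^2
% Take the partial derivative (gradient) with respect to \theta:
\nabla_\theta J = 2 X^\top (X\theta - y) + 2\alpha\theta
% Set the gradient to zero at the minimum:
X^\top X\,\theta + \alpha\theta = X^\top y
% Solve for \theta to get the closed form (the regularized Normal Equation):
\hat{\theta} = \left( X^\top X + \alpha I \right)^{-1} X^\top y
```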
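On point 3, a minimal sketch of early stopping itself (using SGDRegressor with warm_start=True so each fit call continues training for one more epoch; the data, learning rate, and epoch count are assumptions):

```python
# Early stopping: train one epoch at a time and keep a copy of the model
# that achieved the lowest validation error so far.
import numpy as np
from copy import deepcopy
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(42)
X = 6 * rng.rand(200, 1) - 3
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.randn(200)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

# max_iter=1 + warm_start=True = one extra epoch per fit call
# (expect ConvergenceWarnings; they are harmless here).
sgd = SGDRegressor(max_iter=1, tol=None, warm_start=True, penalty=None,
                   learning_rate="constant", eta0=0.0005, random_state=42)
best_error, best_model = float("inf"), None
for epoch in range(500):
    sgd.fit(X_train, y_train)
    val_error = mean_squared_error(y_val, sgd.predict(X_val))
    if val_error < best_error:
        best_error, best_model = val_error, deepcopy(sgd)
print(best_error)
```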
Thanks.
-- Rajtilak Bhattacharjee
How can we know, or how does a model know, which features are useful for our predictions?
Hi,
To understand this, you must first understand what a "useful" feature is. It's a feature that your Machine Learning model can learn from in order to more accurately predict the value of your target variable. In other words, it's a feature that helps your model make better predictions! The algorithms are built to determine which features give better predictions.
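As one concrete illustration (the data and model below are assumptions, not from the course): many models expose this directly, e.g. tree ensembles score each feature by how much it improves their predictions.

```python
# Sketch: a random forest rates each feature by how much it reduces
# prediction error across the trees; uninformative features score low.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
X = rng.rand(300, 3)                                  # 3 candidate features
y = 10 * X[:, 0] + rng.normal(scale=0.1, size=300)    # only feature 0 matters

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(forest.feature_importances_)  # feature 0 gets nearly all the importance
```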
Thanks.
-- Rajtilak Bhattacharjee
What do we mean by weights in a machine learning model, in simple terms?
Hi,
Suppose a person has to choose between two paths. What will he do? He will choose one path after analyzing both. Analyzing assigns some percentage to each path on the basis of experience. This percentage is the weight in terms of Machine Learning.
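To connect the analogy to code, a minimal sketch with made-up data: in a linear model, the weights are simply the numbers the model learns to multiply each feature by.

```python
# The model learns y ~ w1*x1 + w2*x2 + b; the weights w1, w2 (and the
# bias b) are adjusted during training to fit the data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
X = rng.rand(100, 2)
y = 3 * X[:, 0] + 5 * X[:, 1] + 2 + rng.normal(scale=0.05, size=100)

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)  # weights close to [3, 5], bias close to 2
```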
Thanks.
-- Rajtilak Bhattacharjee
Are decision trees and random forest regressors training algorithms, or the hypothesis h(x)?
Hi,
These are training algorithms.
Thanks.
-- Rajtilak Bhattacharjee
What is the difference between SGDRegressor and SGDClassifier?