Suppose we have a dataset with, let's say, 6 features.
x1, x2, and x3 are related to the target with degree 3,
x4 and x5 with degree 2,
and y is proportional to sin(x6).
How will we be able to find this type of relationship in the data just by using polynomial regression?
Hi,
It will be difficult to find an exact relationship in this case; however, you can always normalize your data, although you do not need to. Simply feed the data to PolynomialFeatures and specify a degree; it will take care of the rest.
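For illustration, a minimal sketch of what that looks like (the dataset, degree, and feature mix here are assumptions based on the question above, not from the course material):

```python
# Sketch: fit a polynomial model on multi-feature data by letting
# PolynomialFeatures expand the inputs and LinearRegression fit the weights.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(42)
X = rng.uniform(-3, 3, size=(200, 6))               # 6 features, as in the question
y = X[:, 0] ** 3 + X[:, 3] ** 2 + np.sin(X[:, 5])   # mixed-degree relationship

model = make_pipeline(PolynomialFeatures(degree=3, include_bias=False),
                      LinearRegression())
model.fit(X, y)
print(model.score(X, y))  # R^2; degree-3 terms approximate sin(x6) only roughly
```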
Thanks.
How can we decide how many variables to pass to the linear regression class? Though you explained it, I did not find it straightforward: what is the upper limit on the number of coefficients, how do we even assume the degree, and then wait for some of them to become zero?
Hi,
The variables are the features of your dataset. I would suggest you go through the lecture videos once again to fill in the gaps. Regarding coefficients becoming zero, see the sketch below.
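That zeroing behaviour comes from L1 regularization (Lasso), not plain linear regression. A minimal sketch, with a deliberately over-large degree and illustrative data (both are assumptions):

```python
# Sketch: choose too high a degree, then let Lasso's L1 penalty push
# the coefficients of the unneeded polynomial terms to (or near) zero.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
x = rng.uniform(-2, 2, size=(100, 1))
y = 1.5 * x[:, 0] ** 2 + x[:, 0] + rng.normal(scale=0.1, size=100)  # truly degree 2

X_poly = PolynomialFeatures(degree=6, include_bias=False).fit_transform(x)
lasso = Lasso(alpha=0.1, max_iter=10000).fit(X_poly, y)
print(lasso.coef_)  # coefficients beyond the first two end up at or near zero
```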
Thanks.
Can you please suggest some books on preprocessing steps in machine learning, like data cleaning and finding correlations in datasets?
Hi,
Here are 8 books on data cleaning and feature engineering:
https://machinelearningmastery.com/books-on-data-cleaning-data-preparation-and-feature-engineering/
Thanks.
Dear CloudXLab Team,
Kindly keep us informed in case the course contents, videos, and PPTs are revised again. There are many things to learn and take away from the revised content.
Dear CloudXLab Team,
Machine Learning Training Models Part-3 is refreshing to listen to and an eye-opener regarding our Mentor's learning methodologies. It wasn't there in the earlier videos, which have recently been revamped and modified. Anyway, I made a point of going through this recording once again, even though it is showing as completed. I also need to look into Machine Learning Training Models Part-4, which is showing as unmarked even though I had completed this module 100% earlier. I'll go through that session shortly.
Also, if the Mentor could share the details of books for Statistics, ML & DL via this forum, that would definitely be helpful.
Thanks in advance!!!
Hi,
Could you please post your request for names of books in our discussion forum?
Thanks.
1. With the statement poly_features = PolynomialFeatures(degree=2, include_bias=False), are we adding more features to the existing data? How does adding some features make the problem a polynomial one?
2. Suppose we are given a regression problem. In what case would we add polynomial features, and how would we select the degree?
3. Do linear, lasso, and ridge regression use the Normal Equation to find the weights?
Hi,
1. PolynomialFeatures generates a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two-dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2] (a quick demo follows the links below).
2. You need to check how the model is performing, and decide based on the data. For details, I would suggest you go through the lecture video once again.
3. It is explained in detail in the lecture video itself how linear, lasso, and ridge regression function. I would suggest you go through the same. You can also follow the links below if you want to know more:
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html
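Here is the quick check of the expansion described in point 1 (this snippet is only an illustration, not from the course material):

```python
# For a two-dimensional sample [a, b], the degree-2 polynomial features
# are [1, a, b, a^2, ab, b^2]; with a=2, b=3 that is [1, 2, 3, 4, 6, 9].
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2)            # include_bias=True keeps the leading 1
print(poly.fit_transform(np.array([[2, 3]])))  # [[1. 2. 3. 4. 6. 9.]]
print(poly.get_feature_names_out())            # ['1' 'x0' 'x1' 'x0^2' 'x0 x1' 'x1^2']
```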
Thanks.
For implementing Ridge Regression, we can use Gradient Descent as another technique, apart from computing the closed-form equation.
Do we have any examples of this in the CloudXLab files shared with us? I tried to go through this combo-example but didn't come across any such example. Could you guide/help me regarding the same?
Hi,
Have you checked out the Regularized Models section of our Training Linear Models notebook?
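In the meantime, a minimal sketch of the two routes (the data and hyperparameters are assumptions for illustration):

```python
# Ridge Regression fit two ways: a closed-form solver, and stochastic
# gradient descent with an L2 penalty, which is Ridge in SGD form.
import numpy as np
from sklearn.linear_model import Ridge, SGDRegressor

rng = np.random.RandomState(42)
X = 2 * rng.rand(100, 1)
y = 4 + 3 * X[:, 0] + rng.randn(100)

ridge_closed = Ridge(alpha=1.0, solver="cholesky").fit(X, y)  # closed form
ridge_sgd = SGDRegressor(penalty="l2", alpha=0.1, max_iter=1000,
                         random_state=42).fit(X, y)           # gradient descent
print(ridge_closed.intercept_, ridge_closed.coef_)
print(ridge_sgd.intercept_, ridge_sgd.coef_)
```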
Thanks.
Dear CloudXLab Team,
In your video regarding SGD, it is mentioned that:
Randomness in SGD:
# Is good, as it helps escape from local minima.
# But bad, as the algorithm can never settle at the minimum.
One solution given for overcoming this deficiency is:
a) Simulated Annealing (gradually reducing the learning rate according to a learning schedule).
Which other techniques, apart from Simulated Annealing, can be used to reach the minimum, or a value close to it?
Hi,
Please check the link below:
https://en.wikipedia.org/wiki/Simulated_annealing#Related_methods
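For context, the simulated-annealing behaviour comes from the learning schedule, the function that decays the learning rate over time. A minimal from-scratch sketch (t0, t1, and the data are assumed values for illustration):

```python
# SGD with a decaying learning rate: early steps are large (helping the
# algorithm escape poor regions), later steps shrink so it can settle.
import numpy as np

rng = np.random.RandomState(42)
m = 100
X = 2 * rng.rand(m, 1)
y = 4 + 3 * X[:, 0] + rng.randn(m)
X_b = np.c_[np.ones((m, 1)), X]      # prepend a bias column

t0, t1 = 5, 50                       # assumed schedule hyperparameters
def learning_schedule(t):
    return t0 / (t + t1)

theta = rng.randn(2)
for epoch in range(50):
    for i in range(m):
        idx = rng.randint(m)                          # pick one random sample
        xi, yi = X_b[idx], y[idx]
        gradient = 2 * xi * (xi.dot(theta) - yi)      # single-sample gradient
        theta = theta - learning_schedule(epoch * m + i) * gradient
print(theta)  # should land near [4, 3]
```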
Thanks.
Noted the link. Thanks Rajtilak for sharing it.
@disqus_XTh3bUKOBh,
I have a few queries related to this session:
1. Ridge Regression: On slide # 220, the regularization term was defined as alpha * sum(theta_i^2) for i = 1..n; it seems like the 1/2 is missing in that. Further, on slide # 226, the regularization term (1/2) * sum(theta_i^2) looks to miss out the alpha. I believe the correct formula of the regularization term is (1/2) * alpha * sum(theta_i^2) for i = 1..n. Please confirm the understanding.
2. Ridge Regression: At 2:07:10 in the video session, it was said that the detailed step-wise working would be provided for calculating the partial derivative of the closed-form equation with the regularization term added to it (ridge regression). Kindly share the same.
3. Early Stopping: In the video session at 2:39:40 and on slide # 260, it was mentioned that the working is done on the basis of Batch Gradient Descent, while the classifier used for the same is Stochastic Gradient Descent. Please clarify.
I would request your inputs for clarifying the above queries.
Thanks!
Hi,
1. That is the correct regularization term for Ridge Regression. It is also known as Tikhonov regularization. You can find more details in the link below:
https://en.wikipedia.org/wi...
The formula on slide# 226 is also correct. However, I am unable to view the 3rd screenshot you posted, so I would not be able to comment on it.
2. You can find more details at the link below; a sketch of the derivation is also included after point 3:
https://stats.stackexchange...
3. Slide# 260 explains early stopping with Batch Gradient Descent instead of Mini-Batch. You can find more details at the link below, and a sketch of early stopping follows as well:
https://en.wikipedia.org/wi...
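On point 2, here is a sketch of the derivation, under the simplifying convention that the bias term is also regularized and the 1/m and 1/2 factors are dropped (with those factors included, the constants change but the steps are identical):

```latex
% Ridge cost function (matrix form):
J(\theta) = \lVert X\theta - y \rVert^2 + \alpha \lVert \theta \rVert^2
% Take the partial derivative (gradient) with respect to \theta:
\nabla_\theta J = 2 X^\top (X\theta - y) + 2\alpha\theta
% Set the gradient to zero at the minimum:
X^\top X\,\theta + \alpha\theta = X^\top y
% Solve for \theta to get the closed form (the regularized Normal Equation):
\hat{\theta} = \left( X^\top X + \alpha I \right)^{-1} X^\top y
```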
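On point 3, a minimal sketch of early stopping itself (using SGDRegressor with warm_start=True so each fit call continues training for one more epoch; the data, learning rate, and epoch count are assumptions):

```python
# Early stopping: train one epoch at a time and keep a copy of the model
# that achieved the lowest validation error so far.
import numpy as np
from copy import deepcopy
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(42)
X = 6 * rng.rand(200, 1) - 3
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.randn(200)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

# max_iter=1 + warm_start=True = one extra epoch per fit call
# (expect ConvergenceWarnings; they are harmless here).
sgd = SGDRegressor(max_iter=1, tol=None, warm_start=True, penalty=None,
                   learning_rate="constant", eta0=0.0005, random_state=42)
best_error, best_model = float("inf"), None
for epoch in range(500):
    sgd.fit(X_train, y_train)
    val_error = mean_squared_error(y_val, sgd.predict(X_val))
    if val_error < best_error:
        best_error, best_model = val_error, deepcopy(sgd)
print(best_error)
```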
Thanks.
-- Rajtilak Bhattacharjee
How can we know, or how does a model know, which features are useful for our predictions?
Hi,
To understand this, you must first understand what a "useful" feature is. It's a feature that your Machine Learning model can learn from in order to more accurately predict the value of your target variable. In other words, it's a feature that helps your model make better predictions! The algorithms are built to determine which features give better predictions.
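As one concrete illustration (the data and model below are assumptions, not from the course): many models expose this directly, e.g. tree ensembles score each feature by how much it improves their predictions.

```python
# Sketch: a random forest rates each feature by how much it reduces
# prediction error across the trees; uninformative features score low.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
X = rng.rand(300, 3)                                  # 3 candidate features
y = 10 * X[:, 0] + rng.normal(scale=0.1, size=300)    # only feature 0 matters

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(forest.feature_importances_)  # feature 0 gets nearly all the importance
```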
Thanks.
-- Rajtilak Bhattacharjee
What do we mean by weights in a machine learning model, in simple terms?
Hi,
Suppose a person has to choose between two paths. What will he do? He will choose one path after analyzing both. Analyzing assigns some percentage to each path on the basis of experience. This percentage is the weight in terms of Machine Learning.
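To connect the analogy to code, a minimal sketch with made-up data: in a linear model, the weights are simply the numbers the model learns to multiply each feature by.

```python
# The model learns y ~ w1*x1 + w2*x2 + b; the weights w1, w2 (and the
# bias b) are adjusted during training to fit the data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
X = rng.rand(100, 2)
y = 3 * X[:, 0] + 5 * X[:, 1] + 2 + rng.normal(scale=0.05, size=100)

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)  # weights close to [3, 5], bias close to 2
```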
Thanks.
-- Rajtilak Bhattacharjee
Are decision trees and random forest regressors training algorithms, or the hypothesis h(x)?
Hi,
These are training algorithms.
Thanks.
-- Rajtilak Bhattacharjee
What is the difference between SGDRegressor and SGDClassifier?