Training Models


Machine Learning Training Models Part-5





43 Comments

Overall, the content is good. Just some constructive feedback: some effort could be made to improve the discussion in these recorded sessions. The instructor frequently talks to himself, which leads to confusion at the listener's end.

Mixing discussions into the content delivery disrupts the flow of learning. Questions could be taken at the end of each concept rather than in between all the time.

Thanks

Hi Jay,

Thank you for taking the time to provide valuable feedback on our recorded sessions. We genuinely appreciate your insights and constructive suggestions.

We acknowledge the importance of maintaining clarity and coherence throughout our discussions. Your observation about minimizing self-talk and optimizing the flow of learning resonates with our commitment to enhancing the learning experience for our audience.

Moving forward, we will work on structuring our sessions more effectively by consolidating questions and discussions at the end of each concept. This approach will streamline the learning process and help avoid any confusion that may arise from mixing discussions with content delivery.

Your feedback serves as a catalyst for continuous improvement, and we are grateful for your contribution to our journey of enhancing educational experiences.

Thank you once again for sharing your thoughts with us. We look forward to implementing these suggestions and providing an even better learning environment for our audience.


Hi,

I want to know how, in Ridge, the sag or saga solver is different from SGD, because the predicted results are different even though these two approaches are supposed to be the same. The same question applies to Elastic Net regression versus SGD with an elastic net penalty. The mentor has already discussed this problem in the Part-5 video and noted it in his to-do list @49:00. Kindly let me know where he has discussed the solution.
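For what it's worth, a minimal sketch (with synthetic data, parameters chosen arbitrarily) of why the two differ: Ridge with the saga solver converges to the exact minimizer of the l2-penalized least-squares objective, while SGDRegressor takes noisy per-sample gradient steps with a decaying learning rate, and the two also scale the penalty term differently. So predictions come out close, but not identical.

```python
import numpy as np
from sklearn.linear_model import Ridge, SGDRegressor

rng = np.random.RandomState(42)
X = 2 * rng.rand(100, 1)
y = (4 + 3 * X + rng.randn(100, 1)).ravel()   # y ~ 4 + 3x + noise

# Ridge(solver="saga") converges to the exact minimizer of the
# l2-penalized least-squares objective.
ridge = Ridge(alpha=0.1, solver="saga", random_state=42).fit(X, y)

# SGDRegressor(penalty="l2") optimizes a similar objective, but with noisy
# per-sample gradient steps and a decaying learning rate, so it only
# approximates the minimizer; its alpha is also scaled differently.
sgd = SGDRegressor(penalty="l2", alpha=0.001, max_iter=1000,
                   random_state=42).fit(X, y)

print(ridge.predict([[1.5]]), sgd.predict([[1.5]]))  # close, not identical
```
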

Hi,

Slide# 240

Is the cost function MSE or MAE? I think in the case of Lasso regression we are using l1 regularization, so the cost function should be MAE.

Hi,

The l1 norm applies to the weight vector in the regularization term. The error part of the cost function is still the MSE.

Thanks.
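To illustrate, a minimal NumPy sketch of that split between error term and penalty term (the exact convention on the slide, e.g. a 1/2 factor, may differ):

```python
import numpy as np

def lasso_cost(X, y, theta, alpha):
    """J(theta) = MSE(theta) + alpha * l1-norm of the weights.

    The l1 penalty applies only to the weights theta_1..theta_n,
    not the bias theta_0; the error term itself is still the MSE.
    """
    mse = np.mean((X @ theta - y) ** 2)
    l1_penalty = alpha * np.sum(np.abs(theta[1:]))
    return mse + l1_penalty

X = np.array([[1.0, 0.0], [1.0, 1.0]])   # first column is the bias term
y = np.array([1.0, 2.0])
theta = np.array([1.0, 1.0])             # perfect fit: predictions equal y
print(lasso_cost(X, y, theta, alpha=0.0))  # 0.0 -- pure MSE, no penalty
print(lasso_cost(X, y, theta, alpha=0.5))  # 0.5 -- only the penalty remains
```
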

Hi,

I guess one video is missing here: Part-4, in which regularisation and Ridge Regression are discussed. Please upload it.

Thank you.

Hi,

Ridge Regression is discussed at the beginning of this video itself. Could you please recheck?

Thanks.

Hi,

We are following every video. After the Part-3 video there is the Part-5 video, and clearly certain sections are missing. There is no introduction to regularisation or the different types of regularisation. Below is the link to the missing video; we checked it in the YouTube lecture series of CloudXLab.

Thank you.

https://www.youtube.com/watch?v=bq-HWk1w2wU&list=PLFhNzVKP1pVrNU8cTL_t-8YzPLF8i8PaS&index=23&ab_channel=CloudxLabOfficial

Hi,

Thank you for bringing this to our notice.

Thanks.


Is the curve of RMSE versus the number of epochs always convex? Is there any chance that the curve may contain local minima?

Hi,

As mentioned in slide# 80, the MSE cost function is a convex function with a global minimum.

Thanks.

Sir,

Is it a good idea to always fit a higher-degree polynomial rather than a simpler model every time we do linear regression, and then, if it overfits, rein it in with regularization?

Hi,

Good question! The chances of overfitting are quite high if we use a high-degree polynomial, and it is not a given that we should use all of the features. The general approach to linear regression is to first understand the data, which we achieve through EDA (exploratory data analysis). We analyze the relevance of each feature in determining the dependent variable, perform some feature engineering, and then build the model. Finally, we measure the goodness of fit of the model and investigate further performance improvements.

Thanks.

Please make a separate video on plotting graphs in Python. It has not been covered, and the terms are confusing.

Hi,

You could go through this for the same: https://cloudxlab.com/assessment/slide/480/getting-started-with-matplotlib

Hope this helps.

Thanks.


So is J(theta) for Ridge, Lasso and Elastic Net another error function which needs to be minimised, just like the cost function?

And what kind of decision does the developer make after seeing the plot of J(theta)?

Hi,

J(theta) itself is the cost function. I would suggest you go through the training materials in detail.

Thanks.

Hi,

Request: please clarify ridge_reg.predict([[1.5]])

Hi,

We have created a Ridge Regression model here, and it is predicting for the input value 1.5.

Thanks.
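A minimal, self-contained sketch (with made-up data of the same shape as the course examples) of what that call does:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(42)
X = 2 * rng.rand(100, 1)            # one feature
y = 4 + 3 * X + rng.randn(100, 1)   # y ~ 4 + 3x + noise

ridge_reg = Ridge(alpha=1.0)
ridge_reg.fit(X, y)

# predict expects a 2-D (n_samples, n_features) array, so [[1.5]] is one
# sample whose single feature has the value 1.5; the model returns its
# estimate of y at x = 1.5 (roughly 4 + 3 * 1.5 = 8.5 here).
print(ridge_reg.predict([[1.5]]))
```
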

Hi,

A suggestion: can we have a document (like a cheat sheet) with the important points, covering definitions and formulas? It would help us take a quick look to recall things in future.

Thanks,

Vijay Kulkarni

Hi,

This is a good idea! Unfortunately, as of now we do not have such a cheat sheet available. However, we will keep this suggestion in mind and will try to incorporate it as and when possible.

Thanks.

The code for plotting the chart presented in slide# 307 mentions the following line for computing the 'boundary'. Please help me decipher it.

boundary = -(log_reg.coef_[0][0] * left_right + log_reg.intercept_[0]) / log_reg.coef_[0][1]

@CloudxLab Team --- I request your inputs on the above problem.

Aside from this, I have a few suggestions to improve the recently launched native comment system a bit further: it would be better if there were an option to edit a posted comment, and if users could tag one another, as in Disqus.

Thanks!

Hi,

Thank you for your feedback. We will look into this.

Thanks.

Hi,

I could not find this on slide# 307, could you please tell me which slide you are referring to?

Thanks.


The said code is as per the GitHub repository of the course available here. The snapshot of the code is attached for quick reference. 


Hi Dhyey,

Sandeep has already replied to your post in forum.

https://discuss.cloudxlab.com/t/classification-sgd-classifier-precision-recall/4659

Thanks.


Hi,

Also, regarding the formula: it calculates the decision boundary of a logistic regression problem. Here, log_reg.coef_ holds the coefficients of the logistic regression model, log_reg.intercept_ the intercept, and left_right is the array of x-axis values over which the boundary line is plotted; if you change its values, you will see the plotted extent of the decision boundary change.

Thanks.
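A sketch to verify that reading (assuming the Iris setup from the course notebook, with petal length and width as the two features):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X = iris.data[:, 2:4]                  # petal length, petal width
y = (iris.target == 2).astype(int)     # Iris-Virginica vs the rest
log_reg = LogisticRegression(max_iter=1000).fit(X, y)

# The decision boundary is the set of points where
#   theta0 + theta1*x1 + theta2*x2 = 0   (estimated probability = 0.5).
# Solving for x2 gives x2 = -(theta1*x1 + theta0) / theta2, which is
# exactly the line in the course code; left_right is just the x1 values
# to plot over.
left_right = np.array([2.9, 7.0])
boundary = -(log_reg.coef_[0][0] * left_right + log_reg.intercept_[0]) / log_reg.coef_[0][1]

# Every point on that line gets probability ~0.5 from the model.
pts = np.column_stack([left_right, boundary])
print(log_reg.predict_proba(pts)[:, 1])
```
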

Hi Rajtilak,

It seems there is some gap in understanding.

The post you have referred to relates to an altogether different query. The present question under consideration is about classification on the Iris dataset.

Just to reiterate: the code for plotting the chart (as given in the GitHub repository) contains an equation on the basis of which the 'boundary' is calculated. The snapshot can be accessed using this link (as image uploading is not working in comments right now).

boundary = -(log_reg.coef_[0][0] * left_right + log_reg.intercept_[0]) / log_reg.coef_[0][1]

I need your inputs on that part.

Hi,

I have also replied to this above. Could you please check?

Thanks.

I have got that, thanks!

Hi All,

I am confused when looking at the exp(-t) in the sigmoid function. Can someone please tell me what "t" is in the sigmoid equation? As we know, X is the feature values, theta is the weights, and y is the actual value. So my question is: what is 't'? Where do we get the value of t in the sigmoid function?

sigma(t) = 1 / (1 + exp(-t))

Thanks in advance.

Regards,
Jayant

Hi All,

It has been elaborated in the session itself.

Thanks,
Jayant

Hi,

Here t is the linear combination of the inputs: t = theta^T x, i.e. the weighted sum of the feature values plus the bias. The sigmoid is a function of this t. Sigmoid is also one of the most widely used non-linear activation functions in neural networks: it transforms values into the range 0 to 1. A noteworthy point is that, unlike the binary step and linear functions, sigmoid is non-linear, which means that when we have multiple neurons with sigmoid as their activation function, the output is non-linear as well.

Thanks.

-- Rajtilak Bhattacharjee
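A small sketch of that relationship (the weights and instance here are made-up numbers, purely for illustration):

```python
import numpy as np

def sigmoid(t):
    """sigma(t) = 1 / (1 + exp(-t)): squashes any real t into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))

# In logistic regression, t is the weighted sum theta^T x.
theta = np.array([0.5, -1.0, 2.0])   # hypothetical weights (first is the bias)
x = np.array([1.0, 3.0, 0.5])        # an instance, with a leading 1 for the bias
t = theta @ x                        # t = 0.5 - 3.0 + 1.0 = -1.5
p_hat = sigmoid(t)                   # estimated probability of the positive class
print(t, p_hat)
```
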

Sir, for a classification problem we can use either logistic regression or SVM, but is it better to choose SVM because the margin it provides leads to better accuracy? Please correct me if I am wrong.

Hi,

SVM tries to find the "best" margin (the distance between the separating line and the support vectors), which reduces the risk of error on the data; logistic regression does not, and can instead end up with different decision boundaries, with different weights, that are near the optimal point.

SVM works well with unstructured and semi-structured data like text and images, while logistic regression works with already identified independent variables.

SVM is based on the geometrical properties of the data, while logistic regression is based on statistical approaches.

The risk of overfitting is lower in SVM, while logistic regression is more vulnerable to overfitting.

We need to keep these differences in mind while choosing between logistic regression and SVM.

Thanks.

-- Rajtilak Bhattacharjee
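For a concrete feel (Iris again, with an assumed feature choice), both are linear classifiers, just with different objectives, so the fitted weights, and hence the boundary, differ:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

iris = load_iris()
X = iris.data[:, 2:4]                 # petal length, petal width
y = (iris.target == 2).astype(int)    # Iris-Virginica vs the rest

# Logistic regression minimizes log loss; LinearSVC maximizes the margin
# via hinge loss. Both learn a linear boundary, but the fitted weights
# come out different.
log_reg = LogisticRegression(max_iter=1000).fit(X, y)
svm_clf = LinearSVC(C=1.0, max_iter=10000).fit(X, y)

print(log_reg.score(X, y), svm_clf.score(X, y))
print(log_reg.coef_, svm_clf.coef_)
```
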

Is the role of the activation function only to bring the instance values down to between 0 and 1? If so, we could do that using feature scaling instead.

Hi,

An activation function decides whether a neuron should be activated or not, by calculating a weighted sum and adding a bias to it. The purpose of the activation function is to *introduce non-linearity* into the output of a neuron.

A neural network has neurons that work according to their *weights, biases* and respective activation functions. We update the weights and biases of the neurons on the basis of the error at the output, a process known as *back-propagation*. Activation functions make back-propagation possible, since the gradients are supplied along with the error to update the weights and biases.

A neural network without an activation function is essentially just a linear regression model. The activation function performs the non-linear transformation of the input that makes the network capable of learning and performing more complex tasks.

Thanks.

-- Rajtilak Bhattacharjee
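A quick numerical check of that last point (random made-up weights; no real network needed):

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(4)                           # a made-up input vector
W1, W2 = rng.randn(3, 4), rng.randn(2, 3)  # two "layers" of weights

# Without an activation, stacking layers collapses to one linear map:
two_linear_layers = W2 @ (W1 @ x)
single_layer = (W2 @ W1) @ x
print(np.allclose(two_linear_layers, single_layer))   # True

# A sigmoid between the layers breaks that collapse -- this is the
# non-linearity that lets a network model more than linear regression.
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
with_activation = W2 @ sigmoid(W1 @ x)
print(np.allclose(with_activation, single_layer))     # False
```
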

Thanks, now I've got this concept.

In logistic regression, why do we use this command? Please explain.

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
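In case it helps, a minimal sketch of what that line produces and why the reshape is needed:

```python
import numpy as np

# linspace(0, 3, 1000): 1000 evenly spaced values from 0 to 3 -- a fine
# grid of inputs (e.g. petal widths) to evaluate the model on, so the
# plotted probability curve looks smooth.
# reshape(-1, 1): turn the flat array into a 1000 x 1 column, because
# scikit-learn's predict expects a 2-D (n_samples, n_features) array;
# -1 means "infer this dimension from the array's size".
X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
print(X_new.shape)           # (1000, 1)
print(X_new[0], X_new[-1])   # [0.] [3.]
```
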

Hi,

I am not able to find the training_linear_models file in the machine learning folder. Can you please help me out? Thanks in advance.

It would be better to clone it in your CloudxLab account:

git clone git@github.com:cloudxlab/ml.git

(or use the HTTPS URL https://github.com/cloudxlab/ml.git if you don't have SSH keys set up). After cloning, you will find the notebook in Jupyter.

Also, here is a direct link: https://github.com/cloudxla...