32 Comments
I came across this method of plotting decision boundaries in the SVM documentation; I hope it might be useful for some of you. It saves you from writing your own functions for plotting decision boundaries. Here is an example:
from sklearn.svm import SVC
from sklearn.inspection import DecisionBoundaryDisplay
import matplotlib.pyplot as plt

model = SVC(kernel='rbf', gamma=0.1, C=1000)  # X, y: 2-D training features and labels
classifier = model.fit(X, y)
disp = DecisionBoundaryDisplay.from_estimator(classifier, X, response_method="predict", xlabel='x1', ylabel='x2', cmap=plt.cm.brg, alpha=0.1)
disp.ax_.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k")
I did not understand the article.
How can I define C=100 in an SVM regression model?
Hi,
C is a regularization hyperparameter: a smaller C penalizes errors less, and a larger C penalizes them more.
Thanks.
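For illustration, here is a minimal sketch (my own, not from the lecture) of setting C on scikit-learn's SVR, the regression counterpart of SVC, using hypothetical toy data:

```python
import numpy as np
from sklearn.svm import SVR

# Toy 1-D regression data: y = 2x + noise
rng = np.random.RandomState(42)
X = np.sort(rng.uniform(0, 5, 40)).reshape(-1, 1)
y = 2 * X.ravel() + rng.normal(scale=0.3, size=40)

# C is passed to the constructor; a larger C fits the training data more tightly
svr_loose = SVR(kernel='linear', C=0.01).fit(X, y)
svr_tight = SVR(kernel='linear', C=100).fit(X, y)
print(svr_loose.score(X, y), svr_tight.score(X, y))
```

The heavily regularized model (C=0.01) keeps the weights small and underfits, while C=100 tracks the data closely; the right value is normally chosen by cross-validation.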
When are the LinearSVC and SVC classes used?
Hi,
Good question!
Please go through the below link to understand the difference between SVC and Linear SVC:
https://stackoverflow.com/questions/45384185/what-is-the-difference-between-linearsvc-and-svckernel-linear
Thanks.
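As a quick illustration (my own sketch, not from the course materials): both classes can fit a linear decision boundary, but LinearSVC uses the liblinear solver and scales better to large datasets, while SVC uses libsvm and supports the kernel trick:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC, LinearSVC

# Two well-separated clusters (hypothetical toy data)
X, y = make_blobs(n_samples=200, centers=[[-3, -3], [3, 3]], random_state=0)

lin = LinearSVC(C=1, max_iter=10000).fit(X, y)  # liblinear solver, scales to large datasets
svc = SVC(kernel='linear', C=1).fit(X, y)       # libsvm solver, supports the kernel trick
print(lin.score(X, y), svc.score(X, y))
```

On linearly separable data like this, both reach essentially the same accuracy; the practical difference shows up in training time and in whether you need a non-linear kernel.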
I did not understand this at 28:40:
plot_predictions(poly_kernel_svm_clf, [-1.5, 2.5, -1, 1.5])
I did not understand what this is. Where did these values come from?
Hi,
plot_predictions is a helper function defined in the notebook that plots a model's predictions over a region of the feature space. The list [-1.5, 2.5, -1, 1.5] gives the axis limits of that region: [xmin, xmax, ymin, ymax].
Thanks.
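Here is my own sketch of what such a helper might look like, assuming the list is [xmin, xmax, ymin, ymax] and the classifier takes two features:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.svm import SVC

def plot_predictions(clf, axes):
    """Shade the region axes = [xmin, xmax, ymin, ymax] by the classifier's predictions."""
    x0s = np.linspace(axes[0], axes[1], 100)
    x1s = np.linspace(axes[2], axes[3], 100)
    x0, x1 = np.meshgrid(x0s, x1s)
    X_grid = np.c_[x0.ravel(), x1.ravel()]
    y_pred = clf.predict(X_grid).reshape(x0.shape)
    plt.contourf(x0, x1, y_pred, alpha=0.3)

# Usage on toy moons data, mirroring the lecture's polynomial-kernel example
X, y = make_moons(noise=0.1, random_state=42)
clf = SVC(kernel="poly", degree=3, coef0=1, C=5).fit(X, y)
plot_predictions(clf, [-1.5, 2.5, -1, 1.5])
plt.scatter(X[:, 0], X[:, 1], c=y)
```

The function simply evaluates the classifier on a dense grid over the given window and shades each grid cell by the predicted class, which is what draws the decision boundary.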
One humble request to the instructor: please do not mix eating and drinking with teaching. If it is unavoidable, please take a break or mute yourself, finish eating and drinking, and then resume the class fresh.
Hi,
Thank you for your feedback.
Thanks.
While performing GridSearchCV, which scoring method should be used? And for both SVC and SVR, what is the loss function used to estimate the error?
Hi,
About the scoring method, it depends on the problem at hand. You can find more about it here:
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
SVC and SVR both have built-in loss functions; you can find more about them here:
https://towardsdatascience.com/optimization-loss-function-under-the-hood-part-iii-5dff33fa015d
Thanks.
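As an illustration (my own sketch, not from the lecture): the scoring method is passed via the scoring parameter of GridSearchCV. 'accuracy' is a common choice for SVC, and 'neg_mean_squared_error' for SVR:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]}

# scoring='accuracy' tells the search how to rank parameter combinations
search = GridSearchCV(SVC(kernel='rbf'), param_grid, scoring='accuracy', cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

For a regression model you would swap in SVR and a regression scorer such as 'neg_mean_squared_error' or 'r2'.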
Sir,
I have completed topics 1, 3, and 5, so why is it showing this?
Hi,
You are yet to complete the following:
Topic# 1: Assessment# 36
Topic# 3: Assessment# 13
Topic# 5: Assessment# 5
Thanks.
Hi,
I believe a portion of the previous class is missing (on the topics SVC Polynomial Kernel + StandardScaler and SVC RBF Kernel + StandardScaler).
Could you please add the missing portion of the lecture or suggest some resources to read about the same?
Yes, same here.
Hi Abijith,
Thanks for your feedback, the videos are updated and the sections are added.
Happy Learning.
In the above video at 1hr 53min, in the decision tree after the first split on petal length, the right node's value is [0, 50, 50]: it contains 50 versicolor and 50 virginica samples. How does it fall under class versicolor?
When the class counts are equal like this, how do we decide which class the node is assigned to?
Hi,
That is a good question, but hard to visualize. It is based on the Gini index and on optimizing the cost of the nodes. Let me try to explain it.
1) Splitting a dataset means separating a dataset into two lists of rows given the index of an attribute and a split value for that attribute.
2) Once we have the two groups, we can then use our Gini score to evaluate the cost of the split, checking if the attribute value is below or above the split value and assigning it to the left or right group respectively.
3) Once the best split is found, we can use it as a node in our decision tree.
This is an exhaustive and greedy algorithm.
4) Each node stores the attribute and split value that were chosen, along with the two groups of data produced by that split; samples reaching a node are assigned to it automatically. A node predicts the majority class of its samples, and when the counts tie, as in [0, 50, 50], scikit-learn predicts the class with the lowest index, which here is versicolor.
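The steps above can be sketched as follows (a minimal illustration of CART-style splitting with a toy dataset, not the course's code):

```python
def gini(groups, classes):
    """Weighted Gini impurity of a candidate split (lower is better)."""
    n = sum(len(g) for g in groups)
    score = 0.0
    for g in groups:
        if not g:
            continue
        props = [[r[-1] for r in g].count(c) / len(g) for c in classes]
        score += (1.0 - sum(p * p for p in props)) * len(g) / n
    return score

def best_split(dataset):
    """Exhaustively and greedily try every attribute/value pair; keep the lowest-Gini split."""
    classes = sorted({r[-1] for r in dataset})
    best = None
    for idx in range(len(dataset[0]) - 1):
        for row in dataset:
            left = [r for r in dataset if r[idx] < row[idx]]
            right = [r for r in dataset if r[idx] >= row[idx]]
            cost = gini((left, right), classes)
            if best is None or cost < best[2]:
                best = (idx, row[idx], cost, (left, right))
    return best

# Tiny toy dataset: rows of [feature, class label]
data = [[1.0, 0], [1.5, 0], [2.0, 0], [3.5, 1], [4.0, 1], [4.5, 1]]
idx, value, cost, groups = best_split(data)
print(idx, value, cost)
```

On this toy data the best split is at feature 0, value 3.5, which separates the classes perfectly (Gini cost 0); that split would become a node of the tree.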
This is clearly explained in the lecture.
You can refer to this article for more intuition: https://machinelearningmastery.com/implement-decision-tree-algorithm-scratch-python/
For implementations: https://scikit-learn.org/stable/modules/tree.html#
All the best!
Hi Sir,
In this video at 49 minutes 15 seconds, a question was asked: "If there are 50 features, then how many weights will there be?" What is the answer, and how can we calculate it? Please explain.
There is a huge disconnect between where the Session 16 video ended and where the Session 17 video starts.
I have gone through Session 16 thoroughly, and it ends at Slide 115 (as per the PDF), where we discussed the code for the moons data.
Session 17's video should start from Slide 117 (Polynomial Kernel), but it jumps straight to Slide 143 (Computational Complexity).
What happened to the discussion of the slides in between? Please help with that; I am losing my continuity with the SVM discussions.
Yes Siddharth, there is a huge disconnect/gap between Sessions 16 & 17.
Session 16 ends abruptly with Polynomial Features + Standard Scaler + Linear SVC as the last point of discussion, followed by the code in the Jupyter notebook.
Session 17 skips the 2 concepts, viz.:
b) SVC Polynomial Kernel + StandardScaler
c) SVC RBF Kernel + StandardScaler
(Both form part of the non-linear classification techniques.)
Session 17 skips the aforesaid concepts and, to be frank, starts directly with SVM Regression.
These can be pinpointed because I have made notes of the explanations given by our faculty, Sandeep. In many lectures the concepts are precise and crisp, but there are slight deviations from the main concepts while explaining SVM, and moreover there are errors in some SVM code in the official Jupyter notebook (as seen in Lecture 16), so this may have escaped the faculty/trainer's attention.
Also, the important concepts of Kernel & Gamma are missing from the explanation. Kindly correct me if I am wrong.
Can someone explain, or point me to a theoretical explanation of, the aforesaid concepts? The code itself can be managed by referring to the Jupyter notebooks.
I kindly request the CloudxLab team to look into this wide gap in topic explanation between Session 16 and Session 17 by providing notes as reference material for the missing portion. I hope the CloudxLab team will respond favourably.
Hi Sameer,
Thanks for your feedback, the videos are updated and the sections are added.
Happy Learning.
Hi Siddharth,
Thanks for your feedback, the videos are updated and the sections are added.
Happy Learning.
Hi,
I am applying SVM to my dataset, which has 9 features, but with every kernel trick my Jupyter notebook kernel goes busy and does not respond. Why is this modelling affecting the kernel, and why is it not responding? Please help.
Hi,
Would request you to let us know which dataset you are working with and what the shape of the dataset is; also, please send us a screenshot of your code.
Thanks.
-- Rajtilak Bhattacharjee
Hi,
I am working on a dataset of shape (80000, 9), but when I tried SVM on this binary classification problem it did not run, and my kernel just goes busy.
Thanks
Hi,
Would request you to restart your server using all the steps mentioned in the below link:
https://discuss.cloudxlab.c...
Also, would request you to share a screenshot of your code.
Thanks.
-- Rajtilak Bhattacharjee
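A likely cause, as a side note (my own, not the staff's reply): kernelized SVC training time grows faster than quadratically with the number of samples, so 80,000 rows can appear to hang indefinitely. A sketch of a linear alternative that scales far better, using synthetic stand-in data of the same shape:

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Hypothetical stand-in for the (80000, 9) dataset
X, y = make_classification(n_samples=80_000, n_features=9, random_state=0)

# LinearSVC (liblinear) scales roughly linearly with the number of samples;
# dual=False is recommended when n_samples > n_features
clf = make_pipeline(StandardScaler(), LinearSVC(C=1, dual=False))
clf.fit(X, y)
print(clf.score(X, y))
```

If a non-linear boundary is needed at this scale, an SGDClassifier or a kernel approximation is usually preferred over plain SVC.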
Hi,
I want to know how we can figure out whether our data is linearly separable or non-linearly separable, and how we decide which kernel to apply to our dataset. Please help.
Thanks
For this you'll have to do some EDA (Exploratory Data Analysis) and plot some visualisations of the dataset, do some univariate or bivariate analysis, etc.
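Beyond visual EDA, one practical check (my own sketch, on toy moons data) is to cross-validate a linear and an RBF SVM and compare; if the linear model scores about as well, the data is close to linearly separable:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Deliberately non-linear toy data
X, y = make_moons(n_samples=500, noise=0.15, random_state=0)

linear_score = cross_val_score(make_pipeline(StandardScaler(), SVC(kernel='linear')), X, y, cv=5).mean()
rbf_score = cross_val_score(make_pipeline(StandardScaler(), SVC(kernel='rbf')), X, y, cv=5).mean()

print(linear_score, rbf_score)  # a large gap suggests a non-linear kernel is needed
```

Here the RBF kernel clearly outscores the linear one, which is the signal that the data is not linearly separable; on truly linear data the two scores would be close.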
For the decision trees topic, when I am converting the .dot output file to .png, the command shows the error "dot: can't open Tpng", and the image is also not loading; it appears in the file listing.
Hi,
You need to mention the path along with the file name.
Thanks.
-- Rajtilak Bhattacharjee