Dimensionality Reduction

Dimensionality Reduction, Part 1


Comments

In the case of PCA, I suppose we are fixing one axis first, so how can it have multiple orthogonal axes?


Hi,

Principal Component Analysis chooses the first axis as the line that passes through the centroid of the data and minimizes the sum of squared distances from each point to that line. In this sense, the line is as close to all of the data as possible; equivalently, it points along the direction of maximum variation in the data. Each subsequent axis is chosen in the same way, but constrained to be orthogonal to all of the axes before it, which is how PCA yields multiple orthogonal axes.
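Here is a minimal sketch on synthetic data (not from the course material) verifying with scikit-learn that the resulting axes are mutually orthogonal:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(42)
X = rng.randn(200, 3) @ rng.randn(3, 3)   # correlated 3-D data

pca = PCA(n_components=3).fit(X)

# Each row of components_ is one principal axis (a unit vector), so
# components_ @ components_.T is approximately the identity matrix,
# i.e. the axes are pairwise orthogonal.
print(np.round(pca.components_ @ pca.components_.T, 6))
print(pca.explained_variance_ratio_)      # variance captured by each axis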

Thanks.


In the case of an image-recognition ML problem, the sparseness of the data may depend on the number of dimensions. But in other cases, say data collected on the features of a certain product to build an ML model, if we remove any particular dimension, the data collected for that dimension is also removed. So in this case, how can the sparseness of the data be said to depend on the number of dimensions? I am confused because dimensionality reduction also implies feature selection, so when a feature is removed, the data collected against it is removed as well. Please explain.


Hi,

Dimensionality reduction refers to techniques that reduce the number of input variables in a dataset. More input features often make a predictive modeling task more challenging, a problem generally referred to as the curse of dimensionality: the performance of machine learning algorithms can degrade when there are too many input variables.

Fewer input dimensions often mean correspondingly fewer parameters, or a simpler structure in the machine learning model, referred to as degrees of freedom. A model with too many degrees of freedom is likely to overfit the training dataset and therefore may not perform well on new data.

It is desirable to have simple models that generalize well, and in turn input data with few input variables. This is particularly true for linear models, where the number of inputs and the degrees of freedom of the model are often closely related.

Also note that dimensionality reduction is broader than feature selection: selection drops columns outright (along with the data collected for them), whereas extraction techniques such as PCA project the data onto new axes that combine information from all of the original features.
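Here is a minimal sketch on synthetic data (an assumed example, not from the course) contrasting feature selection with feature extraction via PCA:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(100, 5)

# Selection: keep 2 of the original columns; the other 3 are discarded.
X_selected = X[:, [0, 1]]

# Extraction: PCA builds 2 new axes as combinations of ALL 5 columns,
# so no original feature is thrown away outright.
X_projected = PCA(n_components=2).fit_transform(X)

print(X_selected.shape, X_projected.shape)   # (100, 2) (100, 2)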

Thanks.


Thanks for the detailed explanation of dimensionality reduction and its importance in ML. But am I correct to assume that the sparseness of data due to a larger number of dimensions is particularly related to image-recognition problems? In statistics we reduce the dimensions/features by factor analysis, where only the most significant features/dimensions are retained for the final analysis.


Hi,

Good question!

Dimensionality reduction has far wider application than image-related problems alone. For example, we can use it in spam classifiers, where every distinct word in the messages becomes a feature, so the data is both very high-dimensional and sparse.
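Here is a minimal sketch with toy documents (an assumed example) of reducing such text features; TruncatedSVD is used rather than PCA because it accepts sparse input:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "win a free prize now",
    "meeting at noon tomorrow",
    "free money claim your prize",
    "project status update attached",
]
X = TfidfVectorizer().fit_transform(docs)   # sparse matrix, one column per word
X_reduced = TruncatedSVD(n_components=2).fit_transform(X)

print(X.shape, "->", X_reduced.shape)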

Thanks.


Hi,
Why have we added 1 to this expression while computing the number of features after applying PCA, i.e. d = np.argmax(cumsum >= 0.95) + 1?


Hi,

This is because np.argmax returns the zero-based index of the first element where the condition cumsum >= 0.95 is true; adding 1 converts that index into the number of dimensions to keep.

Thanks.

-- Rajtilak Bhattacharjee


Hi,
I didn't get your point. Please clarify.
Thanks


Hi,

Can you try the code without adding the 1 and then share the output?

Thanks.

-- Rajtilak Bhattacharjee


Hi,
The output is 153 without adding the 1, but if we add 1 it shows 154 features.


Hi,

Please find the explanation here:
https://discuss.cloudxlab.c...

In short, np.argmax over the boolean array cumsum >= 0.95 returns the zero-based index of the first True value, so adding 1 turns that index into the number of dimensions needed to preserve 95% of the variance.
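Here is a minimal sketch on synthetic data (not the course's MNIST notebook, where the same computation gives d = 154):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(42)
X = rng.randn(500, 20) @ rng.randn(20, 20)   # correlated synthetic features

pca = PCA().fit(X)
cumsum = np.cumsum(pca.explained_variance_ratio_)

idx = np.argmax(cumsum >= 0.95)   # zero-based index of the first True
d = idx + 1                       # number of components to keep
print(idx, d, cumsum[idx])        # cumsum[idx] is the first value >= 0.95
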
Thanks.

-- Rajtilak Bhattacharjee


Could you please correct the notebooks where we do
from sklearn.datasets import fetch_mldata?

There is always this error: ImportError: cannot import name 'fetch_mldata'


Hi,

We have already updated our notebooks. We would request you to clone the updated files from our GitHub repository.
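For reference, here is a minimal sketch of the usual fix, assuming a recent scikit-learn release where fetch_mldata has been removed in favor of fetch_openml:

from sklearn.datasets import fetch_openml

# fetch_openml replaces the removed fetch_mldata; as_frame=False returns
# NumPy arrays rather than a pandas DataFrame.
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
X, y = mnist.data, mnist.target
print(X.shape, y.shape)   # (70000, 784) (70000,)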

Thanks.

-- Rajtilak Bhattacharjee


Thank you.


The slides are not available. Could you please fix the issue?
Thanks


Hi Jean,

Apologies for the inconvenience. We have fixed the issue, and you should now be able to view/download the slides.

Thanks.

-- Rajtilak Bhattacharjee
