Dimensionality Reduction


Dimensionality Reduction Part 2 and Bayes Theorem


Slides

Download the slides



No hints are available for this assessment

Answer is not available for this assessment


38 Comments

Hello,
I did not understand slide 60: what are W2, U, s, and V? Can you explain?
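For context: these names usually come from computing PCA via the singular value decomposition (SVD). A minimal sketch, assuming X is the data matrix on the slide:

import numpy as np

X_centered = X - X.mean(axis=0)       # PCA assumes the data is centered
U, s, Vt = np.linalg.svd(X_centered)  # SVD: X_centered = U * diag(s) * Vt
W2 = Vt.T[:, :2]                      # W2 stacks the first two principal components
X2D = X_centered.dot(W2)              # project the data onto the 2-D subspace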


Thank you very much for your constant help


After PCA, can we find out which features have been considered? If yes, how can we find that out?
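For context: PCA does not select a subset of the original features; each principal component is a linear combination of all of them. In scikit-learn, the fitted components_ attribute shows each feature's weight (loading) in each component. A minimal sketch, assuming X is your feature matrix:

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
pca.fit(X)  # X: the original feature matrix

# Rows are components, columns are original features; large absolute
# weights show which features contribute most to each component.
print(pca.components_)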


Hello,

Please clarify my doubts regarding the manifolds data.

1) During the initial data exploration of a given dataset, how would we be able to know whether the data lies on a manifold?

2) During initial data exploration we generally make a scatter plot to visualize the spread and orientation of the data. If we also have to check whether the data lies on a manifold, are there any steps we need to include to reach that conclusion, so that we know exactly what we need to do with the data?

Thanks


Hi,

Manifold Learning is a class of unsupervised estimators that seeks to describe datasets as low-dimensional manifolds embedded in high-dimensional spaces. So if you have a high-dimensional dataset and you want to reduce the number of dimensions, you can use the manifold learning classes. Now, to check the shape of the dataset, you can plot it using Matplotlib; see the sketch after the link below. The code can be found in the Jupyter notebook for this course:

ml/dimensionality_reduction.ipynb at master · cloudxlab/ml (github.com)
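A minimal sketch of such a plot, using scikit-learn's built-in Swiss roll (a 2-D sheet rolled up in 3-D) as a stand-in for your own data:

import matplotlib.pyplot as plt
from sklearn.datasets import make_swiss_roll

# A classic toy manifold: a 2-D sheet rolled up in 3-D space
X, t = make_swiss_roll(n_samples=1000, noise=0.1, random_state=42)

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=t, cmap="viridis")
ax.set_title("Swiss roll: a 2-D manifold embedded in 3-D")
plt.show()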

Thanks.


The slides for dimensionality reduction aren't accessible and haven't been updated with the archives. Please do so.


Hi,

I have updated the link, could you please check and let me know if it is working now.

Thanks.


I feel the archive page is incorrect at the end. Moreover, the video has 105 slides and this PPT has 103. Can you please recheck?


Hi,

I have fixed this too, could you please recheck? As for a single place to download all the slides, we do not have that facility as of now, so we would request you to download the slides as and when required.

Thanks.


Thanks, but I still feel there's a problem with the slides. Pages 57 and 105 are the same, and there isn't any archives page at 105.


Hi,

Slide# 104 onwards are the archive slides. The presentation may have changed slightly as we keep updating our course materials.

Thanks.


Yes, I understand the files keep changing. What I'm trying to say is that the archives are supposed to be references, but page 105 is basically page 57. Please recheck this.

Thanks


This is slide number 105, where the archives should be a URL or related documents?


Hi,

You are correct! As of now we do not have any content for the archive section, so this is the complete set of slides.

Thanks.


Also, is there any place where we can get all the slides in one place, for convenience?

 


Is there any WhatsApp group for CloudXLab learners, as mentioned by Sandeep sir in this session?


Hi,

There is no WhatsApp group for learners as of now. However, you can post your queries here in your comments, in our discussion forum, or you can mail us at reachus@cloudxlab.com.

Thanks.

-- Rajtilak Bhattacharjee


Hi,
If manifold techniques are better than PCA, should we always prefer manifold techniques for dimensionality reduction?
Thanks


Hi,

Manifold learning is used in the case of non-linear data. So, it depends on the data you are working with.

Thanks.

-- Rajtilak Bhattacharjee


Hi,
In kPCA, how would we choose the gamma values to search over for selecting the best gamma in the code below?

param_grid = [{
    "kpca__gamma": np.linspace(0.03, 0.05, 10),
    "kpca__kernel": ["rbf", "sigmoid"],
}]


Hi,

One of the most common methods of model selection (in this case, selecting the parameter gamma) is cross-validation. The idea is to hold out a subset of your data that you will not use for training, and then compare the cost functions associated with the two sets (training and CV) in order to find the "sweet spot" between high variance and high bias. You can find more details here:
https://stats.stackexchange...
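A hedged sketch of that search with scikit-learn's GridSearchCV, assuming X and y are your labelled training data and that a logistic regression stacked on the kPCA supplies the CV score (kPCA by itself has no predictive score to optimize):

import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

clf = Pipeline([
    ("kpca", KernelPCA(n_components=2)),
    ("log_reg", LogisticRegression()),
])

param_grid = [{
    "kpca__gamma": np.linspace(0.03, 0.05, 10),
    "kpca__kernel": ["rbf", "sigmoid"],
}]

grid_search = GridSearchCV(clf, param_grid, cv=3)
grid_search.fit(X, y)            # X, y: your labelled dataset
print(grid_search.best_params_)  # the best gamma and kernel found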
Thanks.

-- Rajtilak Bhattacharjee


Hi
Can a memory map be used in any kind of PCA implementation?


Hi,

Memory mapping and PCA are not related, but can be used in conjunction.
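A sketch of using them together, where the filename and shape are placeholders: NumPy memory-maps a dataset that lives on disk, and IncrementalPCA reads it in batches instead of loading it all into memory:

import numpy as np
from sklearn.decomposition import IncrementalPCA

m, n = 60000, 784  # placeholder shape of the on-disk array
X_mm = np.memmap("my_data.dat", dtype="float32", mode="r", shape=(m, n))

inc_pca = IncrementalPCA(n_components=154, batch_size=m // 100)
inc_pca.fit(X_mm)  # processes the memory-mapped data batch by batch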
Thanks.

-- Rajtilak Bhattacharjee


Hi,
In randomized PCA, there is a possibility of leaving some data behind when choosing random samples. So, compared to randomized PCA, would it be better to choose incremental PCA over batch PCA?


Hi,

Incremental principal component analysis (IPCA) is typically used as a replacement for principal component analysis (PCA) when the dataset to be decomposed is too large to fit in memory. IPCA builds a low-rank approximation for the input data using an amount of memory which is independent of the number of input data samples. It is still dependent on the input data features, but changing the batch size allows for control of memory usage. You can find more information here:
https://scikit-learn.org/st...
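A minimal sketch, assuming X_train is a dataset already in memory that we feed to IPCA in 100 mini-batches:

import numpy as np
from sklearn.decomposition import IncrementalPCA

inc_pca = IncrementalPCA(n_components=154)
for X_batch in np.array_split(X_train, 100):  # split into 100 mini-batches
    inc_pca.partial_fit(X_batch)              # update the fit incrementally

X_reduced = inc_pca.transform(X_train)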
Thanks.

-- Rajtilak Bhattacharjee


Hi
Can we have some mock interviews?
Thanks


Hi,

Thank you for your suggestion. We will look into this and will get back to you.

Thanks.

-- Rajtilak Bhattacharjee


Thanks for considering. It would be really great if this could be arranged. Looking forward to it.
Thanks


Hi Prachi / CloudxLab team, I am also looking forward to a mock interview session.
It will be of great help.

Best Wishes! Sasmita


Hi Sasmita,

As of now we do not have any provision for mock interview sessions.

Thanks.


In slide no. 61 of Dimensionality Reduction, please explain: in "d = np.argmax(cumsum >= 0.95) + 1", what is the significance of the "+1"?
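For context: cumsum there holds the cumulative explained-variance ratios, and np.argmax(cumsum >= 0.95) returns the 0-based index of the first component at which the cumulative sum reaches 95%. Since indices start at 0, the +1 converts that index into a count of components. A sketch, assuming the slide follows the usual pattern and X_train is the training set:

import numpy as np
from sklearn.decomposition import PCA

pca = PCA()
pca.fit(X_train)  # X_train: your training set
cumsum = np.cumsum(pca.explained_variance_ratio_)

# argmax over a boolean array returns the index of the first True;
# indices start at 0, so +1 turns the index into a component count.
d = np.argmax(cumsum >= 0.95) + 1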


I can't download the slides for XGBoost and Naive Bayes.


When we used the MNIST dataset to form an image, we used 28*28 pixels. But when we reduced the number of dimensions to 157, how will we form a 28*28 image, since we no longer have all 784 features?


Hi, Vinod.
Good question.

That is exactly the idea behind PCA (Principal Component Analysis) and dimensionality reduction: by using the minimum number of dimensions (components), you are still able to retrieve most of the information.

All the best!


We can recover the data with, say, the inverse transform of PCA, if we used PCA for dimensionality reduction. This will give us back all 784 features, but some information is lost in the recovery process; this loss is termed the reconstruction error.
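A minimal sketch, assuming X holds the MNIST images as rows of 784 pixel features:

from sklearn.decomposition import PCA

pca = PCA(n_components=157)
X_reduced = pca.fit_transform(X)                # 784 -> 157 features
X_recovered = pca.inverse_transform(X_reduced)  # back to 784 features

# Each recovered row reshapes to 28x28 for plotting; the difference
# from the original image is the reconstruction error.
some_digit_image = X_recovered[0].reshape(28, 28)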


What is the purpose of each dimension being orthogonal to the others? What happens if those dimensions are not orthogonal?


Hi, Vinod.

If the dimensions are orthogonal to each other, then you can clearly distinguish between the components, and the effect of one will not distort the other.
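A quick sketch to check this with scikit-learn, where X is a placeholder feature matrix: the principal components PCA finds are mutually orthogonal, so their dot product is (numerically) zero:

import numpy as np
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
pca.fit(X)  # X: any feature matrix with at least 2 features

c1, c2 = pca.components_
print(np.dot(c1, c2))  # ~0.0: the components are orthogonal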

All the best!
