Hello
I did not understand w2, u, s, and v in slide 60. What are they? Can you explain?
Hi, You can refer to https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html for the details.
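For reference, a minimal sketch of how those names are typically used in the PCA-via-SVD demo; the toy data and the variable name W2 (the first two principal axes) are assumptions based on the notebook's convention:

import numpy as np

# Toy data standing in for the slide's dataset.
X = np.random.rand(100, 3)
X_centered = X - X.mean(axis=0)

# u, s, v: the SVD factors of the centered data matrix,
# X_centered = U @ diag(s) @ Vt, where s holds the singular values.
U, s, Vt = np.linalg.svd(X_centered)

# W2: the first two principal axes (columns of V), used to
# project the data onto the top two principal components.
W2 = Vt.T[:, :2]
X2D = X_centered @ W2
print(X2D.shape)  # (100, 2)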
Thank you very much for your constant help.
After PCA, can we find out which features have been considered? If yes, then how can we find out?
Hello,
Please clarify my doubts regarding manifold data.
1) During the initial data exploration of a given dataset, how would we be able to know whether the data lies on a manifold?
2) During initial data exploration we generally make a scatterplot of the data to visualize its spread and orientation. So if we also have to check whether the data is a manifold or not, are there any steps we need to include to arrive at this conclusion, so that we know exactly what we need to do with the data?
Thanks
Hi,
Manifold learning is a class of unsupervised estimators that seeks to describe datasets as low-dimensional manifolds embedded in high-dimensional spaces. So if you have a high-dimensional dataset and you want to reduce the number of dimensions, you can use the manifold learning classes. Now, to check the shape of the dataset, you can plot it using Matplotlib. The code can be found in the Jupyter notebook for this course:
ml/dimensionality_reduction.ipynb at master · cloudxlab/ml (github.com)
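For example, a hedged sketch of such a plot, using scikit-learn's Swiss roll (a classic 2D manifold embedded in 3D) as a stand-in for your own data:

import matplotlib.pyplot as plt
from sklearn.datasets import make_swiss_roll

# Swiss roll: a 2D sheet rolled up inside 3D space.
X, t = make_swiss_roll(n_samples=1000, noise=0.2, random_state=42)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection="3d")
ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=t, cmap=plt.cm.hot)
ax.set_xlabel("x1"); ax.set_ylabel("x2"); ax.set_zlabel("x3")
plt.show()

If the points trace out a curved sheet or ribbon like this, rather than filling the space, that is a visual hint the data lies on a lower-dimensional manifold.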
Thanks.
Slides for dimensionality reduction aren't accessible and are not updated with archives. Please do so.
Hi,
I have updated the link. Could you please check and let me know if it is working now?
Thanks.
I feel the archive page is incorrect at the end. Moreover, the video has 105 slides and this PPT has 103. Can you please recheck?
Hi,
I have fixed this too. Could you please recheck? As for a single place to download all slides, we do not have that facility as of now. So, we would request you to download the slides as and when required.
Thanks.
Thanks, but I still feel there's a problem in the slides. Pg 57 and 105 are the same, and there isn't an archives page at 105.
Hi,
Slide# 104 onwards are the archive slides. The presentation may have changed slightly as we keep updating our course materials.
Thanks.
Upvote ShareYes i understand the files keep changing.What I'm trying to say is that archives are supposed to be references but page 105 is basically page 57.Please recheck this.
Thanks
Upvote ShareThhis is slide number 105 where archives should be a URL or related documents ?
Hi,
You are correct! As of now we do not have any content for the archive section, so this is the complete set of slides.
Thanks.
Also, is there any place we can get all the slides in one place for convenience?
Is there any WhatsApp group for CloudxLab learners, as mentioned by Sandeep sir in this session?
Hi,
There is no WhatsApp group for learners as of now. However, you can post your queries here in your comments, in our discussion forum, or you can mail us at reachus@cloudxlab.com.
Thanks.
-- Rajtilak Bhattacharjee
Hi
If manifold techniques are better than PCA, then should we always prefer manifold techniques for dimensionality reduction?
Thanks
Hi,
Manifold learning is used in the case of non-linear data. So, it depends on the data you are working with.
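As an illustration, a minimal sketch contrasting linear PCA with a manifold method (LLE) on the non-linear Swiss roll; the dataset and hyperparameters here are assumptions chosen for demonstration:

from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding

X, t = make_swiss_roll(n_samples=1000, noise=0.2, random_state=42)

# Linear projection: squashes the roll, mixing faraway points.
X_pca = PCA(n_components=2).fit_transform(X)

# Manifold method: preserves local neighborhoods and "unrolls" it.
lle = LocallyLinearEmbedding(n_components=2, n_neighbors=10,
                             random_state=42)
X_lle = lle.fit_transform(X)
print(X_pca.shape, X_lle.shape)  # (1000, 2) (1000, 2)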
Thanks.
-- Rajtilak Bhattacharjee
Hi
In KPCA, how would we choose the gamma values in the options for selecting the best gamma in the code below:
param_grid = [{
"kpca__gamma": np.linspace(0.03, 0.05, 10),
"kpca__kernel": ["rbf", "sigmoid"]}]
Hi,
One of the most common methods of model selection (in this case, of the parameter gamma) is cross-validation. The idea is to hold out a subset of your data that you will not use for training your algorithm, and then compare the cost functions associated with the two sets (training and CV) in order to find the "sweet spot" between high variance and high bias. You can find more details here:
https://stats.stackexchange...
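For instance, a minimal sketch of cross-validated selection of gamma with GridSearchCV over that same param_grid; the dataset (make_moons) and the downstream classifier (LogisticRegression) are assumptions for illustration:

import numpy as np
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy non-linear dataset standing in for your own labeled data.
X, y = make_moons(n_samples=200, noise=0.1, random_state=42)

clf = Pipeline([
    ("kpca", KernelPCA(n_components=2)),
    ("log_reg", LogisticRegression()),
])

param_grid = [{
    "kpca__gamma": np.linspace(0.03, 0.05, 10),
    "kpca__kernel": ["rbf", "sigmoid"],
}]

# 3-fold cross-validation scores each (gamma, kernel) combination
# on held-out folds and keeps the best one.
grid_search = GridSearchCV(clf, param_grid, cv=3)
grid_search.fit(X, y)
print(grid_search.best_params_)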
Thanks.
-- Rajtilak Bhattacharjee
Hi
Can memory mapping be used in any kind of PCA implementation?
Hi,
Memory mapping and PCA are not related, but can be used in conjunction.
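For example, a hedged sketch of using a NumPy memmap together with IncrementalPCA, so the full dataset never has to fit in RAM; the file name, shapes, and batch size are made up for the demo:

import numpy as np
from sklearn.decomposition import IncrementalPCA

n_samples, n_features = 10_000, 784

# Create a demo file on disk; in practice it would already exist.
np.random.rand(n_samples, n_features).astype(np.float32).tofile("data.bin")

# Memory-map the file: pages are loaded from disk only on demand.
X_mm = np.memmap("data.bin", dtype=np.float32, mode="r",
                 shape=(n_samples, n_features))

# IncrementalPCA reads the memmapped array in mini-batches
# (batch_size must be >= n_components).
ipca = IncrementalPCA(n_components=154, batch_size=n_samples // 20)
ipca.fit(X_mm)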
Thanks.
-- Rajtilak Bhattacharjee
Hi
In randomized PCA, there would be a possibility of leaving some data behind when choosing random samples. So, as compared to randomized PCA, would it be better to choose incremental PCA over batch PCA?
Hi,
Incremental principal component analysis (IPCA) is typically used as a replacement for principal component analysis (PCA) when the dataset to be decomposed is too large to fit in memory. IPCA builds a low-rank approximation for the input data using an amount of memory which is independent of the number of input data samples. It is still dependent on the input data features, but changing the batch size allows for control of memory usage. You can find more information here:
https://scikit-learn.org/st...
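For example, a minimal sketch of IncrementalPCA consuming mini-batches via partial_fit; the array sizes and batch count are assumptions for illustration:

import numpy as np
from sklearn.decomposition import IncrementalPCA

X = np.random.rand(5_000, 784)  # stand-in for a large dataset

n_batches = 20  # each batch must hold at least n_components samples
inc_pca = IncrementalPCA(n_components=154)
for X_batch in np.array_split(X, n_batches):
    inc_pca.partial_fit(X_batch)  # one mini-batch at a time

X_reduced = inc_pca.transform(X)
print(X_reduced.shape)  # (5000, 154)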
Thanks.
-- Rajtilak Bhattacharjee
Hi
Can we have some mock interviews?
Thanks
Hi,
Thank you for your suggestion. We will look into this and will get back to you.
Thanks.
-- Rajtilak Bhattacharjee
Thanks for considering. It would be really great if this could be arranged. Looking forward to it.
Thanks
Hi Prachi/CloudxLab team, I am also looking forward to a mock interview session.
It will be of great help.
Best Wishes! Sasmita
Hi Sasmita,
As of now we do not have any provision for mock interview sessions.
Thanks.
In slide no. 61 of Dimensionality Reduction, please explain: in "d = np.argmax(cumsum >= 0.95) + 1", what is the significance of the "+1"?
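A quick illustration of that "+1", assuming cumsum holds the cumulative explained-variance ratio as in the slide: np.argmax on a boolean array returns the 0-based index of the first True, and the "+1" turns that index into a count of components.

import numpy as np

# cumsum: cumulative explained-variance ratio per component.
cumsum = np.array([0.60, 0.80, 0.90, 0.96, 1.00])

# (cumsum >= 0.95) -> [False, False, False, True, True]
# np.argmax returns the index of the first True: 3 (0-based).
d = np.argmax(cumsum >= 0.95) + 1
print(d)  # 4: the first 4 components explain >= 95% of the variance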
I can't download the slides for XGBoost and Naive Bayes.
When we used the MNIST dataset to form an image we used 28*28, but then we dimensionally reduced the number of dimensions to 157. How will we form a 28*28 image from the 784 features?
Hi, Vinod.
Good question.
That is the concept of PCA (Principal Component Analysis) and dimensionality reduction:
using a minimum number of components, you are still able to retrieve most of the information.
All the best
We can recover the data by using, say, the inverse transform of PCA, if we have used PCA for dimensionality reduction. This will give us back all 784 features, but we will lose some information in the recovery process, termed the reconstruction error.
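For instance, a minimal sketch of this round trip, with random data standing in for MNIST:

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(1_000, 784)  # stand-in for flattened 28x28 images

pca = PCA(n_components=157)
X_reduced = pca.fit_transform(X)                # (1000, 157)
X_recovered = pca.inverse_transform(X_reduced)  # back to (1000, 784)

# Reconstruction error: the information lost by dropping components.
error = np.mean(np.sum((X - X_recovered) ** 2, axis=1))
print(X_recovered.shape, round(error, 4))

# Each recovered row can be reshaped back into a 28x28 image:
first_image = X_recovered[0].reshape(28, 28)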
What is the purpose of each dimension being orthogonal to the others? What happens if those dimensions are not orthogonal?
Upvote ShareHi, Vinod.
If the dimensions are orthogonal to each other, then you will be able to clearly distinguish between the components, and the effect of one will not distort the other.
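A quick way to see this, as a hedged sketch on toy data: the principal axes PCA finds form an orthonormal set, so their Gram matrix is (approximately) the identity.

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 5)
pca = PCA(n_components=3).fit(X)

# Rows of components_ are the principal axes; for orthonormal axes
# W @ W.T is ~identity: off-diagonal ~0 means no axis distorts another.
W = pca.components_
print(np.round(W @ W.T, 6))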
All the best!