54 Comments
Hi Team,
In the LeNet architecture, the maps column has the values 6, 6, 16, 16, and so on. By "map", what I understood from the tutorial is the number of filters, but nowhere did we define what kind of filters they are. For instance, in the example of the china.jpg image, we took horizontal and vertical filters. Can you please clarify what kind of filters these are, and is there any calculation for choosing the numbers?
And at the output, the size is 10 and the rest of the parameters become null. So is this a normal ANN, where the previous layer's neurons are connected to the 10 output-layer neurons? And why 10? Is it because we classify 10 different classes of images here? In the ANN lectures we saw that the number of neurons in the output layer equals the number of classes.
Please clarify my above doubts.
Regards,
Birendra Singh
Hi,
It is the number of filters. They act as feature extractors, detecting edges, colors, shades, etc. in the earlier layers of the network, and the deeper filters act as feature extractors on top of the basic features extracted in the earlier layers. So training means modifying the values of the filters such that the final result is close to the actual label.
In the final layer, we have dense layers to output the probability of the image belonging to each class.
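As a rough sketch of the "feature extractor" idea (using NumPy and a hand-crafted vertical-edge filter purely for illustration; in a real CNN these filter values are learned during training, not written by hand):

```python
import numpy as np

# Toy 6x6 grayscale "image" with a bright vertical stripe at column 3
image = np.zeros((6, 6))
image[:, 3] = 1.0

# Hand-crafted 3x3 vertical-edge filter (a trained CNN learns such values)
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

# Valid cross-correlation with stride 1 -> a 4x4 feature map
h = image.shape[0] - kernel.shape[0] + 1
w = image.shape[1] - kernel.shape[1] + 1
feature_map = np.zeros((h, w))
for i in range(h):
    for j in range(w):
        feature_map[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

print(feature_map)  # strong positive/negative responses around column 3
```

The feature map lights up only where the filter's pattern (a left-to-right intensity change) occurs, which is what "detecting a feature" means here.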
Thanks.
Hi Team,
The number of output feature maps is equal to the number of filters, right? So when you say a 5x5 kernel outputting 200 feature maps of size 150x100, it means that from the 150x100 image we are getting 200 different feature maps by applying 200 different filters of size 5x5, right?
And does the value of the bias depend upon the number of strides?
Regards,
Birendra Singh
Hi,
Just like in an ANN, the bias and filter values are parameters that we optimize through training; the bias does not depend on the stride.
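To make the earlier numbers concrete, here is a small sketch of the parameter count for 200 filters of size 5x5 (assuming a single input channel and the usual convention of one bias per filter; note the stride does not appear anywhere in the count):

```python
# Parameter count for a conv layer: one (k x k x in_channels) kernel
# plus one bias per filter; here 200 filters of 5x5 on a 1-channel input
k, in_channels, n_filters = 5, 1, 200
weights_per_filter = k * k * in_channels             # 25 weights per filter
total_params = (weights_per_filter + 1) * n_filters  # +1 for each filter's bias

print(total_params)  # 5200
```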
Thanks.
Hi,
Pooling confers stability to deformation at initialization, but the stability changes significantly over the course of training and converges to a similar level regardless of whether pooling is used. On slide 95, in the case of AlexNet, pooling is used after C1 and C3 but not after the remaining convolutional layers, presumably because of my first statement.
My question is how would I come to know that these many pooling layers are required to get deformation stability?
Thanks!
Hi,
Unfortunately, there are no hard and fast rules for choosing the number of layers. There is no one-size-fits-all when it comes to ML/DL. It would vary with each dataset/problem at hand.
Thanks.
Hi,
On slide no. 41, the second statement says a CNN is capable of detecting multiple features anywhere in its inputs.
If I apply a rotation to an image, the position of a specific feature will change. A limitation of the feature maps output by convolutional layers is that they record the precise position of features in the input. This means that small movements of a feature in the input image will result in a different feature map. So one feature map will have different values before and after rotating the image. Then how can we say the CNN will detect the object correctly in both images?
I think there is a gap in my understanding. Kindly fill it.
Thanks!
Hi,
Great question!
You can find a detailed explanation at the below link:
https://stats.stackexchange.com/questions/239076/about-cnn-kernels-and-scale-rotation-invariance
Thanks.
In the lecture, we talk about a filter being defined as an array, but in the final classification we define the filter as conv1_fmaps = 32. How is this filter applied? Shouldn't this also be an array? Or does it mean that each individual pixel is multiplied by 32?
Hi,
Could you please refer me to the timestamp of that part of the video where we have a filter defined by the number 32?
Thanks.
Hi,
This is at 2:45:03, though it is also in the Jupyter notebook. Reproducing the screenshot below:
Hi,
These are the individual components. If you look just above that, we have height, width, channels, etc. These are also individual components. If you look below, you will see that we use them together to form the maps, strides, etc.
Thanks.
Sorry for asking again. In the previous example we had 2 filters, one vertical and one horizontal, which had 1s only in a single column and a single row respectively. In this example, what is the filter array generated?
Hi,
There are 2 convolutional layers here, conv1 and conv2. For each, we have defined an fmap, ksize, stride, and padding value. I would suggest you consult the slides to understand what each of these does. Also, a filter will not always be a simple horizontal or vertical filter; it was shown that way so that you can understand its function easily.
Thanks.
Sure, I get that. But fmap is a simple number here? Isn't fmap the filter here? The reason I am confused is that the filter is represented by a single number as opposed to an array. ksize, stride, and padding I totally understand, but my question is purely about the filter or fmap. Can a filter be a single number? Is my understanding correct?
Hi,
Here, filters is the dimensionality of the output space (i.e. the number of output filters in the convolution). It only accepts an integer input; the actual filter arrays are created and learned internally. Please go through the official documentation for more details:
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D
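To see why the integer is enough, here is a NumPy sketch of the shapes involved (the `[kh, kw, in_ch, out_ch]` layout matches TensorFlow's kernel convention; the random values stand in for the learned weights):

```python
import numpy as np

# With filters=32 and kernel_size=3 on an RGB input, the framework allocates
# a weight tensor of shape (kernel_h, kernel_w, in_channels, out_channels)
kh, kw, in_ch, out_ch = 3, 3, 3, 32
kernels = np.random.randn(kh, kw, in_ch, out_ch)

# Each of the 32 slices kernels[:, :, :, i] is one learnable filter array
one_filter = kernels[:, :, :, 0]
print(kernels.shape)     # (3, 3, 3, 32)
print(one_filter.shape)  # (3, 3, 3)
```

So conv1_fmaps = 32 only tells the layer how many such arrays to create; the arrays themselves are initialized randomly and shaped from the kernel size and the input's channel count.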
Thanks.
Hello,
Can you please explain on what basis we need to decide the number of filters, kernel size, stride, number of convolution layers, pooling layer, etc?
Regards,
Sekar MP
Hi,
This should explain it in detail:
https://stackoverflow.com/questions/36243536/what-is-the-number-of-filter-in-cnn
Thanks.
Is it possible to do something like
model = keras.Sequential([conv2d,
where conv2d is a layer as created in this video?
Hi,
You could do that. This might help you: https://keras.io/api/layers/convolution_layers/convolution2d/
Thanks.
The architecture of ResNet is not in the slides?
Hi,
You can refer to the ResNet-50 architecture here: http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006
All the best!
According to convolutional_neural_networks.ipynb, there are 2 ways to make predictions, am I right?
1. Chain the layers and take the output of the softmax or dense layer.
2. Use a model: compile it and predict from the model.
Hi,
Could you please tell me which part of the notebook you are referring to in each pointer?
Thanks.
section 3
and section 4
Hi,
Could you please name the sections, because the RNN notebook does not have any section numbers.
Thanks.
It is the CNN notebook I am referring to.
Hi,
Apologies for the typo. I meant the convolutional_neural_networks.ipynb notebook in our repository. If you look at that, you will find that there are no section numbers. So it would help if you could mention the section names.
Thanks.
https://jupyter.e.cloudxlab.com/user/manjarisingh8687/notebooks/ml/deep_learning/convolutional_neural_networks.ipynb
Hi,
Please refer to the below link and let me know the sections you are referring to:
https://github.com/cloudxlab/ml/blob/master/deep_learning/convolutional_neural_networks.ipynb
This is our GitHub repository; if you refer to this notebook you will notice that there are no section numbers. So it would be very helpful if you would please tell me the names of the sections. Also, as mentioned in my previous comments, I would suggest you go through the course materials first. Without going through them, you will not be able to build clear concepts in Machine Learning or Deep Learning.
Thanks.
What does the line
image = china[150:220,130:250]
do?
I did the same:
import numpy
arr = numpy.array([[1,2,3],[3,4,5]])
sarr = arr[0:1,1:2]
print(arr)
print(sarr)
and it gave the output:
[[1 2 3] [3 4 5]]
[[2]]
Hi,
This takes a slice of the china image: it keeps rows 150 to 219 and columns 130 to 249, cropping out that rectangular region.
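To make the slicing concrete, here is a minimal NumPy sketch (a zero-filled dummy array stands in for the actual china image; only the shapes matter here). The end index of a slice is exclusive, so `150:220` keeps 70 rows and `130:250` keeps 120 columns:

```python
import numpy as np

# Dummy stand-in for the china image (height x width)
china = np.zeros((427, 640))

# Rows 150..219 and columns 130..249 (end index is exclusive)
image = china[150:220, 130:250]

print(image.shape)  # (70, 120)
```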
Thanks.
Are there tools/libraries in the market which do all of this, such as OpenCV? This seems like too much manual work.
Hi,
Here we have explained the concepts. Without understanding the concepts, you would not be able to understand which library to use and where.
Thanks.
Are all the types of CNN layers, GoogLeNet, etc. most relevant only for image classification?
Hi,
CNN has applications in image and video recognition, recommender systems, image classification, medical image analysis, natural language processing, and financial time series.
Thanks.
Hi Manjari,
I checked from my end that you have not gone through most of the lecture videos in both Machine Learning and Deep Learning. I would suggest you start from topic #1, and move onwards only after you have gone through each and every assessment and lecture video in that topic. Without going through these topics, it would be next to impossible for you to put together the pieces of this course.
Thanks.
How do we define the filter, e.g.
[ : , 3, : , 0 ] = 1 # vertical line
Now, what are the colons defining and what is the 3 defining? Is it that the width is 3 and the height is all of it? And are the entries at this position set equal to 1? I have a little doubt about this.
Hi,
I would request you to go over the lecture video once again to understand the concept of filters. It has been explained in detail there.
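In the meantime, here is a hedged NumPy sketch of what such an assignment does (assuming the `[height, width, channels, n_filters]` shape convention used in the lecture code; the 7x7x1x2 shape is illustrative). A `:` means "every index along this axis", and the 3 selects a single column, so `filters[:, 3, :, 0] = 1` writes a vertical line of 1s into filter 0:

```python
import numpy as np

# Filter bank: 7x7 filters, 1 channel, 2 filters
# -> shape (height, width, channels, n_filters)
filters = np.zeros((7, 7, 1, 2))

# ':' selects all rows and all channels; 3 selects column 3; 0 selects filter 0
filters[:, 3, :, 0] = 1  # vertical line in filter 0
filters[3, :, :, 1] = 1  # horizontal line in filter 1

print(filters[:, :, 0, 0])  # 7x7 grid with a column of 1s at column 3
```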
Thanks.
Is the stride 4 numbers, e.g. (1,1,2,3), or 1 number, e.g. just 2?
Hi,
Stride is the number of pixels the filter shifts over the input matrix. When the stride is 1, we move the filter 1 pixel at a time. When the stride is 2, we move the filter 2 pixels at a time, and so on. The stride controls how the filter convolves around the input volume.
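The effect of the stride on the output size can be sketched with the standard formula for a 'valid' (no padding) convolution (this is the general formula, not code from the lecture):

```python
# Output size of a 'valid' convolution: out = (in - k) // s + 1
def conv_out_size(in_size, k, s):
    return (in_size - k) // s + 1

print(conv_out_size(28, 3, 1))  # 26 -> stride 1 shifts one pixel at a time
print(conv_out_size(28, 3, 2))  # 13 -> stride 2 roughly halves the resolution
```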
Thanks.
How should one choose stride = 1, 2, or 3? Aren't we missing data from some neurons by just jumping over them?
Hi,
I would request you to go through the lecture to understand strides in detail.
Thanks.
Upvote ShareThis comment has been removed.
This comment has been removed.
This comment has been removed.
This comment has been removed.
Hi,
What do these 2 lines mean in the CNN code?
fmap[3, 3, 0, 2] = 1
plot_image(fmap[:, :, 0, 2])
Hi,
The first line sets that one element of the fmap array to 1, and the second plots the 2-D slice for filter 2 (channel 0) as an image. You can check after every such line by printing the fmap array in a new cell.
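A quick way to verify this yourself, as suggested above (the 7x7x1x3 shape is a hypothetical stand-in; the notebook's fmap array may be sized differently):

```python
import numpy as np

# Hypothetical stand-in for the fmap array: (height, width, channels, n_filters)
fmap = np.zeros((7, 7, 1, 3))

# Set a single element of filter 2 to 1, then view that filter as a 2-D image
fmap[3, 3, 0, 2] = 1
print(fmap[:, :, 0, 2])  # all zeros except a 1 at row 3, column 3
```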
Thanks.
Why is strides a four-dimensional tensor?
Hi,
The inputs are 4-dimensional, of the form [batch_size, image_rows, image_cols, number_of_channels], so one stride value is supplied per input dimension.
You can find more information in this thread:
https://stackoverflow.com/q...
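Concretely (a NumPy sketch of the layout; the `[1, sh, sw, 1]` pattern is the usual choice for `tf.nn.conv2d`-style strides, so that no examples in the batch and no channels are skipped):

```python
import numpy as np

# A batch of 32 RGB images of 28x28: [batch_size, rows, cols, channels]
batch = np.zeros((32, 28, 28, 3))
print(batch.shape)  # (32, 28, 28, 3)

# One stride per input dimension; typically [1, sh, sw, 1]:
# never skip batch examples, step 2 pixels spatially, never skip channels
strides = [1, 2, 2, 1]
```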
Thanks.
-- Rajtilak Bhattacharjee
Hello
Upvote ShareIf CNN is trained on MNIST data with input shape 128*128, then if a new data of size say 130*130 is provided to the trained CNN for prediction then will the trained CNN be able to recognise the changed input shape and do the prediction? I ask this because the trained network connection weights are according to 128*128 shape.
Hi Alok,
You will need to resize the image to 128x128. For this you can use the Pillow library.
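A minimal sketch of that resize with Pillow (the image here is built in memory for illustration rather than loaded from a file; note that `Image.resize` takes a (width, height) tuple):

```python
from PIL import Image
import numpy as np

# Hypothetical 130x130 grayscale input, built in memory for illustration
img = Image.fromarray(np.zeros((130, 130), dtype=np.uint8))

# Resize to the 128x128 shape the network was trained on
resized = img.resize((128, 128))
print(resized.size)  # (128, 128)
```

The resized image (or a NumPy array made from it) can then be fed to the trained network, whose weights expect the 128x128 shape.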
Thanks