Artificial Neural Network


Artificial Neural Networks - Session 01

Slides

INSTRUCTIONS

Latest Instructions for launching Tensorboard

If you are facing challenges opening Tensorboard, please visit the link below:

https://discuss.cloudxlab.com/t/solved-cannot-start-tensorboard-server/5146


No hints are available for this assessment

Answer is not available for this assessment


83 Comments

Hi Team,

I am a little bit confused about the use of activation functions in neurons. Please correct me if I am wrong:

1) Is the sole purpose of activation functions to bring the output of a neuron from decimal values into a discrete range? If yes, what is the benefit of that?

2) Do we require differentiable activation functions just because, while backpropagation is happening, the output should not be 1 (like in the case of the step function)?

Regards,

Birendra Singh 


Hi,

1) Imagine a neural network without any activation function. Then the output is just an aggregation of inputs, weights and biases in a linear fashion: the output of the previous layer is multiplied by the weights of the next layer and added to the biases, which is nothing but a linear function. A composition of linear functions is itself linear, so no matter how many layers you stack, the network still outputs straight lines and cannot capture the complex non-linear boundaries of classification. Now if activation functions are introduced, they impart non-linearity because of the nature of their functions. The activation output is multiplied by the weights of the next layer and added to the bias, and this new output is again passed through an activation function. So the decision function gains non-linearity with the introduction of the activation functions, which helps the algorithm capture complex decision boundaries.
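As a minimal sketch of this (with made-up weights, in NumPy), stacking two layers with no activation collapses into a single linear function, while inserting a ReLU between them does not:

import numpy as np

x = np.linspace(-2, 2, 5)

# Two "layers" with no activation: w2 * (w1 * x + b1) + b2
w1, b1, w2, b2 = 3.0, 1.0, -2.0, 0.5
stacked = w2 * (w1 * x + b1) + b2

# The same thing as one linear layer: slope w2*w1, intercept w2*b1 + b2
single = (w2 * w1) * x + (w2 * b1 + b2)
print(np.allclose(stacked, single))  # True: still just a straight line

# Insert ReLU between the layers: the output is now piecewise, not a line
relu_stacked = w2 * np.maximum(0.0, w1 * x + b1) + b2
print(relu_stacked)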

2) In backpropagation, we calculate the gradient of the error with respect to each weight, which requires differentiating through the activation functions. So it is expected that the activation functions are differentiable.

Thanks.


Thanks Vagdevi for quick reply.

On point 1, I am still not clear. Even if we have a non-linear function, the aggregation of inputs, weights and biases is still going to happen. So why do you call it a linear fashion? And when we introduce activation functions, how do they impart non-linearity into the calculation? Can you explain it the way Sandeep explained it, but including the activation function? I am sharing the slide picture.

If you have any link where I can read a detailed explanation, that would be great.

Regards,

Birendra Singh


Hi, is it necessary that the number of nodes in the hidden layer be greater than the number of features in the input dataset?
One more question: in the MNIST dataset there are 784 column values for each row, so will there be 784 nodes in the input layer?


Hi,

In general, there is no formula that gives the number of nodes to use per layer, and the hidden layer does not have to be larger than the number of input features. You can usually take 4 approaches to choosing it: random search, grid search, heuristics, or exhaustive search. And yes, for MNIST the input layer has one node per pixel, i.e. 784 nodes. Hope this answers your query.

Thanks.


At 55:57 we are finding the new value of w21. Do we also find new values for the other weights at the same time?
It would be a lengthy process to change the values of all the weights at every instance of X.
So my doubt is: can we have an approach that changes the weights in cyclic order, like changing w1 for the 1st instance, w2 for the 2nd instance, and so on?


Hi,

This has been shown as an example of how an ANN works. In practice, backpropagation computes the gradients for all the weights in one backward pass, and all of them are updated together at every step; this is taken care of by the library that you are using.

Thanks.


At 1:10:10, for the explanation of the backpropagation mathematical method, we are not using any activation function for neuron h1.

1. Not using activation on h1:

with the same values the mentor took in the video, h1 = 4, o1 = 5.2639 (result after backpropagation), and the actual y is 5.

2. Not using activation on h1:

I am tweaking w11 = -1 and the rest of the inputs are the same; now h1 = 0 and o1 = 4.96 (result after backpropagation), and the actual y is 5. I realised that there is no improvement in w21.

3. Using ReLU activation on h1:

I am tweaking w12 = -3 and the rest of the inputs are the same; now h1 = -1, which ReLU turns into 0, and o1 = 4.02 (result after backpropagation), and the actual y is 5. I realised that there is no improvement in w21, and the smaller improvement in the result (o1) comes only from w22, which is the bias.

So my question is: after backpropagation, the ReLU activation function is not giving me good results when we feed negative values to it, which makes the neuron inactive. Then what is the use of getting non-linearity, with less accuracy, using ReLU compared to linear?

Yes, when we feed positive values, h1 remains active, it works very well, performance is really good, and the non-linearity helps.


Hi,

What you are seeing is the "dying ReLU" problem: when the input to a ReLU neuron is negative, its output and its gradient are both 0, so the weights feeding that neuron stop getting updated. ReLU is still widely used because it is cheap to compute and helps with the vanishing gradient problem, but if your data drives many neurons into the negative region, you can use a variant such as Leaky ReLU, or a sigmoid/tanh activation instead.

Thanks.


In the backpropagation explanation in the video, exactly what activation function was used?

Actually, I am confused between the y = mx + c relation and the activation function in the backpropagation explanation.


Hi,

If you look at slide 52, it says the network described in the backprop video has one input feature, one output label, the error calculated as the mean squared error, one hidden layer having one neuron, and no activation function, so every neuron computes a plain y = mx + c style linear function. Hope this answers your question.
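For concreteness, here is a minimal sketch of that network in Python, treating w12 and w22 as the bias terms as in the comments above, with made-up weights rather than the exact numbers from the video:

# Forward pass of the slide-52 network: 1 input, 1 hidden neuron, 1 output,
# no activation, mean squared error (made-up weights)
x, y = 2.0, 5.0            # one input feature, one label
w11, w12 = 1.5, 1.0        # hidden neuron: weight and bias
w21, w22 = 1.2, 0.3        # output neuron: weight and bias

h1 = w11 * x + w12         # hidden neuron output (no activation)
o1 = w21 * h1 + w22        # network output
loss = (y - o1) ** 2       # squared error for this one sample
print(h1, o1, loss)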

Thanks.


Thank you



Can you explain what the learning rate (eta) at 42:00 is, and why we are subtracting it from the weight?


Hi,

Eta is the learning rate: it controls how big a step we take when updating a weight. We do not subtract eta itself from the weight; we subtract eta multiplied by the gradient of the error with respect to that weight, so that the weight moves a small step in the direction that decreases the error.
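As a small worked example with made-up numbers (not the exact ones from the video):

eta = 0.01    # learning rate (step size)
w21 = 1.2     # current weight
grad = 8.0    # dE/dw21, the gradient computed during backpropagation
w21 = w21 - eta * grad
print(w21)    # 1.12: the weight takes a small step that reduces the error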

Thanks.


Could you please give me a figure with the neurons sketched for this program, so I can understand it better?


Hi,

You can try using Tensorboard, given below is a link which you would find useful:

https://discuss.cloudxlab.com/t/solved-cannot-start-tensorboard-server/5146

Thanks.


I did not understand.

At 2:48:13:

What are we doing in the def neuron_layer cell of this program?

And with tf.name_scope("dnn"): please tell me what you are doing here as well.

Please answer both questions.


Hi,

Here we are defining the 3 layers of neurons: 2 hidden layers and 1 output layer. The neuron_layer() function builds one fully connected layer (its weights, biases, and activation), and tf.name_scope("dnn") groups the three layer definitions under a single name in the computation graph.
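The cell most likely follows the standard TF 1.x pattern from the Hands-On ML notebook this course is based on. A sketch of that pattern (not necessarily the exact code from the video):

import numpy as np
import tensorflow as tf  # TF 1.x

def neuron_layer(X, n_neurons, name, activation=None):
    # One fully connected layer: create weights and biases, then apply
    # the optional activation to X * W + b.
    with tf.name_scope(name):
        n_inputs = int(X.get_shape()[1])
        stddev = 2 / np.sqrt(n_inputs)  # scaled random initialization
        init = tf.truncated_normal((n_inputs, n_neurons), stddev=stddev)
        W = tf.Variable(init, name="kernel")
        b = tf.Variable(tf.zeros([n_neurons]), name="bias")
        Z = tf.matmul(X, W) + b
        return activation(Z) if activation is not None else Z

# tf.name_scope("dnn") groups the three layers under one node so the
# graph stays readable in TensorBoard:
X = tf.placeholder(tf.float32, shape=(None, 784), name="X")
with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, 300, name="hidden1", activation=tf.nn.relu)
    hidden2 = neuron_layer(hidden1, 100, name="hidden2", activation=tf.nn.relu)
    logits = neuron_layer(hidden2, 10, name="outputs")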

Thanks.


I did not understand. At 2:14:13:

a = y_test:=y_pred['classes']

i = 0

for j in a:

    if j == 0:

        print(i)

    i += 1

Please tell me how this is working.

Why have we taken a for loop?

Please do not tell me why you are doing this, because I know you are comparing the actual values with the predictions.


Hi,

It is y_test != y_pred['classes']. This means we are creating an array 'a' which will contain True if  the predicted value does not match the actual value and False otherwise. In the loop we are iterating through all the elements in this array, and if it's False (j == 0), then we are printing the index of that element, and we are counting the total number of cases where our model predicted correctly.
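For reference, the same counting can be done without an explicit loop, using NumPy (a sketch with small made-up arrays standing in for y_test and y_pred['classes']):

import numpy as np

y_test = np.array([7, 2, 1, 0, 4])          # hypothetical true labels
y_pred_classes = np.array([7, 2, 5, 0, 4])  # hypothetical predictions

a = y_test != y_pred_classes       # boolean array, True where wrong
print(np.flatnonzero(~a))          # indices of correct predictions
print((~a).sum(), "correct out of", len(y_test))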

Thanks.


What is "exclusive"?


Hi,

Could you please point out where in the video you found "exclusive", so that I can explain its meaning?

Thanks.


o1 = 5.2639, but this value is more than 5.

Why did you not decrease w22 and w12?

Why did you only decrease w21?


Where did you get the value eta = 0.01 while calculating w21?


Hi,

Did you go through the backpropagation video that I had shared with you? Here's the link once again:

https://cloudxlab.com/blog/backpropagation-from-scratch/

This will clear all your doubts.

Thanks.


My code execution snapshot


Hello,

I have a question from the presentation video:

1) What does this code mean? feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)

2) On running the code below, I do not see the INFO: messages for the various steps (as shown in the screenshot).

import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config

feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)

3) Also, while executing the code for accuracy, I do not get the messages about the "temp" folder getting created, along with the other messages.

I am sending the snapshot of my execution output following this message.

Thanks


Hi,

The tf.contrib.learn.infer_real_valued_columns_from_input() function creates FeatureColumn objects for the inputs it is given. However, this function has been deprecated, so please specify the feature columns explicitly.
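For example, for a flat 784-feature input like MNIST, an explicit definition could look like this (a sketch, assuming TF 1.x; the exact helper depends on your TensorFlow version):

import tensorflow as tf  # TF 1.x

# One dense real-valued input column of dimension 784 (one value per pixel),
# replacing the deprecated infer_real_valued_columns_from_input() call
feature_cols = [tf.contrib.layers.real_valued_column("", dimension=784)]

dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols)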

Thanks.


Hi,

Is there an update to the code and document for the above? Please share the updated version for reference.

Thanks


Hi,

There is no updated code for this; as mentioned in my previous comment, please specify the feature columns explicitly.

Thanks.


HI Rajtilak,

I am really stuck and confused here after exploring this for hours. I need your help, please.

As we loaded the files using the commands below from the location ml/deep_learning/data/mnist:

                                       from tensorflow.examples.tutorials.mnist import input_data

                                       mnist = input_data.read_data_sets("data/")

they were of this format:

train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz
t10k-images-idx3-ubyte.gz
t10k-labels-idx1-ubyte.gz

________________________________________________

  • I really do not know how to see the feature columns from the data. Can you please let me know?
  • Also, I am still unable to figure out what needs to be changed in the code. Please make the changes and help me understand.


import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config

feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)

_________________________________________________________________________________

Also,

The other parts of my query are still unanswered: why I am unable to see the various step results, and why the temp folder is not created during execution of the code for accuracy.

Thanks


Hi Rajtilak,

I am awaiting your response on the query sent yesterday as I am unable to figure out the replacement code. Please help!

Thanks


Hi,

Please refer to the Project-Fashion MNIST playlist in the Machine Learning part of the course. You will find all the details there on how you can load the dataset from .gz files, and how to identify the feature columns.

Thanks.

  Upvote    Share

Hi Rajtilak,

I have referred to the code from the Fashion MNIST project. Code snippet below (till the feature scaling part):

filePath_train_set =  '/cxldata/datasets/project/fashion-mnist/train-images-idx3-ubyte.gz'
filePath_train_label =  '/cxldata/datasets/project/fashion-mnist/train-labels-idx1-ubyte.gz'
filePath_test_set = '/cxldata/datasets/project/fashion-mnist/t10k-images-idx3-ubyte.gz'
filePath_test_label = '/cxldata/datasets/project/fashion-mnist/t10k-labels-idx1-ubyte.gz'


with gzip.open(filePath_train_label, 'rb') as trainLbpath:
     trainLabel = np.frombuffer(trainLbpath.read(), dtype=np.uint8,
                               offset=8)
with gzip.open(filePath_train_set, 'rb') as trainSetpath:
     trainSet = np.frombuffer(trainSetpath.read(), dtype=np.uint8,
                               offset=16).reshape(len(trainLabel), 784)

with gzip.open(filePath_test_label, 'rb') as testLbpath:
     testLabel = np.frombuffer(testLbpath.read(), dtype=np.uint8,
                               offset=8)

with gzip.open(filePath_test_set, 'rb') as testSetpath:
     testSet = np.frombuffer(testSetpath.read(), dtype=np.uint8,
                               offset=16).reshape(len(testLabel), 784)

X_train, X_test, y_train, y_test = trainSet, testSet, trainLabel, testLabel

# Create a random seed=42
np.random.seed(42)

# Create shuffle indices of size 60000 and store it in a variable 'shuffle_index'

shuffle_index = np.random.permutation(60000)

# Shuffle the indices of X_train and y_train datasets by using 'shuffle_index' variable created above.

X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]

# Feature scaling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.astype(np.float64))

 

So, looking at this, my question is: will the code in question now read as below?

feature_cols = X_train_scaled

import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config

feature_cols = X_train_scaled

dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)


Hi,

Instead of me answering that for you, why don't you go ahead and try that and let me know if it worked.

Thanks.


Hi,

By the way, good job fitting the existing code in this one.

Thanks.


Hi,

I did that and am getting an error that I am unable to comprehend. Please look into it and advise where I am going wrong.

Thanks


Hi,

Please find below the link to my personal calendar:

http://rajtilak.youcanbook.me/

Please schedule a meeting with me sometime tomorrow; you will receive a Hangouts link with the meeting invite. We can discuss this over the call.

Thanks.


Hi Rajtilak,

I have booked the slot for tomorrow morning 11:30 AM. Looking forward to this meeting and positive discussions.

Regards


Hi,

Yes, I got the notification.

Thanks.


Hi Rajtilak,

It was great talking to you today. As per our conversation, I did try to upgrade TensorFlow on Jupyter from 1 to 2, but even after trying many times I could not do it. Please see the error messages. Do let me know if you are able to do it at your end.

Thanks

________________________________________________

Code snippet:

1)

!tf_upgrade_v2 --infile introduction_to_artificial_neural_networks_sv_copy.ipynb --outfile TF2_introduction_to_artificial_neural_networks_sv_copy.ipynb

ERROR MESSAGE: 

/usr/bin/sh: tf_upgrade_v2: command not found

2)

!tf_upgrade_v2 \  
--infile  introduction_to_artificial_neural_networks_sv_copy.ipynb \ 
--outfile  TF2_introduction_to_artificial_neural_networks_sv_copy.ipynb \
--reportfile tf2report.txt

ERROR MESSAGE:

File "<ipython-input-8-8dcf238751ad>", line 2
    --infile  introduction_to_artificial_neural_networks_sv_copy.ipynb \
                                                               ^
SyntaxError: invalid syntax

 



Hi,

As of now, I would suggest you continue using the old function that you were using.

Thanks.


Hello sir,

Is there any good book for deep learning? Please suggest some books.


Hi,

Here is a list of ML/DL books, some of these are available for free:

https://cloudxlab.com/blog/gigantic-list-of-machine-learning-books/

Thanks.


Thank you sir!!

Hi,
I have installed version 1.14 locally, but it shows this error with this code:
import tensorflow as tf
# from tensorflow.python.feature_column import dense_features
config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config

feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)

TypeError: Value passed to parameter 'labels' has DataType uint8 not in list of allowed values: int32, int64


Hi,

I have already replied to your mail, please check.

Thanks.



I have a question: how can we decide the ideal values for parameters like steps, hidden_units, etc.?

Do we have to do hyperparameter optimisation like we do in machine learning?


Hi Anubhav,

I would suggest you follow the lecture for an answer to your query. There are no fixed rules for this, and no one-size-fits-all model when it comes to ML/DL; in practice, these values are tuned with hyperparameter search, just as in classical machine learning.

Thanks.



Also, what does log loss measure?

 


Hi,

Logarithmic loss measures the performance of a classification model where the prediction input is a probability value between 0 and 1. The goal of our machine learning models is to minimize this value. A perfect model would have a log loss of 0. Log loss increases as the predicted probability diverges from the actual label. So predicting a probability of .012 when the actual observation label is 1 would be bad and result in a high log loss.
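A quick worked example of binary log loss, using the numbers above:

import numpy as np

def log_loss(y_true, p):
    # Binary cross-entropy for a single prediction
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print(log_loss(1, 0.9))    # ~0.105: confident and correct, low loss
print(log_loss(1, 0.012))  # ~4.42: confident and wrong, high loss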

Thanks.


1. When running the ANN on the MNIST dataset we are calculating the loss for the final step, so what is meant by "loss for final step"? Please explain.
2. Also, in the classification part it was said that if accuracy is above 95% then there is overfitting. Here the accuracy is 98%, so is there no overfitting?


Hi,

1. Loss is nothing but the prediction error of the neural net, and the method used to calculate the loss is called the loss function. The "loss for final step" is simply the loss value computed at the last training step.

2. Overfitting refers to a model that models the training data too well: it learns the detail and noise in the training data to the extent that this negatively impacts the performance of the model on new data. So a model having 95% accuracy does not by itself mean it is overfitting. Also, neural nets generally reach higher accuracy percentages.

Thanks.


Okay, so does that mean that, apart from deep learning, if we use other methods like SVM, decision trees, etc., then above 95% accuracy there is overfitting?

Also, how will we know that there is overfitting in a neural network?


Hi,

Not really. To check for overfitting, the best way is to measure the error on both a training set and a test set. If you see low error on the training set and high error on the test and validation sets, then you have likely overfitted the model.
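A minimal sketch of that check with scikit-learn (not the course notebook; any fitted classifier works the same way):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
# A large gap between these two scores (e.g. ~1.00 vs ~0.85) is the usual
# sign of overfitting; similar scores suggest the model generalizes.
print("train accuracy:", clf.score(X_train, y_train))
print("test accuracy:", clf.score(X_test, y_test))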

Thanks.


I am getting this error while running the code.
Please clarify.


Hi,

Could you please confirm from which notebook are you trying to run this code?

Thanks.

-- Rajtilak Bhattacharjee


Hello,
I am running it on CloudxLab, using the Jupyter notebook only.


Hi,

I understand you are trying to run a Jupyter notebook provided by us, but could you please point out which notebook you are trying to run, because the code looks somewhat different.

Thanks.

-- Rajtilak Bhattacharjee


Hello,
I have copied the code from the notebook named "introduction to artificial neural network" to a new notebook and run it. I have executed this code in your notebook as well, and it is giving the same error.

Thanks and Regards.


Hi,

Please note that this is not an error but a warning, and the output you are getting is the intended output.

Thanks.

-- Rajtilak Bhattacharjee


Hi
How should we decide the hidden units, steps and batch_size for training a DNNClassifier?


Hi,

There are no hard and fast rules for these; however, here are a few rules of thumb you can follow:

The number of hidden neurons should be between the size of the input layer and the size of the output layer.
The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
The number of hidden neurons should be less than twice the size of the input layer.
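Applied to MNIST (784 inputs, 10 outputs) as a quick illustration of these rules:

n_in, n_out = 784, 10
print(2 / 3 * n_in + n_out)  # second rule: ~533 hidden neurons
print(2 * n_in)              # third rule: stay below 1568
# First rule: pick something between 10 and 784, e.g. the 200 used earlier.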

You can go through the link below regarding the batch size:
https://stats.stackexchange...
Thanks.

-- Rajtilak Bhattacharjee



Why is the first layer passed as an argument when calling the function that creates the second layer?


Hi,

This is done so that the data can flow from the first layer to the second layer: the output tensor of the first layer becomes the input of the second.
Thanks.

-- Rajtilak Bhattacharjee


How is the softmax function helpful in the output layer of an MLP when the classes are exclusive?


Hi,

The softmax function, used in softmax regression (also called the Maximum Entropy, or MaxEnt, classifier), is an activation function that turns numbers, aka logits, into probabilities that sum to one. It outputs a vector that represents the probability distribution over a list of potential outcomes. When the classes are exclusive, the output layer is typically modified by replacing the individual activation functions with a shared softmax function, so the network predicts exactly one class.
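A small numeric sketch of what softmax does to a vector of logits:

import numpy as np

def softmax(logits):
    z = logits - np.max(logits)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p)        # approx. [0.659, 0.242, 0.099]
print(p.sum())  # 1.0: one probability per mutually exclusive class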

Thanks.

-- Rajtilak Bhattacharjee


How can we do the MNIST example given for the ANN in TensorFlow 2.1?


Hi, Shivom.

You can install TensorFlow in your environment and it will install the current version automatically.

There are slight syntax changes in TensorFlow 2; if you get an error in the TensorFlow code, just search on Google and you will be able to find the correct format. But the concepts, like placeholders, variables, constants and sessions, and creating and executing computational graphs, are all the same.

All the best!


I have already done this. The MNIST available at the https://www.tensorflow.org/... link gives a dataset of a different size from the one used in the example.



Replied.
--
Best,
Satyajit Das


I did not get what is required.
