Artificial Neural Network


Artificial Neural Networks - Session 01

Slides

INSTRUCTIONS

Latest Instructions for launching Tensorboard

If you are facing challenges opening Tensorboard, please visit the link below:

https://discuss.cloudxlab.com/t/solved-cannot-start-tensorboard-server/5146


No hints are available for this assessment

Answer is not available for this assessment


83 Comments

Hi Team,

I am a little bit confused about the use of activation functions in neurons. Please correct me if I am wrong:

1) Is the sole purpose of activation functions to bring the output of a neuron from decimal values into a discrete range? If yes, what is the benefit of that?

2) Do we require differentiable activation functions just because, while backpropagation is happening, the output should not be 1 (like in the case of the step function)?

Regards,

Birendra Singh 


Hi,

1) Imagine a neural network without any activation function. Then the output is just an aggregation of inputs, weights and biases in a linear fashion: the output of the previous layer is multiplied by the weights of the next layer and added to the biases, which is nothing but a linear function. A composition of linear functions is itself linear, so no matter how many layers you stack, the network still outputs straight lines and cannot capture the complex non-linear boundaries of classification. Now if activation functions are introduced, they impart non-linearity because of the nature of their functions. The activation output is multiplied by the weights of the next layer and added to the bias, and this new output is again passed through an activation function. So the decision function gains non-linearity with the introduction of the activation functions, which helps the algorithm capture complex decision boundaries.
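As a minimal sketch of this (with made-up weights, in NumPy), stacking two layers with no activation collapses into a single linear function, while inserting a ReLU between them does not:

import numpy as np

x = np.linspace(-2, 2, 5)

# Two "layers" with no activation: w2 * (w1 * x + b1) + b2
w1, b1, w2, b2 = 3.0, 1.0, -2.0, 0.5
stacked = w2 * (w1 * x + b1) + b2

# The same thing as one linear layer: slope w2*w1, intercept w2*b1 + b2
single = (w2 * w1) * x + (w2 * b1 + b2)
print(np.allclose(stacked, single))  # True: still just a straight line

# Insert ReLU between the layers: the output is now piecewise, not a line
relu_stacked = w2 * np.maximum(0.0, w1 * x + b1) + b2
print(relu_stacked)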

2) In backpropagation, we calculate the gradient of the error with respect to each weight, which requires differentiating through the activation functions. So it is expected that the activation functions are differentiable.

Thanks.


Thanks Vagdevi for quick reply.

On point 1, I am still not clear. Even if we have a non-linear function, the aggregation of inputs, weights and biases is still going to happen. So why do you call it a linear fashion? And when we introduce activation functions, how do they impart non-linearity into the calculation? Can you explain it the way Sandeep explained it, but including the activation function? I am sharing the slide picture.

If you have any link where I can read a detailed explanation, that would be great.

Regards,

Birendra Singh


Hi, is it necessary that the number of nodes in the hidden layer be greater than the number of features in the input dataset?
One more question: in the MNIST dataset there are 784 column values for each row, so will there be 784 nodes in the input layer?


Hi,

In general, there is no formula that gives the number of nodes to use per layer, and the hidden layer does not have to be larger than the number of input features. You can usually take 4 approaches to choosing it: random search, grid search, heuristics, or exhaustive search. And yes, for MNIST the input layer has one node per pixel, i.e. 784 nodes. Hope this answers your query.

Thanks.


At 55:57 we are finding the new value of w21. Do we also find new values for the other weights at the same time?
It would be a lengthy process to change the values of all the weights at every instance of X.
So my doubt is: can we have an approach that changes the weights in cyclic order, like changing w1 for the 1st instance, w2 for the 2nd instance, and so on?


Hi,

This has been shown as an example of how an ANN works. In practice, backpropagation computes the gradients for all the weights in one backward pass, and all of them are updated together at every step; this is taken care of by the library that you are using.

Thanks.


At 1:10:10, for the explanation of the backpropagation mathematical method, we are not using any activation function for neuron h1.

1. Not using activation on h1:

with the same values the mentor took in the video, h1 = 4, o1 = 5.2639 (result after backpropagation), and the actual y is 5.

2. Not using activation on h1:

I am tweaking w11 = -1 and the rest of the inputs are the same; now h1 = 0 and o1 = 4.96 (result after backpropagation), and the actual y is 5. I realised that there is no improvement in w21.

3. Using ReLU activation on h1:

I am tweaking w12 = -3 and the rest of the inputs are the same; now h1 = -1, which ReLU turns into 0, and o1 = 4.02 (result after backpropagation), and the actual y is 5. I realised that there is no improvement in w21, and the smaller improvement in the result (o1) comes only from w22, which is the bias.

So my question is: after backpropagation, the ReLU activation function is not giving me good results when we feed negative values to it, which makes the neuron inactive. Then what is the use of getting non-linearity, with less accuracy, using ReLU compared to linear?

Yes, when we feed positive values, h1 remains active, it works very well, performance is really good, and the non-linearity helps.


Hi,

What you are seeing is the "dying ReLU" problem: when the input to a ReLU neuron is negative, its output and its gradient are both 0, so the weights feeding that neuron stop getting updated. ReLU is still widely used because it is cheap to compute and helps with the vanishing gradient problem, but if your data drives many neurons into the negative region, you can use a variant such as Leaky ReLU, or a sigmoid/tanh activation instead.

Thanks.


In the backpropagation explanation in the video, exactly what activation function was used?

Actually, I am confused between the y = mx + c relation and the activation function in the backpropagation explanation.


Hi,

If you look at slide 52, it says the network described in the backprop video has one input feature, one output label, the error calculated as the mean squared error, one hidden layer having one neuron, and no activation function, so every neuron computes a plain y = mx + c style linear function. Hope this answers your question.
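For concreteness, here is a minimal sketch of that network in Python, treating w12 and w22 as the bias terms as in the comments above, with made-up weights rather than the exact numbers from the video:

# Forward pass of the slide-52 network: 1 input, 1 hidden neuron, 1 output,
# no activation, mean squared error (made-up weights)
x, y = 2.0, 5.0            # one input feature, one label
w11, w12 = 1.5, 1.0        # hidden neuron: weight and bias
w21, w22 = 1.2, 0.3        # output neuron: weight and bias

h1 = w11 * x + w12         # hidden neuron output (no activation)
o1 = w21 * h1 + w22        # network output
loss = (y - o1) ** 2       # squared error for this one sample
print(h1, o1, loss)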

Thanks.


Thank you



Can you explain what the learning rate (eta) at 42:00 is, and why we are subtracting it from the weight?


Hi,

Eta is the learning rate: it controls how big a step we take when updating a weight. We do not subtract eta itself from the weight; we subtract eta multiplied by the gradient of the error with respect to that weight, so that the weight moves a small step in the direction that decreases the error.
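As a small worked example with made-up numbers (not the exact ones from the video):

eta = 0.01    # learning rate (step size)
w21 = 1.2     # current weight
grad = 8.0    # dE/dw21, the gradient computed during backpropagation
w21 = w21 - eta * grad
print(w21)    # 1.12: the weight takes a small step that reduces the error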

Thanks.


Could you please give me a figure with the neurons sketched for this program, so I can understand it better?


Hi,

You can try using Tensorboard, given below is a link which you would find useful:

https://discuss.cloudxlab.com/t/solved-cannot-start-tensorboard-server/5146

Thanks.


I did not understand.

At 2:48:13:

What are we doing in the def neuron_layer cell of this program?

And with tf.name_scope("dnn"): please tell me what you are doing here as well.

Please answer both questions.


Hi,

Here we are defining the 3 layers of neurons: 2 hidden layers and 1 output layer. The neuron_layer() function builds one fully connected layer (its weights, biases, and activation), and tf.name_scope("dnn") groups the three layer definitions under a single name in the computation graph.
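The cell most likely follows the standard TF 1.x pattern from the Hands-On ML notebook this course is based on. A sketch of that pattern (not necessarily the exact code from the video):

import numpy as np
import tensorflow as tf  # TF 1.x

def neuron_layer(X, n_neurons, name, activation=None):
    # One fully connected layer: create weights and biases, then apply
    # the optional activation to X * W + b.
    with tf.name_scope(name):
        n_inputs = int(X.get_shape()[1])
        stddev = 2 / np.sqrt(n_inputs)  # scaled random initialization
        init = tf.truncated_normal((n_inputs, n_neurons), stddev=stddev)
        W = tf.Variable(init, name="kernel")
        b = tf.Variable(tf.zeros([n_neurons]), name="bias")
        Z = tf.matmul(X, W) + b
        return activation(Z) if activation is not None else Z

# tf.name_scope("dnn") groups the three layers under one node so the
# graph stays readable in TensorBoard:
X = tf.placeholder(tf.float32, shape=(None, 784), name="X")
with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, 300, name="hidden1", activation=tf.nn.relu)
    hidden2 = neuron_layer(hidden1, 100, name="hidden2", activation=tf.nn.relu)
    logits = neuron_layer(hidden2, 10, name="outputs")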

Thanks.


I did not understand. At 2:14:13:

a = y_test:=y_pred['classes']

i = 0

for j in a:

    if j == 0:

        print(i)

    i += 1

Please tell me how this is working.

Why have we taken a for loop?

Please do not tell me why you are doing this, because I know you are comparing the actual values with the predictions.


Hi,

It is y_test != y_pred['classes']. This means we are creating an array 'a' which will contain True if  the predicted value does not match the actual value and False otherwise. In the loop we are iterating through all the elements in this array, and if it's False (j == 0), then we are printing the index of that element, and we are counting the total number of cases where our model predicted correctly.
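For reference, the same counting can be done without an explicit loop, using NumPy (a sketch with small made-up arrays standing in for y_test and y_pred['classes']):

import numpy as np

y_test = np.array([7, 2, 1, 0, 4])          # hypothetical true labels
y_pred_classes = np.array([7, 2, 5, 0, 4])  # hypothetical predictions

a = y_test != y_pred_classes       # boolean array, True where wrong
print(np.flatnonzero(~a))          # indices of correct predictions
print((~a).sum(), "correct out of", len(y_test))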

Thanks.


What is "exclusive"?


Hi,

Could you please point out where in the video you found "exclusive", so that I can explain its meaning?

Thanks.


o1 = 5.2639, but this value is more than 5.

Why did you not decrease w22 and w12?

Why did you only decrease w21?


Where did you get the value eta = 0.01 while calculating w21?


Hi,

Did you go through the backpropagation video that I had shared with you? Here's the link once again:

https://cloudxlab.com/blog/backpropagation-from-scratch/

This will clear all your doubts.

Thanks.


My code execution snapshot


Hello,

I have a question from the presentation video:

1) What does this code mean? feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)

2) On running the code below, I do not see the INFO: messages for the various steps (as shown in the screenshot).

import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config

feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)

3) Also, while executing the code for accuracy, I do not get the messages about the "temp" folder getting created, along with the other messages.

I am sending the snapshot of my execution output following this message.

Thanks


Hi,

The tf.contrib.learn.infer_real_valued_columns_from_input() function creates FeatureColumn objects for the inputs it is given. However, this function has been deprecated, so please specify the feature columns explicitly.
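For example, for a flat 784-feature input like MNIST, an explicit definition could look like this (a sketch, assuming TF 1.x; the exact helper depends on your TensorFlow version):

import tensorflow as tf  # TF 1.x

# One dense real-valued input column of dimension 784 (one value per pixel),
# replacing the deprecated infer_real_valued_columns_from_input() call
feature_cols = [tf.contrib.layers.real_valued_column("", dimension=784)]

dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols)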

Thanks.


Hi,

Is there an update to the code and document for the above? Please share the updated version for reference.

Thanks


Hi,

There is no updated code for this; as mentioned in my previous comment, please specify the feature columns explicitly.

Thanks.


HI Rajtilak,

I am really stuck and confused here after exploring this for hours. I need your help, please.

As we loaded the files using the commands below from the location ml/deep_learning/data/mnist:

                                       from tensorflow.examples.tutorials.mnist import input_data

                                       mnist = input_data.read_data_sets("data/")

they were of this format:

train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz
t10k-images-idx3-ubyte.gz
t10k-labels-idx1-ubyte.gz

________________________________________________

  • I really do not know how to see the feature columns from the data. Can you please let me know?
  • Also, I am still unable to figure out what needs to be changed in the code. Please make the changes and help me understand.


import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config

feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)

_________________________________________________________________________________

Also,

The other parts of my query are still unanswered: why I am unable to see the various step results, and why the temp folder is not created during execution of the code for accuracy.

Thanks


Hi Rajtilak,

I am awaiting your response on the query sent yesterday as I am unable to figure out the replacement code. Please help!

Thanks


Hi,

Please refer to the Project-Fashion MNIST playlist in the Machine Learning part of the course. You will find all the details there on how you can load the dataset from .gz files, and how to identify the feature columns.

Thanks.

  Upvote    Share

Hi Rajtilak,

I have referred to the code from the Fashion MNIST project. Code snippet below (till the feature scaling part):

filePath_train_set =  '/cxldata/datasets/project/fashion-mnist/train-images-idx3-ubyte.gz'
filePath_train_label =  '/cxldata/datasets/project/fashion-mnist/train-labels-idx1-ubyte.gz'
filePath_test_set = '/cxldata/datasets/project/fashion-mnist/t10k-images-idx3-ubyte.gz'
filePath_test_label = '/cxldata/datasets/project/fashion-mnist/t10k-labels-idx1-ubyte.gz'


with gzip.open(filePath_train_label, 'rb') as trainLbpath:
     trainLabel = np.frombuffer(trainLbpath.read(), dtype=np.uint8,
                               offset=8)
with gzip.open(filePath_train_set, 'rb') as trainSetpath:
     trainSet = np.frombuffer(trainSetpath.read(), dtype=np.uint8,
                               offset=16).reshape(len(trainLabel), 784)

with gzip.open(filePath_test_label, 'rb') as testLbpath:
     testLabel = np.frombuffer(testLbpath.read(), dtype=np.uint8,
                               offset=8)

with gzip.open(filePath_test_set, 'rb') as testSetpath:
     testSet = np.frombuffer(testSetpath.read(), dtype=np.uint8,
                               offset=16).reshape(len(testLabel), 784)

X_train, X_test, y_train, y_test = trainSet, testSet, trainLabel, testLabel

# Create a random seed=42
np.random.seed(42)

# Create shuffle indices of size 60000 and store it in a variable 'shuffle_index'

shuffle_index = np.random.permutation(60000)

# Shuffle the indices of X_train and y_train datasets by using 'shuffle_index' variable created above.

X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]

# Feature scaling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.astype(np.float64))

 

So, looking at this, my question is: will the code in question now read as below?

feature_cols = X_train_scaled

import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config

feature_cols = X_train_scaled

dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)


Hi,

Instead of me answering that for you, why don't you go ahead and try that and let me know if it worked.

Thanks.


Hi,

By the way, good job fitting the existing code in this one.

Thanks.


Hi,

I did that and am getting an error that I am unable to comprehend. Please look into it and advise where I am going wrong.

Thanks


Hi,

Please find below the link to my personal calendar:

http://rajtilak.youcanbook.me/

Please schedule a meeting with me sometime tomorrow; you will receive a Hangouts link with the meeting invite. We can discuss this over the call.

Thanks.


Hi Rajtilak,

I have booked the slot for tomorrow morning 11:30 AM. Looking forward to this meeting and positive discussions.

Regards


Hi,

Yes, I got the notification.

Thanks.


Hi Rajtilak,

It was great talking to you today. As per our conversation, I did try to upgrade TensorFlow on Jupyter from 1 to 2, but even after trying many times I could not do it. Please see the error messages. Do let me know if you are able to do it at your end.

Thanks

________________________________________________

Code snippet:

1)

!tf_upgrade_v2 --infile introduction_to_artificial_neural_networks_sv_copy.ipynb --outfile TF2_introduction_to_artificial_neural_networks_sv_copy.ipynb

ERROR MESSAGE: 

/usr/bin/sh: tf_upgrade_v2: command not found

2)

!tf_upgrade_v2 \  
--infile  introduction_to_artificial_neural_networks_sv_copy.ipynb \ 
--outfile  TF2_introduction_to_artificial_neural_networks_sv_copy.ipynb \
--reportfile tf2report.txt

ERROR MESSAGE:

File "<ipython-input-8-8dcf238751ad>", line 2
    --infile  introduction_to_artificial_neural_networks_sv_copy.ipynb \
                                                               ^
SyntaxError: invalid syntax

 



Hi,

As of now, I would suggest you continue using the old function that you were using.

Thanks.


Hello sir,

Is there any good book for deep learning? Please suggest some books.


Hi,

Here is a list of ML/DL books, some of these are available for free:

https://cloudxlab.com/blog/gigantic-list-of-machine-learning-books/

Thanks.


Thank you sir!!

Hi,
I have installed version 1.14 locally, but it shows this error with this code:
import tensorflow as tf
# from tensorflow.python.feature_column import dense_features
config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config

feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)

TypeError: Value passed to parameter 'labels' has DataType uint8 not in list of allowed values: int32, int64


Hi,

I have already replied to your mail, please check.

Thanks.



I have a question: how can we decide the ideal values for parameters like steps, hidden_units, etc.?

Do we have to do hyperparameter optimisation like we do in machine learning?


Hi Anubhav,

I would suggest you follow the lecture for an answer to your query. There are no fixed rules for this, and no one-size-fits-all model when it comes to ML/DL; in practice, these values are tuned with hyperparameter search, just as in classical machine learning.

Thanks.



Also, what does log loss measure?

 


Hi,

Logarithmic loss measures the performance of a classification model where the prediction input is a probability value between 0 and 1. The goal of our machine learning models is to minimize this value. A perfect model would have a log loss of 0. Log loss increases as the predicted probability diverges from the actual label. So predicting a probability of .012 when the actual observation label is 1 would be bad and result in a high log loss.
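A quick worked example of binary log loss, using the numbers above:

import numpy as np

def log_loss(y_true, p):
    # Binary cross-entropy for a single prediction
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print(log_loss(1, 0.9))    # ~0.105: confident and correct, low loss
print(log_loss(1, 0.012))  # ~4.42: confident and wrong, high loss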

Thanks.


1. When running the ANN on the MNIST dataset we are calculating the loss for the final step, so what is meant by "loss for final step"? Please explain.
2. Also, in the classification part it was said that if accuracy is above 95% then there is overfitting. Here the accuracy is 98%, so is there no overfitting?


Hi,

1. Loss is nothing but the prediction error of the neural net, and the method used to calculate the loss is called the loss function. The "loss for final step" is simply the loss value computed at the last training step.

2. Overfitting refers to a model that models the training data too well: it learns the detail and noise in the training data to the extent that this negatively impacts the performance of the model on new data. So a model having 95% accuracy does not by itself mean it is overfitting. Also, neural nets generally reach higher accuracy percentages.

Thanks.


Okay, so does that mean that, apart from deep learning, if we use other methods like SVM, decision trees, etc., then above 95% accuracy there is overfitting?

Also, how will we know that there is overfitting in a neural network?


Hi,

Not really. To check for overfitting, the best way is to measure the error on both a training set and a test set. If you see low error on the training set and high error on the test and validation sets, then you have likely overfitted the model.
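A minimal sketch of that check with scikit-learn (not the course notebook; any fitted classifier works the same way):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
# A large gap between these two scores (e.g. ~1.00 vs ~0.85) is the usual
# sign of overfitting; similar scores suggest the model generalizes.
print("train accuracy:", clf.score(X_train, y_train))
print("test accuracy:", clf.score(X_test, y_test))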

Thanks.


I am getting this error while running the code.
Please clarify.


Hi,

Could you please confirm from which notebook are you trying to run this code?

Thanks.

-- Rajtilak Bhattacharjee


Hello,
I am running it on CloudxLab, using the Jupyter notebook only.


Hi,

I understand you are trying to run a Jupyter notebook provided by us, but could you please point out which notebook you are trying to run, because the code looks somewhat different.

Thanks.

-- Rajtilak Bhattacharjee


Hello,
I have copied the code from the notebook named "introduction to artificial neural network" to a new notebook and run it. I have executed this code in your notebook as well, and it is giving the same error.

Thanks and Regards.


Hi,

Please note that this is not an error but a warning, and the output you are getting is the intended output.

Thanks.

-- Rajtilak Bhattacharjee


Hi
How should we decide the hidden units, steps and batch_size for training a DNNClassifier?


Hi,

There are no hard and fast rules for these; however, here are a few rules of thumb you can follow:

The number of hidden neurons should be between the size of the input layer and the size of the output layer.
The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
The number of hidden neurons should be less than twice the size of the input layer.
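Applied to MNIST (784 inputs, 10 outputs) as a quick illustration of these rules:

n_in, n_out = 784, 10
print(2 / 3 * n_in + n_out)  # second rule: ~533 hidden neurons
print(2 * n_in)              # third rule: stay below 1568
# First rule: pick something between 10 and 784, e.g. the 200 used earlier.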

You can go through the link below regarding the batch size:
https://stats.stackexchange...
Thanks.

-- Rajtilak Bhattacharjee



Why is the first layer passed as an argument when calling the function that creates the second layer?


Hi,

This is done so that the data can flow from the first layer to the second layer: the output tensor of the first layer becomes the input of the second.
Thanks.

-- Rajtilak Bhattacharjee


How is the softmax function helpful in the output layer of an MLP when the classes are exclusive?


Hi,

The softmax function, used in softmax regression (also called the Maximum Entropy, or MaxEnt, classifier), is an activation function that turns numbers, aka logits, into probabilities that sum to one. It outputs a vector that represents the probability distribution over a list of potential outcomes. When the classes are exclusive, the output layer is typically modified by replacing the individual activation functions with a shared softmax function, so the network predicts exactly one class.
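A small numeric sketch of what softmax does to a vector of logits:

import numpy as np

def softmax(logits):
    z = logits - np.max(logits)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p)        # approx. [0.659, 0.242, 0.099]
print(p.sum())  # 1.0: one probability per mutually exclusive class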

Thanks.

-- Rajtilak Bhattacharjee


How can we do the MNIST example given for the ANN in TensorFlow 2.1?


Hi, Shivom.

You can install TensorFlow in your environment and it will install the current version automatically.

There are slight syntax changes in TensorFlow 2; if you get an error in the TensorFlow code, just search on Google and you will be able to find the correct format. But the concepts, like placeholders, variables, constants and sessions, and creating and executing computational graphs, are all the same.

All the best!


I have already done this. The MNIST available at the https://www.tensorflow.org/... link gives a dataset of a different size from the one used in the example.



Replied.
--
Best,
Satyajit Das


I did not get what is required.
