Latest Instructions for Launching TensorBoard
If you are facing challenges opening TensorBoard, please visit the link below:
https://discuss.cloudxlab.com/t/solved-cannot-start-tensorboard-server/5146
Hi Team,
I am a little confused about the use of activation functions in neurons. Please correct me if I am wrong:
1) Is the sole purpose of activation functions to bring the neuron's raw decimal output into a bounded or discrete range? If yes, what is the benefit of it?
2) Do we require differentiable activation functions just because backpropagation needs usable gradients (unlike in the case of the step function)?
Regards,
Birendra Singh
Hi,
1) Imagine a neural network without any activation function. Then the output is just an aggregation of inputs, weights and biases in a linear fashion: the output of the previous layer is multiplied by the weights of the next layer and added with biases, which is nothing but a linear function, and a composition of linear functions is still linear. It would not be possible to capture the complex non-linear boundaries of classification with simple linear functions, since they only produce straight-line decision boundaries. Now if activation functions are introduced, they impart non-linearity because of the nature of their functions. That output is multiplied by the weights of the next layer and added with the bias, and this new output is again passed through an activation function. So the decision function gains non-linearity with the introduction of the activation functions, which helps the algorithm capture complex decision boundaries.
2) In backpropagation, we calculate the gradients. So it is expected that the activation functions are differentiable.
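To make point 1 concrete, here is a minimal NumPy sketch (not from the lecture; the shapes and values are made up) showing that two layers without an activation collapse into a single linear layer, while a ReLU in between breaks that collapse:

import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=(4, 3))                  # a batch of 4 samples with 3 features
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)

# Two layers WITHOUT activation collapse into one linear layer:
out_linear = (x @ W1 + b1) @ W2 + b2
W, b = W1 @ W2, b1 @ W2 + b2                 # the equivalent single layer
assert np.allclose(out_linear, x @ W + b)    # identical output

# With a ReLU in between, no single linear layer can reproduce the output:
out_nonlinear = np.maximum(0, x @ W1 + b1) @ W2 + b2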
Thanks.
Thanks Vagdevi for the quick reply.
On point 1, I am still not clear. Even if we have a non-linear function, the aggregation of inputs, weights and biases still happens, so why do you call the no-activation case a linear fashion? And when we introduce activation functions, how do they impart non-linearity to the calculation? Can you explain it the way Sandeep explained, but including the activation function? I am sharing the slide picture.
If you have any link where I can read a detailed explanation, that would be great.
Regards,
Birendra Singh
Hi,
Feel free to go through https://stackoverflow.com/questions/9782071/why-must-a-nonlinear-activation-function-be-used-in-a-backpropagation-neural-net
Thanks.
Hi, is it necessary that the number of nodes in the hidden layer be greater than the number of features in the input dataset?
One more question: in the MNIST dataset there are 784 column values for each row, so will there be 784 nodes in the input layer?
Hi,
In general, there is no formula to calculate the number of nodes to use per layer. You can usually take one of 4 approaches: random, grid, heuristic, or exhaustive search. Hope this answers your query.
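For example, a simple grid-style search over candidate architectures could look like this (a sketch only: it reuses the tf.contrib.learn API from this lecture, and X_train, y_train, X_val, y_val and feature_cols are assumed to be defined already):

from sklearn.metrics import accuracy_score

best_acc, best_units = 0.0, None
for units in [[100], [200, 200], [300, 100]]:          # candidate architectures (arbitrary)
    clf = tf.contrib.learn.SKCompat(tf.contrib.learn.DNNClassifier(
        hidden_units=units, n_classes=10, feature_columns=feature_cols))
    clf.fit(X_train, y_train, batch_size=50, steps=2000)
    acc = accuracy_score(y_val, clf.predict(X_val)['classes'])
    if acc > best_acc:
        best_acc, best_units = acc, units
print(best_units, best_acc)                            # keep the best-scoring layout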
Thanks.
At 55:57 we are finding the new value of w21. Do we also find new values for the other weights at the same time?
It would be a lengthy process to change the values of all weights at every instance of X.
So my doubt is: can we have an approach that changes the weights in cyclic order, like changing w1 for the 1st instance, w2 for the 2nd instance, and so on?
Hi,
This has been shown as an example of how an ANN works. In practice this is taken care of by the library that you are using, and all the weights are updated together at every step rather than one per instance.
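A minimal sketch of one such simultaneous update (all numbers made up): every weight moves by its own partial derivative in the same step.

import numpy as np

eta = 0.01
w = np.array([0.5, -1.0, 2.0])        # all weights of the network
grad = np.array([0.2, -0.4, 1.0])     # dE/dw for each weight, from backpropagation
w = w - eta * grad                    # every weight is updated at the same time
print(w)                              # [ 0.498 -0.996  1.99 ]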
Thanks.
At 1:10:10, in the explanation of the backpropagation mathematical method, we are not using any activation function for neuron h1.
1. Not using activation on h1:
with the same values the mentor took in the video, h1 = 4, o1 = 5.2639 (result after backpropagation), and the actual y is 5.
2. Not using activation on h1:
tweaking w11 = -1 with the rest of the inputs the same, now h1 = 0 and o1 = 4.96 (result after backpropagation), and the actual y is 5. I realised that there is no improvement in w21.
3. Using ReLU activation on h1:
tweaking w12 = -3 with the rest of the inputs the same, now h1 = -1, which becomes 0 due to ReLU, and o1 = 4.02 (result after backpropagation), and the actual y is 5. I realised that there is no improvement in w21, and the small improvement in the result (o1) depends only on w22, which is the bias.
So my question is: after backpropagation, ReLU is not giving me good results when we feed it negative values, which make the neuron inactive. Then what is the use of gaining non-linearity with ReLU if the accuracy is worse than linear?
Yes, when we feed positive values, h1 remains active, it works very well, performance is really good, and the non-linearity helps.
Hi,
What you observed is the "dying ReLU" problem: when a neuron's weighted input is negative, ReLU outputs 0 and its gradient is 0, so the weights feeding that neuron stop updating. ReLU is still popular because it is cheap to compute and helps against vanishing gradients, but if many units receive negative inputs you can use a variant such as Leaky ReLU, which keeps a small gradient for negative inputs, or fall back to sigmoid/tanh.
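A quick NumPy comparison (made-up pre-activations):

import numpy as np

z = np.array([-3.0, -1.0, 0.5, 2.0])       # pre-activations, some negative
relu = np.maximum(0, z)                    # [0.  0.  0.5 2. ]  -> negative units "die"
leaky = np.where(z > 0, z, 0.01 * z)       # [-0.03 -0.01 0.5 2.] -> gradient survives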
Thanks.
Hi,
You can also look at the below link:
https://stackoverflow.com/questions/43371117/relu-not-learning-to-handle-negative-inputs-keras-tensorflow
Thanks.
In the backpropagation explanation in the video, exactly which activation function was used?
Actually, I am confused between the y = mx + c relation and the activation function in the backpropagation explanation.
Hi,
If you look at slide 52, it says the network described in the backprop video has one input feature, one output label, the error calculated as mean square, one hidden layer with one neuron, and no activation function. Hope this answers your question.
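In code, that toy network is simply the following (a sketch; the weights are made up, chosen only so that h1 = 4 and o1 = 5.2639 match the values quoted in the comments above):

x, y = 3.0, 5.0                    # one training example (made up)
w11, w12 = 1.0, 1.0                # hidden-layer weight and bias (made up)
w21, w22 = 1.0, 1.2639             # output-layer weight and bias (made up)
h1 = w11 * x + w12                 # hidden neuron, no activation: 4.0
o1 = w21 * h1 + w22                # output neuron: 5.2639
error = (y - o1) ** 2              # squared error against the label y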
Thanks.
Thank you
Can you explain what the "learning curve" (eta) at 42:00 is, and why we are subtracting it from the weight?
Hi,
Here eta is the learning rate rather than a curve: it is a small constant that controls the size of each weight update. We subtract eta times the gradient of the error with respect to the weight, because moving against the gradient is what reduces the error, and scaling the step by eta keeps the updates small and stable.
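In symbols, w_new = w_old - eta * dE/dw; with made-up numbers:

eta = 0.01                 # learning rate
w21 = 1.0                  # current weight (made up)
dE_dw21 = 2.1              # gradient of the error w.r.t. w21 (made up)
w21 = w21 - eta * dE_dw21  # 1.0 - 0.01 * 2.1 = 0.979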
Thanks.
Could you please give me a figure with the neurons sketched, so I can understand this program better?
Hi,
You can try using TensorBoard; given below is a link which you would find useful:
https://discuss.cloudxlab.com/t/solved-cannot-start-tensorboard-server/5146
Thanks.
I did not understand.
At 2:48:13:
What are we doing in the def neuron_layer cell of this program?
And with tf.name_scope("dnn"): — please tell me what is being done here as well.
Please answer both questions.
Hi,
Here we are defining the 3 layers of neurons: 2 hidden layers and 1 output layer. neuron_layer() is a helper that builds one fully connected layer (its weight matrix, bias vector, and output), and tf.name_scope("dnn") simply groups the three layers under one name so the graph is readable in TensorBoard.
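The cell is likely similar to this sketch (the n_hidden1, n_hidden2 and n_outputs variables are assumed to be defined earlier in the notebook):

with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
    hidden2 = neuron_layer(hidden1, n_hidden2, name="hidden2", activation=tf.nn.relu)
    logits = neuron_layer(hidden2, n_outputs, name="outputs")   # no activation: raw logits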
Thanks.
I did not understand. At 2:14:13:
a = y_test != y_pred['classes']   # True where the prediction is wrong
i = 0
for j in a:
    if j == 0:                    # j == 0 (False) means the prediction was correct
        print(i)                  # print the index of a correctly classified sample
    i += 1
Please tell me how this is working.
Why have we taken a for loop?
Please do not tell me why you are doing this, because I know you are comparing the actual values with the predictions.
Hi,
It is y_test != y_pred['classes']. This means we are creating an array 'a' which contains True where the predicted value does not match the actual value and False otherwise. In the loop we iterate through all the elements of this array, and whenever an element is False (j == 0, i.e. a correct prediction) we print its index; i is just a running index that is incremented for every element.
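The same check can be written without the explicit loop (a sketch, assuming y_pred comes from dnn_clf.predict() as elsewhere in this notebook):

import numpy as np

wrong = y_test != y_pred['classes']          # boolean array, True where misclassified
print(np.where(~wrong)[0])                   # indices of the correctly classified samples
print((~wrong).sum(), "correct out of", len(wrong))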
Thanks.
What is "exclusive"?
Hi,
Could you please point out where in the video you found "exclusive", so that I can explain its meaning?
Thanks.
o1 = 5.2639,
but this value is more than 5.
Why did you not decrease w22 and w12?
Why did you only decrease w21?
Where did you get the value eta = 0.01 while calculating w21?
Hi,
Did you go through the backpropagation video that I had shared with you? Here's the link once again:
https://cloudxlab.com/blog/backpropagation-from-scratch/
This will clear all your doubts.
Thanks.
My code execution snapshot:
_________________________________________
Hello,
I have question from the presentation video:
1) What does this code mean? feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
2) On running the code below I do not see INFO: message for various steps ( as shown in the screenshot)
import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42)  # not shown in the config
feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf)  # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)
3) Also, while executing the code for accuracy, I do not get the messages about the "temp" folder getting created, along with the other messages.
I am sending the snapshot of my execution o/p following this message.
Thanks
Hi,
The tf.contrib.learn.infer_real_valued_columns_from_input() function creates FeatureColumn objects for the input it is given. However, this function has been deprecated; please specify the feature columns explicitly.
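For this notebook's 784-pixel MNIST input, an explicit definition could look like the line below (a sketch against the TF 1.x contrib API; the empty column name mirrors what the inference function produced for a plain array input):

feature_cols = [tf.contrib.layers.real_valued_column("", dimension=784)]

Thanks.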
Hi,
Is there an update to the code and document for the above? Please share the updated version for reference.
Thanks
Hi,
There is no updated code for this, as mentioned in my previous comment please specify the feature columns explicitly.
Thanks.
Hi Rajtilak,
I am really stuck and confused here after exploring this for hours. I need your help, please.
As we uploaded the file using the command below from the location ml/deep_learning/data/mnist:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("data/")
they were of the format -->
________________________________________________
import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42)  # not shown in the config
feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf)  # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)
_________________________________________________________________________________
Also, the other parts of my query are still unanswered: why I am unable to see the various step results, and why the temp folder is not created during execution of the code for accuracy.
Thanks
Hi Rajtilak,
I am awaiting your response on the query sent yesterday as I am unable to figure out the replacement code. Please help!
Thanks
Hi,
Please refer to the Project-Fashion MNIST playlist in the Machine Learning part of the course. You will find all the details there on how you can load the dataset from .gz files, and how to identify the feature columns.
Thanks.
Hi Rajtilak,
I have referred to the code from the Fashion ML project. Code snippet below (till the Feature Scaling part) -
import gzip
import numpy as np

filePath_train_set = '/cxldata/datasets/project/fashion-mnist/train-images-idx3-ubyte.gz'
filePath_train_label = '/cxldata/datasets/project/fashion-mnist/train-labels-idx1-ubyte.gz'
filePath_test_set = '/cxldata/datasets/project/fashion-mnist/t10k-images-idx3-ubyte.gz'
filePath_test_label = '/cxldata/datasets/project/fashion-mnist/t10k-labels-idx1-ubyte.gz'

with gzip.open(filePath_train_label, 'rb') as trainLbpath:
    trainLabel = np.frombuffer(trainLbpath.read(), dtype=np.uint8, offset=8)
with gzip.open(filePath_train_set, 'rb') as trainSetpath:
    trainSet = np.frombuffer(trainSetpath.read(), dtype=np.uint8, offset=16).reshape(len(trainLabel), 784)
with gzip.open(filePath_test_label, 'rb') as testLbpath:
    testLabel = np.frombuffer(testLbpath.read(), dtype=np.uint8, offset=8)
with gzip.open(filePath_test_set, 'rb') as testSetpath:
    testSet = np.frombuffer(testSetpath.read(), dtype=np.uint8, offset=16).reshape(len(testLabel), 784)

X_train, X_test, y_train, y_test = trainSet, testSet, trainLabel, testLabel

# Create a random seed=42
np.random.seed(42)
# Create shuffle indices of size 60000 and store them in a variable 'shuffle_index'
shuffle_index = np.random.permutation(60000)
# Shuffle the X_train and y_train datasets using the 'shuffle_index' created above
X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]

# Feature scaling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.astype(np.float64))
__________________________________________________________
So looking at this, my question is: will the code in question now read as below, with
feature_cols = X_train_scaled?
___________________________________
import tensorflow as tf

config = tf.contrib.learn.RunConfig(tf_random_seed=42)  # not shown in the config
feature_cols = X_train_scaled
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[200, 200, 200], n_classes=10,
                                         feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf)  # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=30000)
Hi,
Instead of me answering that for you, why don't you go ahead and try that and let me know if it worked.
Thanks.
Hi,
By the way, good job fitting the existing code in this one.
Thanks.
Hi,
I did that and am getting an error that I am unable to comprehend. Please look into it and advise where I am going wrong.
Thanks
Hi,
Please find below the link to my personal calendar:
http://rajtilak.youcanbook.me/
Please schedule a meeting with me sometime tomorrow; you will receive a Hangout link with the meeting invite. We can discuss this over the call.
Thanks.
Hi Rajtilak,
I have booked the slot for tomorrow morning 11:30 AM. Looking forward to this meeting and positive discussions.
Regards
Hi,
Yes, I got the notification.
Thanks.
Hi Rajtilak,
It was great talking to you today. As per our conversation, I did try to upgrade TensorFlow on Jupyter from 1 to 2, but even after many attempts I could not do it. Please see the error messages. Do let me know if you are able to do it at your end.
Thanks
________________________________________________
Code snippet:
1)
!tf_upgrade_v2 --infile introduction_to_artificial_neural_networks_sv_copy.ipynb --outfile TF2_introduction_to_artificial_neural_networks_sv_copy.ipynb
ERROR MESSAGE:
2)
!tf_upgrade_v2 \
--infile introduction_to_artificial_neural_networks_sv_copy.ipynb \
--outfile TF2_introduction_to_artificial_neural_networks_sv_copy.ipynb \
--reportfile tf2report.txt
ERROR MESSAGE:
__________________________________________________
Hi,
As of now I would suggest you to continue using the old function that you were using.
Thanks.
Hello sir,
Is there any good book for deep learning? Please suggest books...
Hi,
Here is a list of ML/DL books, some of these are available for free:
https://cloudxlab.com/blog/gigantic-list-of-machine-learning-books/
Thanks.
Thank you sir!!
Hi,
I have already replied to your mail, please check.
Thanks.
I have a question: how can we decide the ideal values for parameters like steps, hidden_units, etc.?
Do we have to do hyperparameter optimisation like we do in machine learning?
Hi Anubhav,
I would suggest you follow the lecture for an answer to your query. There are no fixed rules for this; there is no one-size-fits-all model in ML/DL.
Thanks.
Also, what does log loss measure?
Hi,
Logarithmic loss measures the performance of a classification model where the prediction input is a probability value between 0 and 1. The goal of our machine learning models is to minimize this value. A perfect model would have a log loss of 0. Log loss increases as the predicted probability diverges from the actual label. So predicting a probability of .012 when the actual observation label is 1 would be bad and result in a high log loss.
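For a single example with true label y in {0, 1} and predicted probability p, log loss = -(y*log(p) + (1-y)*log(1-p)). The example above, in code:

import math

def log_loss(y, p):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(log_loss(1, 0.012))   # ~4.42: a confident wrong prediction is punished hard
print(log_loss(1, 0.95))    # ~0.05: a confident correct prediction costs little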
Thanks.
1. When running the ANN for the MNIST dataset, we are calculating the loss for the final step. What is meant by "loss for final step"? Please explain.
2. Also, in the classification part it was said that if accuracy is above 95% there is overfitting. Here the accuracy is 98%, so is there no overfitting?
Hi,
1. Loss is nothing but the prediction error of the neural net, and the method used to calculate it is called the loss function. The "loss for final step" printed during training is simply the loss value at the last training step.
2. Overfitting refers to a model that models the training data too well. Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data. So just because a model has 95% accuracy does not mean it's overfitting. Also, Neural Nets generally have higher accuracy percentages.
Thanks.
Okay, so that means apart from deep learning, if we use other methods like SVM, decision trees, etc., then above 95% accuracy there is overfitting?
Also, how will we know that there is overfitting in a neural network?
Hi,
Not really. To check for overfitting, the best way is to measure error on a training and a test set. If you see low error on the training set and high error on the test & validation sets, then you have likely over-fitted the model.
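Concretely, you can compare the two accuracies (a sketch, assuming the SKCompat-wrapped dnn_clf and the data splits from this notebook):

from sklearn.metrics import accuracy_score

train_acc = accuracy_score(y_train, dnn_clf.predict(X_train)['classes'])
test_acc = accuracy_score(y_test, dnn_clf.predict(X_test)['classes'])
# e.g. train 0.999 vs test 0.85 would suggest overfitting;
# train 0.98 vs test 0.97 would not.
print(train_acc, test_acc)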
Thanks.
I'm getting this error while running the code.
Please clarify.
Hi,
Could you please confirm from which notebook you are trying to run this code?
Thanks.
-- Rajtilak Bhattacharjee
Hello,
I'm running it using CloudxLab, in a Jupyter notebook only.
Hi,
I understand you are trying to run a Jupyter notebook provided by us, but could you please point out to me which notebook you are trying to run because the code looks somewhat different.
Thanks.
-- Rajtilak Bhattacharjee
Hello,
I have copied the code from the notebook named "introduction to artificial neural network" to a new notebook and run it. I have executed this code in your notebook as well, and it gives the same error.
Thanks and regards.
Hi,
Please note that this is not an error but a warning, and the output you are getting is the intended output.
Thanks.
-- Rajtilak Bhattacharjee
Hi,
How should we decide the hidden units, steps & batch_size for training a DNNClassifier?
Hi,
There are no hard and fast rules for these; however, here are a few rules of thumb you can follow:
The number of hidden neurons should be between the size of the input layer and the size of the output layer.
The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
The number of hidden neurons should be less than twice the size of the input layer.
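Plugged into this MNIST network (784 inputs, 10 outputs), those rules give (a quick sketch):

n_in, n_out = 784, 10                  # MNIST: 784 pixel features, 10 classes
rule2 = (2 * n_in) // 3 + n_out        # 2/3 of the input size plus the output size = 532
rule3 = 2 * n_in                       # stay below twice the input size = 1568
print(rule2, rule3)                    # rule 1 allows anywhere between 10 and 784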
You can go through the below link regarding the batch size:
https://stats.stackexchange...
Thanks.
-- Rajtilak Bhattacharjee
Why is the first layer passed as an argument when calling the function that creates the second layer?
Hi,
This is done so that the data can flow from the first layer to the second layer: the output tensor of the first layer is the input of the second.
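For example (a sketch in the notebook's style; the layer sizes are made up):

hidden1 = neuron_layer(X, 300, name="hidden1", activation=tf.nn.relu)        # consumes the inputs
hidden2 = neuron_layer(hidden1, 100, name="hidden2", activation=tf.nn.relu)  # consumes hidden1's output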
Thanks.
-- Rajtilak Bhattacharjee
How is the softmax function helpful in the output layer of an MLP when the classes are exclusive?
Hi,
Softmax regression, also called the Maximum Entropy (MaxEnt) classifier, uses an activation function that turns numbers (logits) into probabilities that sum to one. The softmax function outputs a vector representing the probability distribution over a list of potential outcomes. When the classes are exclusive, the output layer is typically modified by replacing the individual activation functions with a shared softmax function.
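Numerically, softmax(z)_i = exp(z_i) / sum_j exp(z_j); a NumPy sketch with made-up logits:

import numpy as np

logits = np.array([2.0, 1.0, 0.1])             # made-up scores for 3 exclusive classes
probs = np.exp(logits) / np.exp(logits).sum()  # [0.659 0.242 0.099], sums to 1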
Thanks.
-- Rajtilak Bhattacharjee
How can we do the MNIST example given for the ANN in TensorFlow 2.1?
Hi, Shivom.
You can install TensorFlow in your environment and it will install the current version automatically.
There are some syntax changes in TensorFlow 2; if you get an error in the TensorFlow code, just search on Google and you will be able to get the correct format. Note that TF 2.x executes eagerly by default, so the placeholders and sessions used in this lecture are available only through the tf.compat.v1 module, but the underlying concepts of variables, constants, and computational graphs carry over.
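For reference, a TF 2.x equivalent of the lecture's DNNClassifier would be written with tf.keras, roughly like this (a sketch, not an official port of the notebook; the layer sizes are copied from the lecture, and X_train/y_train are assumed to be the flattened MNIST arrays used earlier in the thread):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(200, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(200, activation="relu"),
    tf.keras.layers.Dense(200, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),   # 10 exclusive classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # integer labels 0-9
              metrics=["accuracy"])
model.fit(X_train, y_train, batch_size=50, epochs=10)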
All the best!
This is already done; the MNIST available at the https://www.tensorflow.org/... link gives a dataset of a different size from the one used in the example.
I did not get what is required.