With the SELU activation function, even a 100-layer-deep neural network preserves roughly mean 0 and standard deviation 1 across all layers, which helps avoid the exploding/vanishing gradients problem. This self-normalizing behavior relies on standardized inputs and LeCun-normal weight initialization.
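The sketch below (not from the original text) illustrates this numerically: standardized inputs are pushed through 100 dense layers with LeCun-normal weights and SELU activations, and the activations' mean and standard deviation stay close to 0 and 1. The layer width and random seed are arbitrary choices for the demonstration.

```python
import numpy as np

def selu(z, alpha=1.6732632423543772, scale=1.0507009873554805):
    # SELU(z) = scale * (z if z > 0 else alpha * (exp(z) - 1))
    return scale * np.where(z > 0, z, alpha * (np.exp(z) - 1))

rng = np.random.default_rng(42)
n_features = 100
x = rng.standard_normal((1000, n_features))  # standardized inputs: mean ~0, std ~1

for layer in range(100):
    # LeCun normal initialization: std = sqrt(1 / fan_in), as SELU requires
    w = rng.normal(0.0, np.sqrt(1.0 / n_features), size=(n_features, n_features))
    x = selu(x @ w)

# Mean stays near 0 and std near 1 even after 100 layers
print(f"after 100 layers: mean={x.mean():.3f}, std={x.std():.3f}")
```

Repeating the same experiment with, say, ReLU and Glorot initialization typically shows the activation statistics drifting, which is why SELU is paired with these specific initialization and input conditions.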