Training Models

In mini-batch gradient descent, is the path taken less erratic than in SGD?
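One rough way to check this empirically is to run both variants on the same toy problem and compare how much the parameter bounces around once it is near the minimum. The sketch below is only an illustration under assumptions that are not part of the question: a one-parameter linear model, synthetic data with a true slope of 3, a constant learning rate of 0.1, and a mini-batch size of 32. The helper names (make_data, descent_path, jitter) are hypothetical.

```python
import random
import statistics


def make_data(n=500, seed=0):
    """Synthetic data y = 3x + Gaussian noise (an assumed toy problem)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 3.0 * x + rng.gauss(0.0, 0.5)
        data.append((x, y))
    return data


def descent_path(data, batch_size, steps=2000, lr=0.1, seed=1):
    """Run gradient descent on the MSE of y = theta * x.

    batch_size=1 corresponds to SGD; larger values give mini-batch GD.
    Returns the value of theta after every step.
    """
    rng = random.Random(seed)
    theta = 0.0
    path = []
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        # Gradient of the mean squared error over the sampled batch.
        grad = 2.0 * sum((theta * x - y) * x for x, y in batch) / batch_size
        theta -= lr * grad
        path.append(theta)
    return path


def jitter(path):
    """Standard deviation of theta over the second half of the run."""
    tail = path[len(path) // 2:]
    return statistics.pstdev(tail)


data = make_data()
sgd_jitter = jitter(descent_path(data, batch_size=1))
mini_batch_jitter = jitter(descent_path(data, batch_size=32))
print(f"SGD jitter:        {sgd_jitter:.4f}")
print(f"mini-batch jitter: {mini_batch_jitter:.4f}")
```

Averaging the gradient over more instances reduces the noise in each update, so in this setup the mini-batch run should settle into a tighter band around the minimum than the single-instance run. The exact numbers depend on the assumed learning rate, batch size, and noise level.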