For linear regression problems, the MSE cost function is convex, so it has no local minima other than the global one. Hence, wherever the random initialization happens to land, Gradient Descent is guaranteed to approach arbitrarily close to the global minimum (provided we wait long enough and the learning rate is not too high).
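
As a rough illustration of this claim, the sketch below runs batch Gradient Descent on a small synthetic linear regression problem from several random starting points and prints where each run ends up. Everything in it is an assumption made for the example rather than something taken from the text: the synthetic data (generated around y = 4 + 3x with Gaussian noise), the learning rate `eta=0.1`, the step count, and the helper names `mse_gradient` and `gradient_descent`.

```python
import random


def mse_gradient(theta, xs, ys):
    """Gradient of the MSE for the model y_hat = theta[0] + theta[1] * x."""
    m = len(xs)
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        err = theta[0] + theta[1] * x - y
        g0 += 2.0 * err / m      # partial derivative w.r.t. the intercept
        g1 += 2.0 * err * x / m  # partial derivative w.r.t. the slope
    return g0, g1


def gradient_descent(theta, xs, ys, eta=0.1, n_steps=5000):
    """Plain batch Gradient Descent; eta and n_steps are illustrative choices."""
    t0, t1 = theta
    for _ in range(n_steps):
        g0, g1 = mse_gradient((t0, t1), xs, ys)
        t0 -= eta * g0
        t1 -= eta * g1
    return t0, t1


# Synthetic data (assumed for this example): y = 4 + 3x plus Gaussian noise.
rng = random.Random(42)
xs = [2.0 * rng.random() for _ in range(100)]
ys = [4.0 + 3.0 * x + rng.gauss(0.0, 1.0) for x in xs]

# Three different random initializations of (intercept, slope).
starts = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(3)]
results = [gradient_descent(start, xs, ys) for start in starts]

for start, result in zip(starts, results):
    print("start:", start, "-> end:", result)
```

With these settings every run should finish at essentially the same parameter values, which is what convexity predicts. Raising `eta` far enough would make the updates diverge instead, which is the "learning rate is not too high" caveat in practice.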

