Contents
Problems with Gradient Descent
Contour Maps
Momentum Based Gradient Descent
Nesterov Accelerated Gradient Descent
Stochastic Gradient Descent
Scheduling Learning Rate