This project implements several optimization techniques commonly used in machine learning and compares their convergence behavior on the Rosenbrock function, a well-known benchmark cost function for optimization algorithms. The optimizers implemented are:
- Stochastic Gradient Descent (SGD)
- Adam
- Momentum
- Nesterov Accelerated Gradient (NAG)
Each optimizer is run for 100,000 iterations, and the cost value at each step is recorded so that convergence can be analyzed and compared.
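For reference, below is a minimal NumPy sketch of the 2D Rosenbrock function and its analytic gradient, assuming the standard coefficients a = 1 and b = 100 (the function and argument names here are illustrative; the project's own implementation may differ):

```python
import numpy as np

def rosenbrock(params, a=1.0, b=100.0):
    """2D Rosenbrock cost: (a - x)^2 + b * (y - x^2)^2, with its minimum at (a, a^2)."""
    x, y = params
    return (a - x) ** 2 + b * (y - x ** 2) ** 2

def rosenbrock_grad(params, a=1.0, b=100.0):
    """Analytic gradient of the 2D Rosenbrock function."""
    x, y = params
    dx = -2.0 * (a - x) - 4.0 * b * x * (y - x ** 2)
    dy = 2.0 * b * (y - x ** 2)
    return np.array([dx, dy])
```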
The following optimization algorithms are implemented:
SGD is a simple and efficient optimization method that updates the parameters by stepping in the direction opposite the gradient of the cost function, scaled by a learning rate.
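A minimal sketch of this update rule is shown below (the actual implementation, learning rate, and signatures in the project may differ; `cost_fn` and `grad_fn` stand in for the Rosenbrock function and its gradient sketched above):

```python
import numpy as np

def sgd(cost_fn, grad_fn, params, lr=1e-3, iterations=100_000):
    """Gradient descent: step against the gradient and record the cost each iteration."""
    params = np.asarray(params, dtype=float).copy()
    history = []
    for _ in range(iterations):
        params -= lr * grad_fn(params)
        history.append(cost_fn(params))
    return params, history
```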
Adam (Adaptive Moment Estimation) is a popular method that adapts the per-parameter step size using estimates of the first and second moments of the gradients.
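The following sketch shows the standard Adam update with bias-corrected moment estimates (hyperparameter names and defaults here are the conventional ones, not necessarily the values used in this project):

```python
import numpy as np

def adam(cost_fn, grad_fn, params, lr=1e-3, beta1=0.9, beta2=0.999,
         eps=1e-8, iterations=100_000):
    """Adam: bias-corrected first and second moment estimates scale each step."""
    params = np.asarray(params, dtype=float).copy()
    m = np.zeros_like(params)  # first moment (running mean of gradients)
    v = np.zeros_like(params)  # second moment (running mean of squared gradients)
    history = []
    for t in range(1, iterations + 1):
        g = grad_fn(params)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the first moment
        v_hat = v / (1 - beta2 ** t)  # bias correction for the second moment
        params -= lr * m_hat / (np.sqrt(v_hat) + eps)
        history.append(cost_fn(params))
    return params, history
```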
Momentum accelerates SGD by adding a fraction of the previous update to the current one, which smooths the trajectory and speeds up progress along consistent gradient directions.
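A minimal sketch of this velocity-based update, assuming a momentum coefficient `gamma` (the project's own coefficient and naming may differ):

```python
import numpy as np

def momentum(cost_fn, grad_fn, params, lr=1e-3, gamma=0.9, iterations=100_000):
    """Momentum: accumulate a velocity that blends past updates with the new gradient."""
    params = np.asarray(params, dtype=float).copy()
    velocity = np.zeros_like(params)
    history = []
    for _ in range(iterations):
        velocity = gamma * velocity + lr * grad_fn(params)
        params -= velocity
        history.append(cost_fn(params))
    return params, history
```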
NAG is a variant of Momentum in which the gradient is evaluated at the look-ahead position reached by first applying the momentum step, which often leads to faster and more stable convergence.
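The sketch below illustrates the look-ahead gradient evaluation (again, names and defaults are assumptions for illustration rather than the project's exact code):

```python
import numpy as np

def nag(cost_fn, grad_fn, params, lr=1e-3, gamma=0.9, iterations=100_000):
    """NAG: evaluate the gradient at the look-ahead point params - gamma * velocity."""
    params = np.asarray(params, dtype=float).copy()
    velocity = np.zeros_like(params)
    history = []
    for _ in range(iterations):
        lookahead = params - gamma * velocity  # where the momentum step alone would land
        velocity = gamma * velocity + lr * grad_fn(lookahead)
        params -= velocity
        history.append(cost_fn(params))
    return params, history
```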