Fixing default values for LR and Epsilon (pytorch#895)
It seems that the default values for LR and Epsilon (previously, 1E-2 and 1E-38 respectively) were different from the ones recommended by the authors (2E-3 and 1E-8, respectively). Other packages such as Keras (https://github.com/fchollet/keras/blob/master/keras/optimizers.py#L474) and Lasagne (https://github.com/Lasagne/Lasagne/blob/master/lasagne/updates.py#L612) use the suggested values as well.
keskarnitish authored and soumith committed Mar 22, 2017
1 parent d9678c2 commit b9aef6b
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions torch/optim/adamax.py
@@ -10,17 +10,17 @@ class Adamax(Optimizer):
     Arguments:
         params (iterable): iterable of parameters to optimize or dicts defining
             parameter groups
-        lr (float, optional): learning rate (default: 1e-2)
+        lr (float, optional): learning rate (default: 2e-3)
         betas (Tuple[float, float], optional): coefficients used for computing
             running averages of gradient and its square
         eps (float, optional): term added to the denominator to improve
-            numerical stability (default: 1e-38)
+            numerical stability (default: 1e-8)
         weight_decay (float, optional): weight decay (L2 penalty) (default: 0)

     __ https://arxiv.org/abs/1412.6980
     """

-    def __init__(self, params, lr=1e-2, betas=(0.9, 0.999), eps=1e-38,
+    def __init__(self, params, lr=2e-3, betas=(0.9, 0.999), eps=1e-8,
                  weight_decay=0):
         defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay)
         super(Adamax, self).__init__(params, defaults)
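After this change, constructing the optimizer without explicit arguments picks up the recommended defaults. A minimal usage sketch under that assumption (the toy model below is hypothetical and not part of this commit):

import torch
import torch.nn as nn
from torch.optim import Adamax

# Hypothetical model; any nn.Module with parameters works the same way.
model = nn.Linear(10, 1)

# With the patched defaults, this is equivalent to
# Adamax(model.parameters(), lr=2e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0).
optimizer = Adamax(model.parameters())

# One optimization step on random data, just to show the optimizer in use.
loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()

# The new defaults are visible on the parameter group.
print(optimizer.param_groups[0]['lr'], optimizer.param_groups[0]['eps'])  # 0.002 1e-08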
