Commit a574b7b

Update weight-decay.md

1 parent eb8bea8 commit a574b7b

File tree

1 file changed

+1 −1 lines changed


chapter_deep-learning-basics/weight-decay.md

+1 −1
@@ -13,7 +13,7 @@ $$\ell(w_1, w_2, b) = \frac{1}{n} \sum_{i=1}^n \frac{1}{2}\left(x_1^{(i)} w_1 +
as an example, where $w_1, w_2$ are the weight parameters, $b$ is the bias parameter, sample $i$ has inputs $x_1^{(i)}, x_2^{(i)}$ and label $y^{(i)}$, and $n$ is the number of samples. Writing the weight parameters as the vector $\boldsymbol{w} = [w_1, w_2]$, the new loss function with the $L_2$ norm penalty term is

- $$\ell(w_1, w_2, b) + \frac{\lambda}{2n} \|\boldsymbol{w}\|^2,$$
+ $$\ell(w_1, w_2, b) + \frac{\lambda}{2} \|\boldsymbol{w}\|^2,$$
where the hyperparameter $\lambda > 0$. The penalty term is minimized when all weight parameters are 0. When $\lambda$ is large, the penalty term carries more weight in the loss function, which usually drives the elements of the learned weight parameters closer to 0; when $\lambda$ is set to 0, the penalty term has no effect at all. Expanding the squared $L_2$ norm $\|\boldsymbol{w}\|^2$ above gives $w_1^2 + w_2^2$. With the $L_2$ norm penalty term, in minibatch stochastic gradient descent we change the update rule for the weights $w_1$ and $w_2$ from the ["Linear Regression"](linear-regression.md) section to
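The corrected penalty scaling feeds directly into the gradient step: differentiating $\frac{\lambda}{2}\|\boldsymbol{w}\|^2$ adds $\lambda \boldsymbol{w}$ to the weight gradient. The following is a minimal NumPy sketch of one such SGD step for linear regression (an illustration only, not the book's actual code; the function name `sgd_weight_decay` and the parameter names `lr` and `lam` for the learning rate and $\lambda$ are assumptions):

```python
import numpy as np

def sgd_weight_decay(w, b, X, y, lr=0.1, lam=0.01):
    """One minibatch SGD step for squared loss with an L2 penalty.

    The effective objective is loss + (lam / 2) * ||w||^2, so the
    weight gradient picks up an extra lam * w term; the bias b is
    conventionally not penalized.
    """
    n = len(y)
    err = X @ w + b - y                  # per-sample prediction error
    grad_w = X.T @ err / n + lam * w     # data gradient + penalty gradient
    grad_b = err.mean()
    return w - lr * grad_w, b - lr * grad_b
```

Running this update with `lam > 0` shrinks the weights toward 0 on every step, which is why the technique is also called weight decay.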
