Commit 842481d

Mu Li authored and astonzhang committed
[slides] chapter prelim
1 parent dc7a27a commit 842481d

18 files changed: +388 -456 lines changed

Jenkinsfile (+5 -2)
@@ -15,6 +15,8 @@ stage("Build and Publish") {

 sh label: "Build Environment", script: """set -ex
 conda env update -n ${ENV_NAME} -f static/build.yml
+pip uninstall -y d2lbook
+pip install git+https://github.com/d2l-ai/d2l-book
 pip list
 nvidia-smi
 """
@@ -35,6 +37,7 @@ stage("Build and Publish") {
 conda activate ${ENV_NAME}
 ./static/cache.sh restore _build/eval_pytorch/data
 d2lbook build eval --tab pytorch
+d2lbook build slides --tab pytorch
 ./static/cache.sh store _build/eval_pytorch/data
 """

@@ -60,13 +63,13 @@ stage("Build and Publish") {
 sh label:"Release", script:"""set -ex
 conda activate ${ENV_NAME}
 d2lbook build pkg
-d2lbook deploy html pdf --s3 s3://zh-v2.d2l.ai
+d2lbook deploy html pdf slides --s3 s3://zh-v2.d2l.ai
 """

 } else {
 sh label:"Publish", script:"""set -ex
 conda activate ${ENV_NAME}
-d2lbook deploy html pdf --s3 s3://preview.d2l.ai/${JOB_NAME}/
+d2lbook deploy html pdf slides --s3 s3://preview.d2l.ai/${JOB_NAME}/
 """
 if (env.BRANCH_NAME.startsWith("PR-")) {
 pullRequest.comment("Job ${JOB_NAME}/${BUILD_NUMBER} is complete. \nCheck the results at http://preview.d2l.ai/${JOB_NAME}/")
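The substance of this change is two extra d2lbook invocations: `d2lbook build slides --tab pytorch` after the notebook evaluation, and `slides` added to the artifact list passed to `d2lbook deploy`. A minimal sketch of reproducing those steps locally from Python (the `run` helper and the `<job-name>` placeholder are illustrative assumptions, not part of the Jenkinsfile; only the d2lbook commands come from the diff):

```python
import subprocess

def run(cmd):
    # Mirror the `set -ex` behavior of the Jenkins shell steps:
    # echo each command and stop on the first failure.
    print("+", cmd)
    subprocess.run(cmd, shell=True, check=True)

# Build the evaluated notebooks, then the new slide decks, for the PyTorch tab.
run("d2lbook build eval --tab pytorch")
run("d2lbook build slides --tab pytorch")

# Deployment now ships `slides` alongside `html` and `pdf`
# (preview bucket shown; the release step targets s3://zh-v2.d2l.ai).
run("d2lbook deploy html pdf slides --s3 s3://preview.d2l.ai/<job-name>/")
```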

Jenkinsfile_origin (+6 -13)
@@ -1,8 +1,3 @@
----
-source: https://github.com/d2l-ai/d2l-en/blob/master/Jenkinsfile
-commit: 9bf95b1
----
-
 stage("Build and Publish") {
 // such as d2l-en and d2l-zh
 def REPO_NAME = env.JOB_NAME.split('/')[0]
@@ -17,12 +12,12 @@ stage("Build and Publish") {
 checkout scm
 // conda environment
 def ENV_NAME = "${TASK}-${EXECUTOR_NUMBER}";
-// assign two GPUs to each build
-def EID = EXECUTOR_NUMBER.toInteger()
-def CUDA_VISIBLE_DEVICES=(EID*2).toString() + ',' + (EID*2+1).toString();

 sh label: "Build Environment", script: """set -ex
 conda env update -n ${ENV_NAME} -f static/build.yml
+conda activate ${ENV_NAME}
+pip uninstall -y d2lbook
+pip install git+https://github.com/d2l-ai/d2l-book
 pip list
 nvidia-smi
 """
@@ -34,23 +29,21 @@ stage("Build and Publish") {

 sh label: "Execute Notebooks", script: """set -ex
 conda activate ${ENV_NAME}
-export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}
 ./static/cache.sh restore _build/eval/data
 d2lbook build eval
 ./static/cache.sh store _build/eval/data
 """

 sh label: "Execute Notebooks [PyTorch]", script: """set -ex
 conda activate ${ENV_NAME}
-export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}
 ./static/cache.sh restore _build/eval_pytorch/data
 d2lbook build eval --tab pytorch
+d2lbook build slides --tab pytorch
 ./static/cache.sh store _build/eval_pytorch/data
 """

 sh label: "Execute Notebooks [TensorFlow]", script: """set -ex
 conda activate ${ENV_NAME}
-export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}
 ./static/cache.sh restore _build/eval_tensorflow/data
 export TF_CPP_MIN_LOG_LEVEL=3
 d2lbook build eval --tab tensorflow
@@ -71,7 +64,7 @@ stage("Build and Publish") {
 sh label:"Release", script:"""set -ex
 conda activate ${ENV_NAME}
 d2lbook build pkg
-d2lbook deploy html pdf pkg colab sagemaker --s3 s3://preview.d2l.ai/${JOB_NAME}/
+d2lbook deploy html pdf pkg colab sagemaker slides --s3 s3://en.d2l.ai/
 """

 sh label:"Release d2l", script:"""set -ex
@@ -83,7 +76,7 @@ stage("Build and Publish") {
 } else {
 sh label:"Publish", script:"""set -ex
 conda activate ${ENV_NAME}
-d2lbook deploy html pdf --s3 s3://preview.d2l.ai/${JOB_NAME}/
+d2lbook deploy html pdf slides --s3 s3://preview.d2l.ai/${JOB_NAME}/
 """
 if (env.BRANCH_NAME.startsWith("PR-")) {
 pullRequest.comment("Job ${JOB_NAME}/${BUILD_NUMBER} is complete. \nCheck the results at http://preview.d2l.ai/${JOB_NAME}/")

chapter_preliminaries/autograd.md (+15 -22)
@@ -6,43 +6,36 @@
 Deep learning frameworks expedite this work by automatically calculating derivatives, i.e., *automatic differentiation*. In practice, based on the model we design, the system builds a *computational graph* that tracks which data are combined through which operations to produce the output. Automatic differentiation enables the system to subsequently backpropagate gradients.
 Here, *backpropagate* simply means to trace through the computational graph, filling in the partial derivatives with respect to each parameter.

-```{.python .input}
-from mxnet import autograd, np, npx
-npx.set_np()
-```
-
-```{.python .input}
-#@tab pytorch
-import torch
-```
-
-```{.python .input}
-#@tab tensorflow
-import tensorflow as tf
-```

 ## A Simple Example

-As a toy example, say that we are interested in differentiating the function $y = 2\mathbf{x}^{\top}\mathbf{x}$ with respect to the column vector $\mathbf{x}$. To start, let us create the variable `x` and assign it an initial value.
+As a toy example, (**say that we are interested in differentiating the function $y = 2\mathbf{x}^{\top}\mathbf{x}$ with respect to the column vector $\mathbf{x}$**). To start, let us create the variable `x` and assign it an initial value.

 ```{.python .input}
+from mxnet import autograd, np, npx
+npx.set_np()
+
 x = np.arange(4.0)
 x
 ```

 ```{.python .input}
 #@tab pytorch
+import torch
+
 x = torch.arange(4.0)
 x
 ```

 ```{.python .input}
 #@tab tensorflow
+import tensorflow as tf
+
 x = tf.range(4, dtype=tf.float32)
 x
 ```

-Before we even calculate the gradient of $y$ with respect to $\mathbf{x}$, we will need a place to store it.
+[**Before we even calculate the gradient of $y$ with respect to $\mathbf{x}$, we will need a place to store it.**]
 It is important that we do not allocate new memory every time we take a derivative with respect to a parameter, because we will often update the same parameters thousands of times and allocating new memory each time could quickly exhaust it. Note that the gradient of a scalar-valued function with respect to a vector $\mathbf{x}$ is itself a vector and has the same shape as $\mathbf{x}$.

 ```{.python .input}
@@ -64,7 +57,7 @@ x.grad  # The default value is None
 x = tf.Variable(x)
 ```

-Now let us calculate $y$.
+(**Now let us calculate $y$.**)

 ```{.python .input}
 # Place the code inside an `autograd.record` scope to build the computational graph
@@ -87,7 +80,7 @@ with tf.GradientTape() as t:
 y
 ```

-`x` is a vector of length 4; computing the inner product of `x` and `x` yields the scalar output that we assign to `y`. Next, we can automatically calculate the gradient of `y` with respect to each component of `x` by calling the backpropagation function, and print these gradients.
+`x` is a vector of length 4; computing the inner product of `x` and `x` yields the scalar output that we assign to `y`. Next, we can [**automatically calculate the gradient of `y` with respect to each component of `x` by calling the backpropagation function**], and print these gradients.

 ```{.python .input}
 y.backward()
@@ -122,7 +115,7 @@ x.grad == 4 * x
 x_grad == 4 * x
 ```

-Now let us calculate another function of `x`.
+[**Now let us calculate another function of `x`.**]

 ```{.python .input}
 with autograd.record():
@@ -151,7 +144,7 @@ t.gradient(y, x)  # Overwritten by the newly calculated gradient

 When `y` is not a scalar, the most natural interpretation of the derivative of a vector `y` with respect to a vector `x` is a matrix. For higher-order and higher-dimensional `y` and `x`, the differentiation result could be a high-order tensor.

-However, while these more exotic objects do show up in advanced machine learning (including in deep learning), more often when we are calling backward on a vector, we are trying to calculate the derivatives of the loss functions for each constituent of a batch of training examples. Here, our intent is not to calculate the differentiation matrix but rather the sum of the partial derivatives computed individually for each example in the batch.
+However, while these more exotic objects do show up in advanced machine learning (including [**in deep learning**]), more often when we are calling backward on a vector, we are trying to calculate the derivatives of the loss functions for each constituent of a batch of training examples. Here, (**our intent is not to calculate the differentiation matrix but rather the sum of the partial derivatives computed individually for each example in the batch.**)

 ```{.python .input}
 # When we invoke `backward` on the vector-valued variable `y` (a function of `x`),
@@ -167,7 +160,7 @@ x.grad  # Equivalent to y = sum(x * x)
 # Invoking `backward` on a non-scalar requires passing in a `gradient` argument, which specifies the gradient of the differentiated function w.r.t. `self`. In our case we only want the sum of the partial derivatives, so passing in a gradient of ones is appropriate
 x.grad.zero_()
 y = x * x
-# Equivalent to y.backward(torch.ones(len(x)))
+# Equivalent to y.backward(torch.ones(len(x)))
 y.sum().backward()
 x.grad
 ```
@@ -181,7 +174,7 @@ t.gradient(y, x)  # Equivalent to `y = tf.reduce_sum(x * x)`

 ## Detaching Computation

-Sometimes, we wish to move some calculations outside of the recorded computational graph.
+Sometimes, we wish to [**move some calculations outside of the recorded computational graph**].
 For example, say that `y` was calculated as a function of `x`, and that `z` was then calculated as a function of both `y` and `x`.
 Now, imagine that we wanted to calculate the gradient of `z` with respect to `x`, but for some reason wanted to treat `y` as a constant and only take into account the role that `x` played after `y` was calculated.

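For orientation while reviewing the slide markers above, here is roughly how the PyTorch tab of this section fits together end to end. It is a condensed sketch of the chapter's workflow rather than new material; the printed values assume `x = [0., 1., 2., 3.]` as in the text:

```python
import torch

# A place to store the gradient: mark `x` as requiring one.
x = torch.arange(4.0, requires_grad=True)

# y = 2 * x^T x is a scalar, so backward() can be called directly.
y = 2 * torch.dot(x, x)
y.backward()
print(x.grad)           # tensor([ 0.,  4.,  8., 12.])
print(x.grad == 4 * x)  # the gradient of 2 * x^T x is 4x

# PyTorch accumulates gradients by default, so clear them before reuse.
x.grad.zero_()

# For a non-scalar `y`, backpropagate the sum of its elements, i.e. the sum
# of the partial derivatives computed individually for each element.
y = x * x
y.sum().backward()      # equivalent to y.backward(torch.ones(len(x)))
print(x.grad)           # tensor([0., 2., 4., 6.])
```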

chapter_preliminaries/autograd_origin.md (+29 -41)
@@ -1,8 +1,3 @@
----
-source: https://github.com/d2l-ai/d2l-en/blob/master/chapter_preliminaries/autograd.md
-commit: 9e55a9c
----
-
 # Automatic Differentiation
 :label:`sec_autograd`

@@ -24,49 +19,42 @@ Automatic differentiation enables the system to subsequently backpropagate gradi
 Here, *backpropagate* simply means to trace through the computational graph,
 filling in the partial derivatives with respect to each parameter.

-```{.python .input}
-from mxnet import autograd, np, npx
-npx.set_np()
-```
-
-```{.python .input}
-#@tab pytorch
-import torch
-```
-
-```{.python .input}
-#@tab tensorflow
-import tensorflow as tf
-```

 ## A Simple Example

 As a toy example, say that we are interested
-in differentiating the function
+in (**differentiating the function
 $y = 2\mathbf{x}^{\top}\mathbf{x}$
-with respect to the column vector $\mathbf{x}$.
+with respect to the column vector $\mathbf{x}$.**)
 To start, let us create the variable `x` and assign it an initial value.

 ```{.python .input}
+from mxnet import autograd, np, npx
+npx.set_np()
+
 x = np.arange(4.0)
 x
 ```

 ```{.python .input}
 #@tab pytorch
+import torch
+
 x = torch.arange(4.0)
 x
 ```

 ```{.python .input}
 #@tab tensorflow
+import tensorflow as tf
+
 x = tf.range(4, dtype=tf.float32)
 x
 ```

-Before we even calculate the gradient
+[**Before we even calculate the gradient
 of $y$ with respect to $\mathbf{x}$,
-we will need a place to store it.
+we will need a place to store it.**]
 It is important that we do not allocate new memory
 every time we take a derivative with respect to a parameter
 because we will often update the same parameters
@@ -95,7 +83,7 @@ x.grad  # The default value is None
 x = tf.Variable(x)
 ```

-Now let us calculate $y$.
+(**Now let us calculate $y$.**)

 ```{.python .input}
 # Place our code inside an `autograd.record` scope to build the computational
@@ -122,8 +110,8 @@ y
 Since `x` is a vector of length 4,
 an inner product of `x` and `x` is performed,
 yielding the scalar output that we assign to `y`.
-Next, we can automatically calculate the gradient of `y`
-with respect to each component of `x`
+Next, [**we can automatically calculate the gradient of `y`
+with respect to each component of `x`**]
 by calling the function for backpropagation and printing the gradient.

 ```{.python .input}
@@ -143,8 +131,8 @@ x_grad = t.gradient(y, x)
 x_grad
 ```

-The gradient of the function $y = 2\mathbf{x}^{\top}\mathbf{x}$
-with respect to $\mathbf{x}$ should be $4\mathbf{x}$.
+(**The gradient of the function $y = 2\mathbf{x}^{\top}\mathbf{x}$
+with respect to $\mathbf{x}$ should be $4\mathbf{x}$.**)
 Let us quickly verify that our desired gradient was calculated correctly.

 ```{.python .input}
@@ -161,7 +149,7 @@ x.grad == 4 * x
 x_grad == 4 * x
 ```

-Now let us calculate another function of `x`.
+[**Now let us calculate another function of `x`.**]

 ```{.python .input}
 with autograd.record():
@@ -172,9 +160,9 @@ x.grad  # Overwritten by the newly calculated gradient

 ```{.python .input}
 #@tab pytorch
-# PyTorch accumulates the gradient in default, we need to clear the previous
+# PyTorch accumulates the gradient in default, we need to clear the previous
 # values
-x.grad.zero_()
+x.grad.zero_()
 y = x.sum()
 y.backward()
 x.grad
@@ -196,13 +184,13 @@ For higher-order and higher-dimensional `y` and `x`,
 the differentiation result could be a high-order tensor.

 However, while these more exotic objects do show up
-in advanced machine learning (including in deep learning),
-more often when we are calling backward on a vector,
+in advanced machine learning (including [**in deep learning**]),
+more often (**when we are calling backward on a vector,**)
 we are trying to calculate the derivatives of the loss functions
 for each constituent of a *batch* of training examples.
-Here, our intent is not to calculate the differentiation matrix
-but rather the sum of the partial derivatives
-computed individually for each example in the batch.
+Here, (**our intent is**) not to calculate the differentiation matrix
+but rather (**the sum of the partial derivatives
+computed individually for each example**) in the batch.

 ```{.python .input}
 # When we invoke `backward` on a vector-valued variable `y` (function of `x`),
@@ -236,8 +224,8 @@ t.gradient(y, x)  # Same as `y = tf.reduce_sum(x * x)`

 ## Detaching Computation

-Sometimes, we wish to move some calculations
-outside of the recorded computational graph.
+Sometimes, we wish to [**move some calculations
+outside of the recorded computational graph.**]
 For example, say that `y` was calculated as a function of `x`,
 and that subsequently `z` was calculated as a function of both `y` and `x`.
 Now, imagine that we wanted to calculate
@@ -309,10 +297,10 @@ t.gradient(y, x) == 2 * x
 ## Computing the Gradient of Python Control Flow

 One benefit of using automatic differentiation
-is that even if building the computational graph of a function
-required passing through a maze of Python control flow
+is that [**even if**] building the computational graph of (**a function
+required passing through a maze of Python control flow**)
 (e.g., conditionals, loops, and arbitrary function calls),
-we can still calculate the gradient of the resulting variable.
+(**we can still calculate the gradient of the resulting variable.**)
 In the following snippet, note that
 the number of iterations of the `while` loop
 and the evaluation of the `if` statement
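The last two hunks above concern detaching computation and differentiating through Python control flow. A brief PyTorch sketch of both ideas, kept consistent with the section's conventions (the function `f` below is an illustrative stand-in, not the book's exact snippet):

```python
import torch

# Detaching: treat `y` as a constant `u` when differentiating z = u * x.
x = torch.arange(4.0, requires_grad=True)
y = x * x
u = y.detach()          # same values as y, but cut out of the graph
z = u * x
z.sum().backward()
print(x.grad == u)      # dz/dx is u, since u was treated as a constant

# Control flow: the graph is built as the Python code executes, so loops
# and conditionals that depend on the input pose no problem.
def f(a):
    b = a * 2
    while b.norm() < 1000:
        b = b * 2
    return b if b.sum() > 0 else 100 * b

a = torch.randn(size=(), requires_grad=True)
d = f(a)
d.backward()
print(a.grad == d / a)  # f is piecewise linear in a, so its gradient is d / a
```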
