Fix missing picture error in amp guide doc (PaddlePaddle#4979)
* refine paddle.amp.decorate input parameter of optimizer

* Revert "refine paddle.amp.decorate input parameter of optimizer"

This reverts commit 2899b2a.

* fix picture path

* refine code

* refine code

* refine code

* refine code
zhangbo9674 authored Jul 6, 2022
1 parent 8b5d5b4 commit 3cee42f
Showing 2 changed files with 6 additions and 6 deletions.
6 changes: 3 additions & 3 deletions docs/guides/performance_improving/amp_cn.md
@@ -13,7 +13,7 @@
A comparison of the float32 and float16 / bfloat16 floating-point formats is shown in Figure 1:

<figure align="center">
-<img src="./images/float.png" width="400" alt='missing'>
+<img src="https://github.com/PaddlePaddle/docs/blob/develop/docs/guides/performance_improving/images/float.png?raw=true" width="600" alt='missing' align="center"/>
<figcaption><center>Figure 1. Schematic of the half-precision and single-precision data formats</center></figcaption>
</figure>

@@ -30,7 +30,7 @@
The PaddlePaddle framework adopts an **auto_cast strategy** to automatically convert and apply computation precision during model training. Typically, model parameters are stored in single-precision floating-point format (float32). During training, the parameters are cast from float32 to half precision (float16 or bfloat16) for the forward computation, producing half-precision intermediate states; parameter gradients are then computed in half precision and finally cast back to float32 before the model parameters are updated. The computation process is shown in Figure 2 below:

<figure align="center">
-<img src="./images/auto_cast.png" width="400" alt='missing'>
+<img src="https://github.com/PaddlePaddle/docs/blob/develop/docs/guides/performance_improving/images/auto_cast.png?raw=true" width="700" alt='missing' align="center"/>
<figcaption><center>Figure 2. Schematic of the mixed-precision computation process</center></figcaption>
</figure>

@@ -39,7 +39,7 @@
When the model parameters are already stored in half-precision floating-point format (float16 / bfloat16) before training, the cast operations in Figure 2 are omitted during training, which can further improve training performance. Note, however, that storing the parameters in a low-precision data type may affect the model's final training accuracy. The computation process is shown in Figure 3 below:

<figure align="center">
-<img src="./images/auto_cast_o2.png" width="400" alt='missing'>
+<img src="https://github.com/PaddlePaddle/docs/blob/develop/docs/guides/performance_improving/images/auto_cast_o2.png?raw=true" width="700" alt='missing' align="center"/>
<figcaption><center>Figure 3. Schematic of the float16 computation process</center></figcaption>
</figure>

6 changes: 3 additions & 3 deletions docs/guides/performance_improving/amp_en.md
@@ -13,7 +13,7 @@
Compare the floating-point formats of float32 and float16 / bfloat16, as shown in Figure 1:

<figure align="center">
-<img src="./images/float.png" width="400" alt='missing'>
+<img src="https://github.com/PaddlePaddle/docs/blob/develop/docs/guides/performance_improving/images/float.png?raw=true" width="600" alt='missing' align="center"/>
<figcaption><center>Figure 1. Floating-point formats of float32 and float16 / bfloat16</center></figcaption>
</figure>
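As a quick numerical illustration of the precision trade-off described above (a NumPy sketch; NumPy has no native bfloat16 type, so only float16 and float32 are compared):

```python
import numpy as np

# float16 offers roughly 3 decimal digits of precision and overflows past
# 65504; float32 offers roughly 7 digits and a far larger dynamic range.
print(np.finfo(np.float16).eps, np.finfo(np.float16).max)  # 0.000977 65504.0
print(np.finfo(np.float32).eps, np.finfo(np.float32).max)  # 1.1920929e-07 3.4028235e+38

# 0.1 has no exact binary representation; float16 rounds it much more coarsely.
print(float(np.float16(0.1)))  # 0.0999755859375
print(float(np.float32(0.1)))  # 0.10000000149011612
```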

@@ -30,7 +30,7 @@ The above data types have the following numerical characteristics:
Paddle adopts an **auto_cast strategy** to automatically convert and apply computation precision during model training. Typically, model parameters are stored in single-precision floating-point format (float32). During training, the parameters are cast from float32 to half precision (float16 or bfloat16) for the forward computation, producing half-precision intermediate states; parameter gradients are then computed in half precision and finally cast back to float32 before the model parameters are updated. The calculation process is shown in Figure 2 below:

<figure align="center">
-<img src="./images/auto_cast.png" width="400" alt='missing'>
+<img src="https://github.com/PaddlePaddle/docs/blob/develop/docs/guides/performance_improving/images/auto_cast.png?raw=true" width="700" alt='missing' align="center"/>
<figcaption><center>Figure 2. auto_cast calculation process</center></figcaption>
</figure>
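As a rough sketch of what this auto_cast (level 'O1') training step can look like with Paddle's dynamic-graph API on a GPU; the toy model, data, and loss-scaling value below are illustrative placeholders, not part of the guide:

```python
import paddle

# Toy model and optimizer; parameters are created and kept in float32.
model = paddle.nn.Linear(10, 10)
optimizer = paddle.optimizer.SGD(learning_rate=0.01,
                                 parameters=model.parameters())

# Loss scaling guards float16 gradients against underflow.
scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

data = paddle.rand([4, 10])
label = paddle.rand([4, 10])

# Level 'O1': inside this context, white-listed ops cast their float32
# inputs to float16 and produce float16 intermediate results.
with paddle.amp.auto_cast(level='O1'):
    output = model(data)
    loss = paddle.nn.functional.mse_loss(output, label)

scaled = scaler.scale(loss)          # scale the loss before backward
scaled.backward()                    # parameter gradients end up in float32
scaler.minimize(optimizer, scaled)   # unscale gradients, update float32 params
optimizer.clear_grad()
```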

@@ -39,7 +39,7 @@
When the model parameters are stored in half-precision floating-point format (float16 / bfloat16) before training, the cast operations in Figure 2 are omitted during training, which can further improve training performance. Note, however, that storing the parameters in a low-precision data type may affect the model's final training accuracy. The calculation process is shown in Figure 3 below:

<figure align="center">
-<img src="./images/auto_cast_o2.png" width="400" alt='missing'>
+<img src="https://github.com/PaddlePaddle/docs/blob/develop/docs/guides/performance_improving/images/auto_cast_o2.png?raw=true" width="700" alt='missing' align="center"/>
<figcaption><center>Figure 3. float16 calculation process</center></figcaption>
</figure>
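A minimal sketch of the corresponding level-'O2' flow, assuming `paddle.amp.decorate` is used to cast the parameters to float16 ahead of training (again with a placeholder model and data):

```python
import paddle

model = paddle.nn.Linear(10, 10)
optimizer = paddle.optimizer.SGD(learning_rate=0.01,
                                 parameters=model.parameters())

# Level 'O2': cast the parameters themselves to float16 up front, so the
# per-step casts of Figure 2 disappear. master_weight keeps a float32 copy
# for the optimizer update, limiting the accuracy impact noted above.
model, optimizer = paddle.amp.decorate(models=model, optimizers=optimizer,
                                       level='O2', master_weight=True)
scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

data = paddle.rand([4, 10])
with paddle.amp.auto_cast(level='O2'):
    loss = model(data).mean()

scaled = scaler.scale(loss)
scaled.backward()
scaler.minimize(optimizer, scaled)
optimizer.clear_grad()
```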

