add translation and fix deadlink (PaddlePaddle#814)
* hotfix deadlink (PaddlePaddle#811)

* Update native_infer_en.md (PaddlePaddle#787)

* Update install_Windows_en.md (PaddlePaddle#790)

* Update install_Windows_en.md

* Update install_Windows_en.md

* Update cluster_howto_en.rst (PaddlePaddle#791)

* Update cluster_howto_en.rst

* Update cluster_howto_en.rst

* Update doc/fluid/user_guides/howto/training/cluster_howto_en.rst

Co-Authored-By: acosta123 <[email protected]>

* Update doc/fluid/user_guides/howto/training/cluster_howto_en.rst

Co-Authored-By: acosta123 <[email protected]>

* Update cluster_howto_en.rst

* Update index_cn.rst (PaddlePaddle#813)
shanyi15 authored Apr 19, 2019
1 parent e9fa88b commit e14a084
Showing 4 changed files with 36 additions and 10 deletions.
12 changes: 7 additions & 5 deletions doc/fluid/advanced_usage/deploy/inference/native_infer_en.md
@@ -96,7 +96,7 @@ There are two modes in terms of memory management in `PaddleBuf`:
Of the two modes, the first is more convenient, while the second strictly controls memory management to ease integration with `tcmalloc` and other libraries.
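
The difference between the two modes can be sketched in plain Python (a hypothetical `Buf` class standing in for the C++ `PaddleBuf`; the real API differs):

```python
class Buf:
    """Toy stand-in for a PaddleBuf-style buffer with two memory modes."""

    def __init__(self, data=None, length=0):
        if data is not None:
            # Mode 2: wrap memory owned by the caller (e.g. from tcmalloc);
            # the buffer never frees or resizes it.
            self._data = data
            self._owned = False
        else:
            # Mode 1: allocate and own the memory internally.
            self._data = bytearray(length)
            self._owned = True

    def resize(self, nbytes):
        if not self._owned:
            raise RuntimeError("cannot resize externally owned memory")
        self._data = bytearray(nbytes)

own = Buf(length=16)          # mode 1: self-managed, resizable
own.resize(32)

external = bytearray(16)
borrowed = Buf(data=external)  # mode 2: strict, caller-managed
```

Mode 1 trades strict control for convenience; mode 2 is what makes integration with external allocators practical.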
- ### Upgrade performance based on contrib::AnalysisConfig (Prerelease)
+ ### Upgrade performance based on contrib::AnalysisConfig
AnalysisConfig is at the pre-release stage and protected by `namespace contrib`; it may be adjusted in the future.
@@ -106,9 +106,11 @@ The usage of `AnalysisConfig` is similar to that of `NativeConfig` but the fo
```c++
AnalysisConfig config;
- config.model_dir = xxx;
- config.use_gpu = false;  // GPU optimization is not supported at present
- config.specify_input_name = true;  // the input names need to be set
+ config.SetModel(dirname);                // set the directory of the model
+ config.EnableUseGpu(100, 0 /*gpu id*/);  // use the GPU, or
+ config.DisableGpu();                     // use the CPU
+ config.SwitchSpecifyInputNames(true);    // the input names need to be specified
+ config.SwitchIrOptim();  // turn on IR optimization; a sequence of optimization passes runs during inference
```

Note that the input `PaddleTensor` needs to be allocated. The previous examples need to be revised as follows:
@@ -147,7 +149,7 @@ For more specific examples, please refer to [LoD-Tensor Instructions](../../../us

1. If the CPU type permits, it's best to use the versions with support for AVX and MKL.
2. Reuse the input and output `PaddleTensor` to avoid frequent memory allocations that degrade performance.
- 3. Try to replace `NativeConfig` with `AnalysisConfig` to perform optimization for CPU inference
+ 3. Try to replace `NativeConfig` with `AnalysisConfig` to perform optimization for CPU or GPU inference
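
Tip 2 can be sketched as follows (plain Python; `ReusableTensor` and `run_inference` are hypothetical stand-ins, not the Paddle API):

```python
class ReusableTensor:
    """Stand-in for a PaddleTensor-like object with a growable byte buffer."""

    def __init__(self):
        self.data = bytearray()

    def ensure_capacity(self, nbytes):
        # Reallocate only when the current buffer is too small.
        if len(self.data) < nbytes:
            self.data = bytearray(nbytes)

# Keep one input/output pair alive across calls instead of allocating
# fresh buffers for every request.
input_t, output_t = ReusableTensor(), ReusableTensor()

def run_inference(request_nbytes):
    input_t.ensure_capacity(request_nbytes)   # reused when already large enough
    output_t.ensure_capacity(request_nbytes)
    return len(input_t.data)

run_inference(1024)   # first call allocates
run_inference(512)    # smaller request reuses the existing buffer
```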

## Code Demo

11 changes: 8 additions & 3 deletions doc/fluid/beginners_guide/install/install_Windows_en.md
@@ -9,9 +9,8 @@ These instructions will show you how to install PaddlePaddle on Windows. The foll

**Note** :

- * The current version does not support NCCL, distributed training, AVX, warpctc and MKL related functions.
+ * The current version does not support NCCL or distributed training related functions.

- * Currently, only PaddlePaddle for CPU is supported on Windows.



@@ -30,14 +29,20 @@ The version of pip or pip3 should be 9.0.1 or above.

* Install PaddlePaddle

* ***CPU version of PaddlePaddle***:
Execute `pip install paddlepaddle` or `pip3 install paddlepaddle` to download and install PaddlePaddle.


* ***GPU version of PaddlePaddle***:
Execute `pip install paddlepaddle-gpu`(python2.7) or `pip3 install paddlepaddle-gpu`(python3.x) to download and install PaddlePaddle.

## ***Verify installation***

After completing the installation, you can use `python` or `python3` to enter the python interpreter and then use `import paddle.fluid` to verify that the installation was successful.

## ***How to uninstall***

* ***CPU version of PaddlePaddle***:
Use the following command to uninstall PaddlePaddle: `pip uninstall paddlepaddle` or `pip3 uninstall paddlepaddle`

* ***GPU version of PaddlePaddle***:
Use the following command to uninstall PaddlePaddle: `pip uninstall paddlepaddle-gpu` or `pip3 uninstall paddlepaddle-gpu`
21 changes: 20 additions & 1 deletion doc/fluid/user_guides/howto/training/cluster_howto_en.rst
@@ -205,6 +205,25 @@ For example:
Currently, distributed training using NCCL2 supports only synchronous training. NCCL2 mode is more suitable for relatively large models that need synchronous, GPU-based training. If the hardware supports RDMA and GPU Direct, high distributed training performance can be achieved.

Start Up NCCL2 Distributed Training in Multi-Process Mode
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

You can usually get better multi-GPU training performance by launching NCCL2 distributed training jobs in multi-process mode. Paddle provides the :code:`paddle.distributed.launch` module to launch multi-process jobs; each training process then uses an independent GPU device.

Notes on usage:

* Setting the number of nodes: the number of nodes of a job is set through the environment variable :code:`PADDLE_NUM_TRAINERS`, and this variable is also set in every training process.
* Setting the number of devices per node: the :code:`--gpus` parameter sets the number of GPU devices on each node, and the sequence number of each process is set in the environment variable :code:`PADDLE_TRAINER_ID` automatically.
* Data sharding: multi-process mode means one process per device. Each process should generally manage one part of the training data, so that all processes together cover the whole data set.
* Entry file: the entry file is the training script that is actually launched.
* Logs: the logs of each training process are saved in the :code:`./mylog` directory by default; this can be changed with the :code:`--log_dir` parameter.
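
The notes above can be sketched from the training script's side (plain Python; the round-robin sharding scheme is an illustrative assumption, not a layout Paddle requires):

```python
import os

# Set by paddle.distributed.launch per the notes above; the defaults make
# the script also work as a single-process local run.
trainer_id = int(os.environ.get("PADDLE_TRAINER_ID", "0"))
num_trainers = int(os.environ.get("PADDLE_NUM_TRAINERS", "1"))

def shard(dataset, rank, world_size):
    """Round-robin split: each process gets a disjoint part of the data,
    and the shards of all processes together cover the whole set."""
    return dataset[rank::world_size]

my_data = shard(list(range(10)), trainer_id, num_trainers)
```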

Startup example:

.. code-block:: bash

   > PADDLE_NUM_TRAINERS=<TRAINER_COUNT> python -m paddle.distributed.launch train.py --gpus <NUM_GPUS_ON_HOSTS> <ENTRYPOINT_SCRIPT> --arg1 --arg2 ...

Important Notes on NCCL2 Distributed Training
++++++++++++++++++++++++++++++++++++++++++++++

@@ -215,7 +234,7 @@ exit at the final iteration. There are two common ways:
- Each node trains only a fixed number of batches per pass, which is controlled by the Python code. If a node has more data than this fixed amount, the marginal data will not be trained.
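
The second approach (training a fixed number of batches per pass, enforced from Python) can be sketched as:

```python
import itertools

BATCHES_PER_PASS = 100  # the same fixed count on every node

def train_one_pass(batch_reader):
    """Train on at most BATCHES_PER_PASS batches, then stop. On a node with
    more data, the marginal batches are skipped, so all nodes finish the
    pass after the same number of iterations."""
    trained = 0
    for batch in itertools.islice(batch_reader, BATCHES_PER_PASS):
        # train_step(batch) would run here
        trained += 1
    return trained
```

A node whose reader yields 250 batches still trains exactly 100 of them, matching nodes with less data.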

**Note**: If there are multiple network devices in the system, you need to manually specify the devices used by NCCL2.

Assuming you need to use :code:`eth2` as the communication device, you need to set the following environment variables:

2 changes: 1 addition & 1 deletion doc/fluid/user_guides/index_cn.rst
@@ -15,7 +15,7 @@
- `Training Neural Networks <../user_guides/howto/training/index_cn.html>`_: introduces how to use Fluid for single-machine training, multi-machine training, and saving and loading model variables


- - `DyGraph Mode <../user_guides/howto/dygraph/DyGraph.md>`_: introduces using DyGraph under Fluid
+ - `DyGraph Mode <../user_guides/howto/dygraph/DyGraph.html>`_: introduces using DyGraph under Fluid

- `Model Evaluation and Debugging <../user_guides/howto/evaluation_and_debugging/index_cn.html>`_: introduces methods for model evaluation and debugging under Fluid, including:

