[DOCS] Add training on CPU sections to docs (dmlc#3398)
mszarma authored Oct 14, 2021
1 parent 1886306 commit a47ab71
Showing 4 changed files with 58 additions and 2 deletions.
6 changes: 4 additions & 2 deletions docs/source/conf.py
@@ -200,12 +200,14 @@
                     '../../tutorials/large',
                     '../../tutorials/dist',
                     '../../tutorials/models',
-                    '../../tutorials/multi'] # path to find sources
+                    '../../tutorials/multi',
+                    '../../tutorials/cpu'] # path to find sources
 gallery_dirs = ['tutorials/blitz/',
                 'tutorials/large/',
                 'tutorials/dist/',
                 'tutorials/models/',
-                'tutorials/multi/'] # path to generate docs
+                'tutorials/multi/',
+                'tutorials/cpu'] # path to generate docs
 reference_url = {
     'dgl' : None,
     'numpy': 'http://docs.scipy.org/doc/numpy/',
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -25,6 +25,7 @@ Welcome to Deep Graph Library Tutorials and Documentation
    guide/index
    guide_cn/index
    tutorials/large/index
+   tutorials/cpu/index
    tutorials/multi/index
    tutorials/dist/index
    tutorials/models/index
2 changes: 2 additions & 0 deletions tutorials/cpu/README.txt
@@ -0,0 +1,2 @@
Training on CPUs
=========================
51 changes: 51 additions & 0 deletions tutorials/cpu/cpu_best_practises.py
@@ -0,0 +1,51 @@
"""
CPU Best Practices
=====================================================

This chapter focuses on providing best practices for environment setup
to get the best performance during training and inference on the CPU.

Intel
`````````````````````````````

Hyper-threading
---------------------------

For specific workloads such as those in the GNN domain, the suggested
default setting for best performance is to turn off hyper-threading.
Hyper-threading can be disabled at the BIOS level [#f1]_ or at the
operating system level [#f2]_ [#f3]_, for example as sketched below.
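
As a minimal sketch of the operating-system route (Linux-specific, and
assuming a kernel that exposes the SMT control interface under ``/sys``;
not part of the original text), hyper-threading can be toggled at runtime:

.. code:: bash

    # Show the current SMT (hyper-threading) setting: on / off / notsupported
    cat /sys/devices/system/cpu/smt/control

    # Disable SMT until the next reboot (requires root)
    echo off | sudo tee /sys/devices/system/cpu/smt/control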

OpenMP settings
---------------------------

During training on CPU, the training and the dataloading parts need to
run simultaneously. The best parallelization performance with OpenMP
can be achieved by setting the optimal number of worker threads and
dataloading workers.

**GNU OpenMP**

The default best-known method (BKM) for setting the number of OMP threads
with the PyTorch backend is:

``OMP_NUM_THREADS`` = number of physical cores - ``num_workers``

The number of physical cores can be checked with ``lscpu`` ("Core(s) per
socket") or with the ``nproc`` command on the Linux command line.
Below is a simple bash script example that sets the OMP threads and the
``pytorch`` backend dataloader workers accordingly:

.. code:: bash

    # Reserve some cores for dataloader workers; give the rest to OpenMP
    cores=$(nproc)
    num_workers=4
    export OMP_NUM_THREADS=$(($cores-$num_workers))
    python script.py --gpu -1 --num_workers=$num_workers

Depending on the dataset, the model, and the CPU, the optimal number of
dataloader workers and OpenMP threads may vary, but it should stay close
to the general default advice presented above [#f4]_.
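
As a complementary sketch (not part of the original BKM; ``num_workers``
and the dummy dataset below are illustrative assumptions), the same split
can also be applied from inside Python with ``torch.set_num_threads``:

.. code:: python

    import os

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Apportion cores between OpenMP compute threads and dataloader workers.
    # os.cpu_count() counts logical CPUs, so with hyper-threading disabled
    # (as advised above) it matches the number of physical cores.
    num_workers = 4
    cores = os.cpu_count()
    torch.set_num_threads(max(1, cores - num_workers))

    # Dummy dataset/loader, just to show where num_workers is plugged in.
    dataset = TensorDataset(torch.randn(1024, 16), torch.randint(0, 2, (1024,)))
    loader = DataLoader(dataset, batch_size=64, num_workers=num_workers)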

.. rubric:: Footnotes

.. [#f1] https://www.intel.com/content/www/us/en/support/articles/000007645/boards-and-kits/desktop-boards.html
.. [#f2] https://aws.amazon.com/blogs/compute/disabling-intel-hyper-threading-technology-on-amazon-linux/
.. [#f3] https://aws.amazon.com/blogs/compute/disabling-intel-hyper-threading-technology-on-amazon-ec2-windows-instances/
.. [#f4] https://software.intel.com/content/www/us/en/develop/articles/how-to-get-better-performance-on-pytorchcaffe2-with-intel-acceleration.html
"""
