[TF] TF backend fix and new logic to choose backend (dmlc#1393)
* TF backend fix and new logic to choose backend

* fix

* fix

* fix

* fix

* fix backend

* fix

* dlpack alignment

* add flag

* flag

* lint

* lint

* remove unused

* several fixes

Co-authored-by: Minjie Wang <[email protected]>
VoVAllen and jermainewang authored Mar 30, 2020
1 parent 4b4186f commit e9440ac
Showing 23 changed files with 217 additions and 107 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -152,7 +152,7 @@ DGL should work on

DGL requires Python 3.5 or later.

Right now, DGL works on [PyTorch](https://pytorch.org) 1.1.0+, [MXNet](https://mxnet.apache.org) nightly build, and [TensorFlow](https://tensorflow.org) 2.0+.
Right now, DGL works on [PyTorch](https://pytorch.org) 1.2.0+, [MXNet](https://mxnet.apache.org) 1.5.1+, and [TensorFlow](https://tensorflow.org) 2.1.0+.


### Using anaconda
18 changes: 9 additions & 9 deletions docker/install/conda_env/mxnet_cpu.yml
@@ -5,12 +5,12 @@ dependencies:
- pip:
- mxnet
- pytest
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
18 changes: 9 additions & 9 deletions docker/install/conda_env/mxnet_gpu.yml
@@ -5,12 +5,12 @@ dependencies:
- pip:
- mxnet-cu101
- pytest
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
21 changes: 11 additions & 10 deletions docker/install/conda_env/tensorflow_cpu.yml
@@ -3,15 +3,16 @@ dependencies:
- python=3.6.9
- pip
- pip:
- tensorflow==2.1.0rc1
- tensorflow==2.2.0rc1
# - tf-nightly==2.2.0.dev20200327
- tfdlpack
- pytest
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
22 changes: 11 additions & 11 deletions docker/install/conda_env/tensorflow_gpu.yml
@@ -1,18 +1,18 @@

name: tensorflow-ci
dependencies:
- python=3.6.9
- pip
- pip:
- tensorflow-gpu==2.1.0rc1
- tensorflow==2.2.0rc1
# - tf-nightly==2.2.0.dev20200327
- tfdlpack-gpu
- pytest
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
18 changes: 9 additions & 9 deletions docker/install/conda_env/torch_cpu.yml
@@ -6,12 +6,12 @@ dependencies:
- torch
- torchvision
- pytest
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
18 changes: 9 additions & 9 deletions docker/install/conda_env/torch_gpu.yml
@@ -6,12 +6,12 @@ dependencies:
- torch
- torchvision
- pytest
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
- nose
- numpy
- cython
- scipy
- networkx
- matplotlib
- nltk
- requests[security]
- tqdm
27 changes: 13 additions & 14 deletions docs/source/install/backend.rst
@@ -3,14 +3,21 @@
Working with different backends
===============================

DGL supports PyTorch, MXNet and Tensorflow backends. To change them, set the ``DGLBACKEND``
environcment variable. The default backend is PyTorch.
DGL supports PyTorch, MXNet and Tensorflow backends.
DGL will choose the backend on the following options (high priority to low priority)
- `DGLBACKEND` environment
    - You can use `DGLBACKEND=[BACKEND] python gcn.py ...` to specify the backend
    - Or `export DGLBACKEND=[BACKEND]` to set the global environment variable
- `config.json` file under "~/.dgl"
    - You can use `python -m dgl.backend.set_default_backend [BACKEND]` to set the default backend

Currently BACKEND can be chosen from mxnet, pytorch, tensorflow.

PyTorch backend
---------------

Export ``DGLBACKEND`` as ``pytorch`` to specify PyTorch backend. The required PyTorch
version is 0.4.1 or later. See `pytorch.org <https://pytorch.org>`_ for installation instructions.
version is 1.1.0 or later. See `pytorch.org <https://pytorch.org>`_ for installation instructions.

MXNet backend
-------------
@@ -32,18 +39,10 @@ Tensorflow backend
------------------

Export ``DGLBACKEND`` as ``tensorflow`` to specify Tensorflow backend. The required Tensorflow
version is 2.0 or later. See `tensorflow.org <https://www.tensorflow.org/install>`_ for installation
instructions. In addition, Tensorflow backend requires ``tfdlpack`` package installed as follows and set ``TF_FORCE_GPU_ALLOW_GROWTH`` to ``true`` to prevent Tensorflow take over the whole GPU memory:

.. code:: bash
pip install tfdlpack # when using tensorflow cpu version
or
version is 2.2.0 or later. See `tensorflow.org <https://www.tensorflow.org/install>`_ for installation
instructions. In addition, DGL will set ``TF_FORCE_GPU_ALLOW_GROWTH`` to ``true`` to prevent Tensorflow take over the whole GPU memory:

.. code:: bash
pip install tfdlpack-gpu # when using tensorflow gpu version
export TF_FORCE_GPU_ALLOW_GROWTH=true # and add this to your .bashrc/.zshrc file if needed
pip install "tensorflow>=2.2.0rc1" # when using tensorflow cpu version
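The selection order documented in backend.rst above (the `DGLBACKEND` environment variable first, then `~/.dgl/config.json`) can be sketched as a pure function. This is a minimal illustration rather than DGL's own code: `resolve_backend` and `VALID_BACKENDS` are hypothetical names, the config is passed in as an already-parsed dict, and lowercasing the environment value is an assumption of this sketch.

```python
VALID_BACKENDS = ("pytorch", "mxnet", "tensorflow")


def resolve_backend(environ, config=None):
    """Return the backend chosen by priority, or None if nothing valid is set.

    `environ` is a mapping such as os.environ; `config` is the parsed
    contents of config.json, or None when the file does not exist.
    """
    if "DGLBACKEND" in environ:
        # Highest priority. Lowercasing here is an assumption of this sketch.
        name = environ["DGLBACKEND"].lower()
    elif config is not None:
        # Lower priority: the "backend" key of the config file.
        name = str(config.get("backend", "")).lower()
    else:
        name = None
    return name if name in VALID_BACKENDS else None


print(resolve_backend({"DGLBACKEND": "tensorflow"}, {"backend": "mxnet"}))
print(resolve_backend({}, {"backend": "mxnet"}))
print(resolve_backend({}, None))
```

As in the commit, a set but invalid `DGLBACKEND` does not fall through to the config file. Where this sketch returns `None`, the commit instead prompts the user for a choice.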
4 changes: 2 additions & 2 deletions include/dgl/runtime/c_runtime_api.h
@@ -474,8 +474,8 @@ DGL_DLL int DGLArrayFromDLPack(DLManagedTensor* from,
* \param out The DLManagedTensor handle.
* \return 0 when success, -1 when failure happens
*/
DGL_DLL int DGLArrayToDLPack(DGLArrayHandle from,
DLManagedTensor** out);
DGL_DLL int DGLArrayToDLPack(DGLArrayHandle from, DLManagedTensor** out,
int alignment = 0);

/*!
* \brief Delete (free) a DLManagedTensor's data.
2 changes: 1 addition & 1 deletion python/dgl/__init__.py
@@ -5,7 +5,7 @@

# Need to ensure that the backend framework is imported before load dgl libs,
# otherwise weird cuda problem happens
from .backend import load_backend
from .backend import load_backend, backend_name

from . import function
from . import contrib
12 changes: 10 additions & 2 deletions python/dgl/_ffi/_ctypes/ndarray.py
@@ -73,15 +73,23 @@ def __del__(self):
    def _dgl_handle(self):
        return ctypes.cast(self.handle, ctypes.c_void_p).value

    def to_dlpack(self):
    def to_dlpack(self, alignment=0):
        """Produce an array from a DLPack Tensor without copying memory
        Args
        -------
        alignment: int, default to be 0
        Indicates the alignment requirement when converting to dlpack. Will copy to a
        new tensor if the alignment requirement is not satisfied.
        0 means no alignment requirement.
        Returns
        -------
        dlpack : DLPack tensor view of the array data
        """
        ptr = ctypes.c_void_p()
        check_call(_LIB.DGLArrayToDLPack(self.handle, ctypes.byref(ptr)))
        check_call(_LIB.DGLArrayToDLPack(self.handle, ctypes.byref(ptr), alignment))
        return ctypes.pythonapi.PyCapsule_New(ptr, _c_str_dltensor, _c_dlpack_deleter)
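
The docstring above says a copy is made when the alignment requirement is not met, and that 0 means no requirement. The native check is not part of the hunks shown here, so the following is only a sketch of what such a test could look like, assuming `alignment` is a byte count applied to the address of the data pointer. `needs_copy` is a hypothetical helper, not a DGL function.

```python
def needs_copy(address, alignment=0):
    """Return True when a buffer starting at `address` fails the requirement.

    Assumption of this sketch: `alignment` is in bytes, and 0 disables
    the check, matching the docstring's "0 means no alignment requirement".
    """
    if alignment < 0:
        raise ValueError("alignment must be non-negative")
    if alignment == 0:
        return False
    return address % alignment != 0


print(needs_copy(128, 64))  # address is a multiple of 64
print(needs_copy(72, 64))   # address is not a multiple of 64
print(needs_copy(72))       # no requirement
```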


3 changes: 2 additions & 1 deletion python/dgl/_ffi/_cython/base.pxi
@@ -112,7 +112,8 @@ cdef extern from "dgl/runtime/c_runtime_api.h":
int DGLArrayFromDLPack(DLManagedTensor* arr_from,
DLTensorHandle* out)
int DGLArrayToDLPack(DLTensorHandle arr_from,
DLManagedTensor** out)
DLManagedTensor** out,
int alignment)
void DGLDLManagedTensorCallDeleter(DLManagedTensor* dltensor)

cdef extern from "dgl/runtime/c_object_api.h":
11 changes: 9 additions & 2 deletions python/dgl/_ffi/_cython/ndarray.pxi
@@ -59,17 +59,24 @@ cdef class NDArrayBase:
        if self.c_is_view == 0:
            CALL(DGLArrayFree(self.chandle))

    def to_dlpack(self):
    def to_dlpack(self, alignment=0):
        """Produce an array from a DLPack Tensor without copying memory
        Args
        -------
        alignment: int, default to be 0
        Indicates the alignment requirement when converting to dlpack. Will copy to a
        new tensor if the alignment requirement is not satisfied.
        0 means no alignment requirement.
        Returns
        -------
        dlpack : DLPack tensor view of the array data
        """
        cdef DLManagedTensor* dltensor
        if self.c_is_view != 0:
            raise ValueError("to_dlpack do not work with memory views")
        CALL(DGLArrayToDLPack(self.chandle, &dltensor))
        CALL(DGLArrayToDLPack(self.chandle, &dltensor, alignment))
        return pycapsule.PyCapsule_New(dltensor, _c_str_dltensor, _c_dlpack_deleter)


31 changes: 29 additions & 2 deletions python/dgl/backend/__init__.py
@@ -1,19 +1,24 @@
from __future__ import absolute_import

import sys, os
import sys
import os
import json
import importlib

from . import backend
from .set_default_backend import set_default_backend

_enabled_apis = set()


def _gen_missing_api(api, mod_name):
    def _missing_api(*args, **kwargs):
        raise ImportError('API "%s" is not supported by backend "%s".'
                          ' You can switch to other backends by setting'
                          ' the DGLBACKEND environment.' % (api, mod_name))
    return _missing_api


def load_backend(mod_name):
    mod = importlib.import_module('.%s' % mod_name, __name__)
    thismod = sys.modules[__name__]
@@ -45,7 +50,29 @@ def load_backend(mod_name):
        else:
            setattr(thismod, api, _gen_missing_api(api, mod_name))

load_backend(os.environ.get('DGLBACKEND', 'pytorch').lower())

def get_preferred_backend():
    config_path = os.path.join(os.path.expanduser('~'), '.dgl', 'config.json')
    backend_name = None
    if "DGLBACKEND" in os.environ:
        backend_name = os.getenv('DGLBACKEND')
    elif os.path.exists(config_path):
        with open(config_path, "r") as config_file:
            config_dict = json.load(config_file)
            backend_name = config_dict.get('backend', '').lower()

    if (backend_name in ['tensorflow', 'mxnet', 'pytorch']):
        return backend_name
    else:
        while not(backend_name in ['tensorflow', 'mxnet', 'pytorch']):
            print("DGL does not detect a valid backend option. Which backend would you like to work with?")
            backend_name = input("Backend choice (pytorch, mxnet or tensorflow): ").lower()
        set_default_backend(backend_name)
        return backend_name


load_backend(get_preferred_backend())


def is_enabled(api):
    """Return true if the api is enabled by the current backend.
2 changes: 1 addition & 1 deletion python/dgl/backend/mxnet/tensor.py
@@ -14,7 +14,7 @@

MX_VERSION = LooseVersion(mx.__version__)
if MX_VERSION.version[0] == 1 and MX_VERSION.version[1] < 5:
    raise Exception("DGL has to work with MXNet version >= 1.5")
    raise RuntimeError("DGL requires mxnet >= 1.5")

# After MXNet 1.5, empty tensors aren't supprted by default.
# After we turn on the numpy compatible flag, MXNet supports empty NDArray.
6 changes: 5 additions & 1 deletion python/dgl/backend/pytorch/tensor.py
@@ -2,15 +2,19 @@

from distutils.version import LooseVersion

import scipy # Weird bug in new pytorch when import scipy after import torch
import torch as th
import builtins
from torch.utils import dlpack

from ... import ndarray as nd
from ... import kernel as K
from ...function.base import TargetCode
from ...base import dgl_warning

TH_VERSION = LooseVersion(th.__version__)
if LooseVersion(th.__version__) < LooseVersion("1.2.0"):
    dgl_warning("Detected an old version of PyTorch. Suggest using torch>=1.2.0 "
                "for the best experience.")

def data_type_dict():
    return {'float16' : th.float16,
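The PyTorch and MXNet hunks above gate on a minimum framework version using `distutils.version.LooseVersion`. Since `distutils` was removed from the standard library in Python 3.12, the same kind of check can be sketched without it. This is an illustration only: `version_tuple` and `is_older` are hypothetical helpers, and the sketch deliberately ignores pre-release and local suffixes, so `2.2.0rc1` is treated as `2.2.0`.

```python
import re


def version_tuple(version):
    """Return the leading dotted-integer part of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric version prefix in %r" % (version,))
    return tuple(int(part) for part in match.group(0).split("."))


def is_older(installed, minimum):
    """Return True when `installed` is strictly older than `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "1.2" and "1.2.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


if is_older("1.1.0", "1.2.0"):
    print("Detected an old version; suggest upgrading.")
```

Comparing integer tuples rather than raw strings matters once a component reaches two digits: as strings, "1.10.0" sorts before "1.2.0".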
21 changes: 21 additions & 0 deletions python/dgl/backend/set_default_backend.py
@@ -0,0 +1,21 @@
import argparse
import os
import json

def set_default_backend(backend_name):
    default_dir = os.path.join(os.path.expanduser('~'), '.dgl')
    if not os.path.exists(default_dir):
        os.makedirs(default_dir)
    config_path = os.path.join(default_dir, 'config.json')
    with open(config_path, "w") as config_file:
        json.dump({'backend': backend_name.lower()}, config_file)
    print('Set the default backend to "{}". You can change it in the '
          '~/.dgl/config.json file or export the DGLBACKEND environment variable.'.format(
              backend_name))

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("backend", nargs=1, type=str, choices=[
        'pytorch', 'tensorflow', 'mxnet'], help="Set default backend")
    args = parser.parse_args()
    set_default_backend(args.backend[0])
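
The new module stores the choice as a one-key JSON file that `get_preferred_backend` later reads back. A round trip of that file format can be sketched as follows, using a temporary directory in place of the real `~/.dgl` so nothing in the home directory is touched. `write_backend_config` and `read_backend_config` are hypothetical names for this sketch; only the `{"backend": ...}` layout and the `config.json` file name come from the commit.

```python
import json
import os
import tempfile


def write_backend_config(config_dir, backend_name):
    """Write {"backend": <lowercased name>} to config.json in config_dir."""
    os.makedirs(config_dir, exist_ok=True)
    config_path = os.path.join(config_dir, "config.json")
    with open(config_path, "w") as config_file:
        json.dump({"backend": backend_name.lower()}, config_file)
    return config_path


def read_backend_config(config_dir):
    """Return the stored backend name, or None when no config file exists."""
    config_path = os.path.join(config_dir, "config.json")
    if not os.path.exists(config_path):
        return None
    with open(config_path, "r") as config_file:
        return json.load(config_file).get("backend")


# Stand-in for ~/.dgl so the sketch has no side effects outside the temp dir.
with tempfile.TemporaryDirectory() as tmp:
    cfg_dir = os.path.join(tmp, ".dgl")
    write_backend_config(cfg_dir, "PyTorch")
    stored = read_backend_config(cfg_dir)

print(stored)
```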