[Docs] Update BladeDISC usage documents. (DeepRec-AI#696)
shanshanpt authored Feb 17, 2023
1 parent af1cd38 commit a567e94
Showing 2 changed files with 234 additions and 68 deletions.
161 changes: 122 additions & 39 deletions docs/docs_en/BladeDISC.md
@@ -6,13 +6,15 @@ BladeDISC is an end-to-end machine learning compiler open-sourced by Alibaba, wh

> BladeDISC project address: https://github.com/alibaba/BladeDISC.
At present, users need to manually generate the BladeDISC whl package and call BladeDISC explicitly in the code. In the future, we will integrate BladeDISC into the DeepRec code to make it more convenient for users to use.
At present, BladeDISC cannot be built directly from source as part of DeepRec; we plan to support this in the future. For now, we need to build the BladeDISC whl package and import blade_disc_tf in user code. For C++ serving, the serving framework needs to link against the BladeDISC shared library (.so). The steps are as follows.

## How To Enable BladeDISC

### 1. Compile DeepRec

For compilation instructions, please refer to [https://github.com/alibaba/DeepRec#how-to-build](https://github.com/alibaba/DeepRec#how-to-build). The generated whl package will be used when compiling BladeDISC.
For compilation instructions, please refer to [DeepRec-Compile-And-Install](https://deeprec.readthedocs.io/zh/latest/DeepRec-Compile-And-Install.html#). The generated whl package will be used when compiling BladeDISC.

Note: DeepRec and BladeDISC currently require different versions of bazel (this is one of the reasons a direct source build is not yet possible; we will upgrade both to the same version later), so the steps below compile BladeDISC inside a virtualenv environment.

### 2. Compile BladeDISC

@@ -22,55 +24,136 @@ In the docker container that compiles DeepRec.

- Download BladeDISC source code.

```bash
git clone https://github.com/alibaba/BladeDISC.git
# run the submodule update inside the cloned repository
git -C BladeDISC submodule update --init --recursive
```
```bash
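git clone https://github.com/alibaba/BladeDISC.git
# -C runs the command inside the cloned repository without changing directory
git -C BladeDISC checkout features/deeprec2208-cu114
git -C BladeDISC submodule update --init --recursive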
```

- Configure the compilation environment.

```bash
# Update bazel
wget https://github.com/bazelbuild/bazel/releases/download/5.0.0/bazel-5.0.0-installer-linux-x86_64.sh
sh bazel-5.0.0-installer-linux-x86_64.sh
rm -rf /home/pai/bin/bazel
ln -s /usr/local/lib/bazel/bin/bazel /home/pai/bin/bazel
# Install cmake
wget https://github.com/Kitware/CMake/releases/download/v3.20.0/cmake-3.20.0-linux-x86_64.sh
mv cmake-3.20.0-linux-x86_64.sh /tmp/cmake-install.sh
chmod u+x /tmp/cmake-install.sh
mkdir -p /opt/cmake
/tmp/cmake-install.sh --skip-license --prefix=/opt/cmake
export PATH=/opt/cmake/bin:$PATH
```
```bash
# prepare venv
pip3 install virtualenv

python3 -m virtualenv /opt/venv_disc/

source /opt/venv_disc/bin/activate

pip3 install tensorflow-1.15.5+deeprec2208-cp36-cp36m-linux_x86_64.whl

# install bazel
cd BladeDISC
apt-get update
bash ./docker/scripts/install-bazel.sh
```

- Compile BladeDISC.

```bash
cd scripts/python
./tao_build.py /home/pai -s configure --bridge-gcc=7.5 --compiler-gcc=7.5
./tao_build.py /home/pai -s build_tao_compiler
./tao_build.py /home/pai -s build_tao_bridge
cd ../..
cp tf_community/bazel-bin/tensorflow/compiler/decoupling/tao_compiler_main tao/python/blade_disc_tf
cp tao/bazel-bin/libtao_ops.so tao/python/blade_disc_tf
cd tao
python3 setup.py bdist_wheel

```
```bash
# configure
./scripts/python/tao_build.py /opt/venv_disc/ --compiler-gcc default --bridge-gcc default -s configure

# generate libtao_ops.so; path: tao/bazel-bin/libtao_ops.so
./scripts/python/tao_build.py /opt/venv_disc/ -s build_tao_bridge

# generate tao_compiler_main
# path: tf_community/bazel-bin/tensorflow/compiler/decoupling/tao_compiler_main
./scripts/python/tao_build.py /opt/venv_disc/ -s build_tao_compiler

# generate disc whl package
cp tf_community/bazel-bin/tensorflow/compiler/decoupling/tao_compiler_main tao/python/blade_disc_tf
cp tao/bazel-bin/libtao_ops.so tao/python/blade_disc_tf
cd tao
python3 setup.py bdist_wheel
```

- Install BladeDISC

```bash
pip install dist/blade_disc_gpu_tf1155-0.1.0-py3-none-any.whl
```
- Install BladeDISC
```bash
pip install dist/blade_disc_tf1155-0.2.0-py3-none-any.whl
```
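The wheel file name depends on the BladeDISC version (the names shown here differ between 0.1.0 and 0.2.0), so a hard-coded path can go stale. The following is a minimal sketch, assuming the wheel is written to `dist/` under `tao` as described above; `find_disc_wheel` is a hypothetical helper, not part of BladeDISC.

```python
import glob
import os


def find_disc_wheel(dist_dir="dist"):
    """Return the most recently modified blade_disc*.whl in dist_dir, or None."""
    wheels = glob.glob(os.path.join(dist_dir, "blade_disc*.whl"))
    if not wheels:
        return None
    # If several builds are present, prefer the newest file.
    return max(wheels, key=os.path.getmtime)


if __name__ == "__main__":
    wheel = find_disc_wheel()
    if wheel is None:
        print("no BladeDISC wheel found; run `python3 setup.py bdist_wheel` first")
    else:
        print("pip install " + wheel)
```

The script only prints the install command rather than running it, so the result can be checked before anything is installed into the environment.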

### 3. How to use BladeDISC
### 3. Use BladeDISC in Python

In user code: 

```python
import blade_disc_tf as disc
disc.enable()
```
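If the same script must also run in environments where the BladeDISC wheel is not installed, the import can be guarded. This is a minimal sketch under that assumption; `enable_disc_if_available` is a hypothetical wrapper, and only `blade_disc_tf` and `enable()` come from the usage shown above.

```python
import importlib


def enable_disc_if_available():
    """Call blade_disc_tf.enable() when the package is installed.

    Returns True if BladeDISC was enabled, False if the package is missing.
    """
    try:
        disc = importlib.import_module("blade_disc_tf")
    except ImportError:
        return False
    disc.enable()
    return True


if __name__ == "__main__":
    print("BladeDISC enabled:", enable_disc_if_available())
```

Returning a flag instead of raising lets the caller log whether the optimization is active and fall back to plain TensorFlow execution otherwise.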

### 4. Use BladeDISC in C++ serving
The C++ serving code must link against libtao_ops.so at compile time, and the following two environment variables must be set at run time to enable DISC optimization:
```bash
export BRIDGE_ENABLE_TAO=true
export TAO_COMPILER_PATH=/path-to/tao_compiler_main
```
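A missing or mistyped variable may go unnoticed until the server is already running, so it can help to check the settings before launching the serving binary. The sketch below is an assumption-laden pre-flight check: `check_disc_serving_env` is a hypothetical helper, and it accepts only the literal value `true` because that is the value documented above.

```python
import os


def check_disc_serving_env(environ=None):
    """Return a list of problems with the DISC settings; empty if none were found."""
    env = os.environ if environ is None else environ
    problems = []
    # Strict comparison against the documented value (an assumption of this sketch).
    if env.get("BRIDGE_ENABLE_TAO") != "true":
        problems.append("BRIDGE_ENABLE_TAO is not set to 'true'")
    compiler = env.get("TAO_COMPILER_PATH")
    if not compiler:
        problems.append("TAO_COMPILER_PATH is not set")
    elif not (os.path.isfile(compiler) and os.access(compiler, os.X_OK)):
        problems.append("TAO_COMPILER_PATH is not an executable file: " + compiler)
    return problems


if __name__ == "__main__":
    for problem in check_disc_serving_env():
        print("warning:", problem)
```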

Taking tensorflow_serving as an example, we can either pass the directory containing libtao_ops.so to the linker with -L at compile time (for example, -L/xxx/xxx/mylib/), or copy libtao_ops.so to a system library path (for example, /usr/local/lib/), so that libtao_ops.so can be linked into tensorflow_serving.

Assuming that the compiled library is located at /xxx/libtao_ops.so, we need to process it as follows:
```
apt-get update && apt-get install patchelf
patchelf --remove-needed libtensorflow_framework.so.1 /xxx/libtao_ops.so
# for runtime
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/xxx/
export LD_LIBRARY_PATH
```
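To confirm that the `patchelf --remove-needed` step took effect, the remaining dependencies of the library can be listed with `patchelf --print-needed`. The following is a minimal sketch, assuming patchelf is on the PATH; `parse_needed` and `needed_libraries` are hypothetical helper names.

```python
import subprocess


def parse_needed(print_needed_output):
    """Split the output of `patchelf --print-needed` into a list of library names."""
    return [line.strip() for line in print_needed_output.splitlines() if line.strip()]


def needed_libraries(so_path):
    """Run `patchelf --print-needed` on so_path and return its NEEDED entries."""
    result = subprocess.run(
        ["patchelf", "--print-needed", so_path],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_needed(result.stdout)


if __name__ == "__main__":
    # /xxx/libtao_ops.so is the placeholder path used in this document.
    libs = needed_libraries("/xxx/libtao_ops.so")
    print("libtensorflow_framework.so.1 still needed:",
          "libtensorflow_framework.so.1" in libs)
```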

Prepare the build environment:
```
apt-get update
apt-get install autotools-dev
apt-get install automake
apt-get install libtool
export TF_CUDA_COMPUTE_CAPABILITIES="7.0,7.5,8.0"
```

Modify tensorflow_serving/model_servers/BUILD in tensorflow_serving as follows:
```
cc_binary(
name = "tensorflow_model_server_main_lib",
...
deps = [
...
"@org_tensorflow//tensorflow/core/platform/hadoop:hadoop_file_system",
"@org_tensorflow//tensorflow/core/platform/s3:s3_file_system",
+ "@org_tensorflow//tensorflow/stream_executor",
+ "@org_tensorflow//tensorflow/stream_executor:stream_executor_impl",
+ "@org_tensorflow//tensorflow/stream_executor:stream_executor_internal",
+ "@org_tensorflow//tensorflow/stream_executor:stream_executor_pimpl",
+ "@org_tensorflow//tensorflow/stream_executor:kernel_spec",
+ "@org_tensorflow//tensorflow/stream_executor:kernel",
+ "@org_tensorflow//tensorflow/stream_executor:scratch_allocator",
+ "@org_tensorflow//tensorflow/stream_executor:timer",
+ "@org_tensorflow//tensorflow/stream_executor/host:host_platform",
+ ],
+ linkopts = [
+ "-ltao_ops -L/xxx/",
+ "-Wl,-no-as-needed",
],
...
```

Because the BUILD file above depends on stream_executor, whose current default visibility is ":friends" rather than public, we also need to modify the DeepRec file ./tensorflow/stream_executor/BUILD referenced by tensorflow_serving as follows:
```
package(
- default_visibility = [":friends"],
+ default_visibility = ["//visibility:public"],
licenses = ["notice"], # Apache 2.0
)
```

When BladeDISC is compiled against DeepRec, it uses _GLIBCXX_USE_CXX11_ABI=1 by default, while tensorflow_serving uses _GLIBCXX_USE_CXX11_ABI=0 by default, so the two settings must be made consistent. This document modifies the .bazelrc file of tensorflow_serving as an example:
```
- build --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0
+ build --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=1
```
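An ABI mismatch usually shows up only as undefined-symbol errors at link time, so it can be worth checking the setting in `.bazelrc` before starting a long build. This is a small sketch under the assumption that the flag appears as `-D_GLIBCXX_USE_CXX11_ABI=0` or `=1`, as in the diff above; `abi_values` and `bazelrc_abi` are hypothetical helpers.

```python
import re

_ABI_FLAG = re.compile(r"-D_GLIBCXX_USE_CXX11_ABI=([01])")


def abi_values(text):
    """Return the set of _GLIBCXX_USE_CXX11_ABI values (as ints) found in text."""
    return {int(value) for value in _ABI_FLAG.findall(text)}


def bazelrc_abi(bazelrc_text):
    """Return the ABI values set by non-comment lines of a .bazelrc file."""
    active = [
        line for line in bazelrc_text.splitlines()
        if not line.lstrip().startswith("#")
    ]
    return abi_values("\n".join(active))


if __name__ == "__main__":
    with open(".bazelrc") as f:
        values = bazelrc_abi(f.read())
    print("ABI values in .bazelrc:", sorted(values))
```

For the installed DeepRec wheel, `tf.sysconfig.get_compile_flags()` returns a list of compiler flags that should include the same define, so `abi_values(" ".join(tf.sysconfig.get_compile_flags()))` could be compared against the value found in `.bazelrc`.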

tensorflow_serving build command:
```
bazel build -c opt --config=cuda tensorflow_serving/...
```

For details on compiling tensorflow_serving, see [tfs compilation](https://deeprec.readthedocs.io/zh/latest/TFServing-Compile-And-Install.html).

141 changes: 112 additions & 29 deletions docs/docs_zh/BladeDISC.md
@@ -1,68 +1,151 @@
# BladeDISC
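BladeDISC is an end-to-end machine learning compiler open-sourced by Alibaba, and it can be used directly in DeepRec. Project address: https://github.com/alibaba/BladeDISC .
BladeDISC is an end-to-end machine learning compiler open-sourced by Alibaba. This document mainly describes how to use BladeDISC in DeepRec. BladeDISC project address: https://github.com/alibaba/BladeDISC .

Currently, DeepRec uses BladeDISC by building the BladeDISC whl package and importing blade_disc in user code. In the future, we will compile the BladeDISC source directly within the DeepRec code, which will be more convenient for users.
Currently, DeepRec and BladeDISC cannot be compiled together directly from source; we will refactor to support this later. For now, we need to build the BladeDISC whl package and import blade_disc_tf in user code. For C++ serving scenarios, the serving framework needs to link the generated BladeDISC .so. The steps are as follows.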

## Compiling DeepRec
```bash
sudo nvidia-docker run -it --name=xxx --net=host --gpus all -v /home/workspace:/home/workspace registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-developer:deeprec-dev-gpu-py36-cu110-ubuntu18.04 bash
sudo nvidia-docker run -it --name=deeprec --net=host --gpus all -v /home/workspace:/home/workspace registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-developer:deeprec-dev-gpu-py36-cu110-ubuntu18.04 bash
```
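For detailed build steps, see [https://github.com/alibaba/DeepRec#how-to-build](https://github.com/alibaba/DeepRec#how-to-build). The generated whl package needs to be installed in the docker container used for BladeDISC below.
For detailed build steps, see [DeepRec-Compile-And-Install](https://deeprec.readthedocs.io/zh/latest/DeepRec-Compile-And-Install.html#), which produces the whl package. The DeepRec whl must be installed in the docker container, because compiling BladeDISC depends on the installed DeepRec.

Note: Two docker containers are used here because BladeDISC requires a different bazel version than DeepRec. Both containers use the DeepRec image; the BladeDISC one only needs a newer version of Bazel installed (described in detail below). Alternatively, it is also convenient to compile in a single container by switching bazel versions.
Note: DeepRec and BladeDISC currently require different bazel versions (this is one of the reasons a direct source build is not yet possible; we will upgrade both to the same version later), so below we compile BladeDISC in a virtualenv environment.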

## Compiling BladeDISC
```bash
sudo nvidia-docker run -it --name=xxx --net=host --gpus all -v /home/workspace:/home/workspace registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-developer:deeprec-dev-gpu-py36-cu110-ubuntu18.04 bash
```
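The build steps are as follows:

- Install the generated DeepRec whl package
- Clone the code
- Clone the BladeDISC code
```
git clone https://github.com/alibaba/BladeDISC.git
# -C runs the command inside the cloned repository without changing directory
git -C BladeDISC checkout features/deeprec2208-cu114
git -C BladeDISC submodule update --init --recursive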
```

- Install the build environment (bazel + cmake)
- Install the build environment
```
wget https://github.com/bazelbuild/bazel/releases/download/5.0.0/bazel-5.0.0-installer-linux-x86_64.sh
sh bazel-5.0.0-installer-linux-x86_64.sh
rm -rf /home/pai/bin/bazel
ln -s /usr/local/lib/bazel/bin/bazel /home/pai/bin/bazel
# prepare venv
pip3 install virtualenv
python3 -m virtualenv /opt/venv_disc/
source /opt/venv_disc/bin/activate
# install the whl package built above
pip3 install tensorflow-1.15.5+deeprec2208-cp36-cp36m-linux_x86_64.whl
wget https://github.com/Kitware/CMake/releases/download/v3.20.0/cmake-3.20.0-linux-x86_64.sh
mv cmake-3.20.0-linux-x86_64.sh /tmp/cmake-install.sh
chmod u+x /tmp/cmake-install.sh
mkdir -p /opt/cmake
/tmp/cmake-install.sh --skip-license --prefix=/opt/cmake
export PATH=/opt/cmake/bin:$PATH
# install bazel
cd BladeDISC
apt-get update
bash ./docker/scripts/install-bazel.sh
```

- Compile BladeDISC
```
cd scripts/python
./tao_build.py /home/pai -s configure --bridge-gcc=7.5 --compiler-gcc=7.5
./tao_build.py /home/pai -s build_tao_compiler
./tao_build.py /home/pai -s build_tao_bridge
cd ../..
# configure
./scripts/python/tao_build.py /opt/venv_disc/ --compiler-gcc default --bridge-gcc default -s configure
# generate libtao_ops.so; output path: tao/bazel-bin/libtao_ops.so
./scripts/python/tao_build.py /opt/venv_disc/ -s build_tao_bridge
# generate tao_compiler_main
# output path: tf_community/bazel-bin/tensorflow/compiler/decoupling/tao_compiler_main
./scripts/python/tao_build.py /opt/venv_disc/ -s build_tao_compiler
# generate the disc whl package
cp tf_community/bazel-bin/tensorflow/compiler/decoupling/tao_compiler_main tao/python/blade_disc_tf
cp tao/bazel-bin/libtao_ops.so tao/python/blade_disc_tf
cp tao/bazel-bin/libtao_ops.so tao/python/blade_disc_tf
cd tao
python3 setup.py bdist_wheel
```

The compiled whl package is in the dist directory.

- Install the generated whl package
```
pip install dist/blade_disc_gpu_tf1155-0.1.0-py3-none-any.whl
pip install dist/blade_disc_tf1155-0.2.0-py3-none-any.whl
```

## How to use BladeDISC
## Usage in Python
Add the following code to enable disc:
```
import blade_disc_tf as disc
disc.enable()
```

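## Usage in C++ inference
C++ inference code must link against libtao_ops.so at compile time, and the following two environment variables must be set at run time to enable disc optimization: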
```
export BRIDGE_ENABLE_TAO=true
export TAO_COMPILER_PATH=/path-to/tao_compiler_main
```

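Taking tensorflow_serving as an example, we can either specify the directory containing libtao_ops.so with -L at compile time (-L/xxx/xxx/mylib/), or copy libtao_ops.so to a system lib path (for example, /usr/local/lib/), so that libtao_ops.so can be linked into tensorflow_serving.

Assuming that the compiled libtao_ops.so is located at /xxx/libtao_ops.so, the .so needs some processing, as follows: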
```
apt-get update && apt-get install patchelf
patchelf --remove-needed libtensorflow_framework.so.1 /xxx/libtao_ops.so
# the following prepares for run time
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/xxx/
export LD_LIBRARY_PATH
```

Some environment preparation:
```
apt-get update
apt-get install autotools-dev
apt-get install automake
apt-get install libtool
export TF_CUDA_COMPUTE_CAPABILITIES="7.0,7.5,8.0"
```

Modify tensorflow_serving/model_servers/BUILD in tensorflow_serving as follows:
```
cc_binary(
name = "tensorflow_model_server_main_lib",
...
deps = [
...
"@org_tensorflow//tensorflow/core/platform/hadoop:hadoop_file_system",
"@org_tensorflow//tensorflow/core/platform/s3:s3_file_system",
+ "@org_tensorflow//tensorflow/stream_executor",
+ "@org_tensorflow//tensorflow/stream_executor:stream_executor_impl",
+ "@org_tensorflow//tensorflow/stream_executor:stream_executor_internal",
+ "@org_tensorflow//tensorflow/stream_executor:stream_executor_pimpl",
+ "@org_tensorflow//tensorflow/stream_executor:kernel_spec",
+ "@org_tensorflow//tensorflow/stream_executor:kernel",
+ "@org_tensorflow//tensorflow/stream_executor:scratch_allocator",
+ "@org_tensorflow//tensorflow/stream_executor:timer",
+ "@org_tensorflow//tensorflow/stream_executor/host:host_platform",
+ ],
+ linkopts = [
+ "-ltao_ops -L/xxx/",
+ "-Wl,-no-as-needed",
],
...
```

Because the BUILD file above pulls in stream_executor, whose current default visibility is ":friends" rather than public, we also need to modify the DeepRec file ./tensorflow/stream_executor/BUILD referenced by tensorflow_serving as follows:
```
package(
- default_visibility = [":friends"],
+ default_visibility = ["//visibility:public"],
licenses = ["notice"], # Apache 2.0
)
```

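When BladeDISC is compiled against DeepRec, it uses _GLIBCXX_USE_CXX11_ABI=1 by default, while tensorflow_serving uses _GLIBCXX_USE_CXX11_ABI=0 by default, so the two settings must be made consistent. This document modifies the .bazelrc file of tensorflow_serving as an example: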
```
- build --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0
+ build --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=1
```

Final build command:
```
bazel build -c opt --config=cuda tensorflow_serving/...
```

For details on compiling tensorflow_serving, see [tfs compilation](https://deeprec.readthedocs.io/zh/latest/TFServing-Compile-And-Install.html).
