initial commit
vbezgachev committed Jun 15, 2017
0 parents commit f90c18a
Showing 10 changed files with 835 additions and 0 deletions.
8 changes: 8 additions & 0 deletions .gitignore
@@ -0,0 +1,8 @@
.DS_Store

checkpoints/
data/
__pycache__/
svnh_test_images/
.vscode/
*.pkl
219 changes: 219 additions & 0 deletions HowTo.md
@@ -0,0 +1,219 @@
## Install Docker
- Download DEB package from https://download.docker.com/linux/ubuntu/dists/xenial/pool/stable/amd64/ to, e.g., ~/Downloads
- Install Docker
```
cd ~/Downloads
sudo dpkg -i <package name>.deb
```
- Add the current user to the docker group. This allows you to run the docker command without sudo
```
sudo usermod -a -G docker $USER
```
- Log out and log in again. Open a terminal and type
```
docker version
```
You should see the Docker version information without any permission errors.

## Create docker image
- Change to your working directory
```
cd <path to your working directory>
```
- Clone the sources from GitHub (takes up to 5 minutes depending on your Internet connection)
```
git clone --recurse-submodules https://github.com/tensorflow/serving
```

### CPU build of TensorFlow
- Build the Docker image (takes up to 10 minutes to download all dependencies and build everything)
```
cd serving
docker build --pull -t $USER/tensorflow-serving-devel -f tensorflow_serving/tools/docker/Dockerfile.devel .
```
- Run docker container in interactive mode
```
docker run --name=inception_container -it $USER/tensorflow-serving-devel
```
- Install _vim_ for future use:
```
apt-get update
apt-get install vim
```
- After a system restart, do not execute the run command again. Start the existing Docker container instead:
```
docker start -i inception_container
```
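The run-once, start-afterwards rule above can be sketched as a small helper. This is only an illustration: the function name `container_command` and the idea of passing in the list of existing container names are assumptions, not part of the original instructions.

```python
def container_command(name, image, existing_containers):
    """Return the docker command line (as an argument list) for a container.

    Hypothetical helper: `existing_containers` is assumed to be a list of
    container names that were already created on this machine.
    """
    if name in existing_containers:
        # The container already exists, so re-attach to it instead of
        # creating a second one with the same name.
        return ["docker", "start", "-i", name]
    # First use: create the container from the image and attach to it.
    return ["docker", "run", "--name=" + name, "-it", image]


first_time = container_command(
    "inception_container", "user/tensorflow-serving-devel", [])
after_restart = container_command(
    "inception_container", "user/tensorflow-serving-devel",
    ["inception_container"])
```

The list of existing names could be obtained from `docker ps -a --format "{{.Names}}"`; the image name `user/tensorflow-serving-devel` stands in for `$USER/tensorflow-serving-devel`.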

### GPU build of TensorFlow
**CAUTION**
```
docker build --pull -t $USER/tensorflow-serving-devel-gpu -f tensorflow_serving/tools/docker/Dockerfile.devel-gpu .
```
does not work as-is, see [https://github.com/tensorflow/serving/issues/327](https://github.com/tensorflow/serving/issues/327).

**Workaround**
- Edit tensorflow_serving/tools/docker/Dockerfile.devel-gpu
* In the group of commands that starts with
```
RUN mkdir /usr/lib/x86_64-linux-gnu/include/ && \
```
add a symbolic link for `libcuda.so.1`:
```
ln -s /usr/local/cuda/targets/x86_64-linux/lib/stubs/libcuda.so /usr/lib/x86_64-linux-gnu/libcuda.so.1
```
* Comment out or remove
```
WORKDIR
```
and
```
RUN bazel_build
```
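- Build the image with the `docker build` command shown above, then run the Docker container in interactive mode and change to the contrib directory inside it: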
```
docker run --name=inception_container_gpu -it $USER/tensorflow-serving-devel-gpu
cd /serving/tensorflow/tensorflow/contrib/
```
- Install _vim_:
```
apt-get update
apt-get install vim
```
Now change the sources so that TensorFlow builds with GPU support.
- Edit the BUILD file
```
vim BUILD
```
- Scroll down to the dependency
```
//tensorflow/contrib/nccl:nccl_py
```
and comment it out
- Edit the [nccl](https://github.com/NVIDIA/nccl) sources
* _nccl_manager.h_
```
cd /serving/tensorflow/tensorflow/contrib/nccl/kernels
vim nccl_manager.h
```
Change
```
#include "external/nccl_archive/src/nccl.h"
```
to
```
#include "src/nccl.h"
```
* _nccl_ops.cc_
```
vim nccl_ops.cc
```
Make the same `#include` change as in _nccl_manager.h_
- After a system restart, do not execute the run command again. Start the existing Docker container instead:
```
docker start -i inception_container_gpu
```

## Build and try TensorFlow serving in docker container
The following operations are the same for CPU and GPU builds.
- Clone, configure and build Serving in the container
```
cd ~
git clone --recurse-submodules https://github.com/tensorflow/serving
# Now we can configure TensorFlow
cd serving/tensorflow
./configure
```
Accept all defaults.

- Build TensorFlow serving
```
cd ..
bazel build -c opt tensorflow_serving/...
```
The build takes 30 to 40 minutes.
After a successful build you should be able to execute the following statement:
```
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
```
You should now see the usage documentation.

## Deploy the model
- Download the model
```
curl -O http://download.tensorflow.org/models/image/imagenet/inception-v3-2016-03-01.tar.gz
tar xzf inception-v3-2016-03-01.tar.gz
```
- Export the model
```
bazel-bin/tensorflow_serving/example/inception_export --checkpoint_dir=inception-v3 --export_dir=inception-export
```
- As a result you should see
```
Successfully loaded model from inception-v3/model.ckpt-157585 at step=157585.
Exporting trained model to inception-export/1
Successfully exported model to inception-export
```
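The log shows the model being written to `inception-export/1`. The trailing `1` is a version number: TensorFlow Serving treats numeric subdirectories of the export directory as model versions and, by default, serves the highest one. A minimal sketch of that selection rule, using a hypothetical helper `latest_version` that works on a plain list of directory names:

```python
def latest_version(entries):
    """Return the highest numeric version among directory names, or None.

    Hypothetical helper for illustration only; it mimics the
    "highest version number wins" default, not the server's actual code.
    """
    versions = [int(name) for name in entries if name.isdigit()]
    return max(versions) if versions else None


# Names as they might appear inside the export directory after two exports.
chosen = latest_version(["1", "2", "notes.txt"])
nothing = latest_version(["notes.txt"])
```

Exporting the model again would therefore be expected to produce a new numbered subdirectory rather than overwrite `inception-export/1`.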

## Test the functioning
- Download test image
```
apt-get update
apt-get install wget
wget https://upload.wikimedia.org/wikipedia/en/a/ac/Xiang_Xiang_panda.jpg
```
- Try
```
bazel-bin/tensorflow_serving/example/inception_client --server=localhost:9000 --image=./Xiang_Xiang_panda.jpg
```
- In case of a timeout issue
```
cd bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example
vim inception_client.py
```
Scroll to
```
result = stub.Predict(request, 10.0) # change this value
```
and increase the timeout value (given in seconds)
- Now the output should be
```
outputs {
  key: "classes"
  value {
    dtype: DT_STRING
    tensor_shape {
      dim {
        size: 1
      }
      dim {
        size: 5
      }
    }
    string_val: "giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca"
    string_val: "indri, indris, Indri indri, Indri brevicaudatus"
    string_val: "lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens"
    string_val: "gibbon, Hylobates lar"
    string_val: "sloth bear, Melursus ursinus, Ursus ursinus"
  }
}
outputs {
  key: "scores"
  value {
    dtype: DT_FLOAT
    tensor_shape {
      dim {
        size: 1
      }
      dim {
        size: 5
      }
    }
    float_val: 8.98223686218
    float_val: 5.39600038528
    float_val: 5.00718212128
    float_val: 2.93680524826
    float_val: 2.78477811813
  }
}
```
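In this response the i-th `string_val` under `classes` belongs to the i-th `float_val` under `scores`. A minimal sketch of pairing them, assuming the values have already been copied into plain Python lists (the names `classes`, `scores` and `top_predictions` are hypothetical and not part of the client):

```python
# Values copied by hand from the response shown above.
classes = [
    "giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca",
    "indri, indris, Indri indri, Indri brevicaudatus",
    "lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens",
    "gibbon, Hylobates lar",
    "sloth bear, Melursus ursinus, Ursus ursinus",
]
scores = [8.98223686218, 5.39600038528, 5.00718212128,
          2.93680524826, 2.78477811813]


def top_predictions(classes, scores, k=3):
    """Pair each label with its score and return the k best pairs."""
    ranked = sorted(zip(classes, scores), key=lambda pair: pair[1],
                    reverse=True)
    # Keep only the first synonym of each label for readability.
    return [(label.split(",")[0], score) for label, score in ranked[:k]]


best = top_predictions(classes, scores, k=2)
```

The scores are raw model outputs rather than probabilities, so only their order and relative size should be interpreted.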
58 changes: 58 additions & 0 deletions dataset.py
@@ -0,0 +1,58 @@
import numpy as np

import utils

class Dataset:
    def __init__(self, train, test, val_frac=0.5, shuffle=True, scale_func=None):
        split_idx = int(len(test['y'])*(1 - val_frac))
        self.test_x, self.valid_x = test['X'][:, :, :, :split_idx], test['X'][:, :, :, split_idx:]
        self.test_y, self.valid_y = test['y'][:split_idx], test['y'][split_idx:]
        self.train_x, self.train_y = train['X'], train['y']

        # The SVHN dataset comes with lots of labels, but for the purpose of this exercise,
        # we will pretend that there are only 1000.
        # We use this mask to say which labels we will allow ourselves to use.
        self.label_mask = np.zeros_like(self.train_y)
        self.label_mask[0:1000] = 1

        self.train_x = np.rollaxis(self.train_x, 3)
        self.valid_x = np.rollaxis(self.valid_x, 3)
        self.test_x = np.rollaxis(self.test_x, 3)

        if scale_func is None:
            self.scaler = utils.scale
        else:
            self.scaler = scale_func
        self.train_x = self.scaler(self.train_x)
        self.valid_x = self.scaler(self.valid_x)
        self.test_x = self.scaler(self.test_x)
        self.shuffle = shuffle


    def batches(self, batch_size, dataset, which_set):
        x_name = which_set + "_x"
        y_name = which_set + "_y"

        num_examples = len(getattr(dataset, y_name))
        if self.shuffle:
            idx = np.arange(num_examples)
            np.random.shuffle(idx)
            setattr(dataset, x_name, getattr(dataset, x_name)[idx])
            setattr(dataset, y_name, getattr(dataset, y_name)[idx])
            if which_set == "train":
                dataset.label_mask = dataset.label_mask[idx]

        dataset_x = getattr(dataset, x_name)
        dataset_y = getattr(dataset, y_name)
        for ii in range(0, num_examples, batch_size):
            x = dataset_x[ii:ii+batch_size]
            y = dataset_y[ii:ii+batch_size]

            if which_set == "train":
                # When we use the data for training, we need to include
                # the label mask, so we can pretend we don't have access
                # to some of the labels, as an exercise of our semi-supervised
                # learning ability
                yield x, y, self.label_mask[ii:ii+batch_size]
            else:
                yield x, y
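
The split and masking arithmetic in `Dataset` can be tried on toy data without NumPy. The sketch below is an illustration only: `split_test_validation` and `masked_batches` are hypothetical names, and plain lists stand in for the SVHN arrays.

```python
def split_test_validation(num_examples, val_frac=0.5):
    """Return (test_indices, validation_indices), mirroring the slicing
    done with split_idx in Dataset.__init__."""
    split_idx = int(num_examples * (1 - val_frac))
    return range(0, split_idx), range(split_idx, num_examples)


def masked_batches(labels, label_mask, batch_size):
    """Yield (labels, mask) slices of at most batch_size items, like the
    "train" branch of Dataset.batches (without shuffling)."""
    for start in range(0, len(labels), batch_size):
        yield (labels[start:start + batch_size],
               label_mask[start:start + batch_size])


# Ten toy labels, of which only the first three count as labeled.
labels = list(range(10))
label_mask = [1] * 3 + [0] * 7

test_idx, valid_idx = split_test_validation(len(labels), val_frac=0.5)
toy_batches = list(masked_batches(labels, label_mask, batch_size=4))
```

With ten examples and a batch size of four, the last batch holds only two items, which is also how `batches` behaves when the batch size does not divide the number of examples.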
10 changes: 10 additions & 0 deletions dl_progress.py
@@ -0,0 +1,10 @@
from tqdm import tqdm


class DLProgress(tqdm):
    last_block = 0

    def hook(self, block_num=1, block_size=1, total_size=None):
        self.total = total_size
        self.update((block_num - self.last_block) * block_size)
        self.last_block = block_num
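
The signature of `hook` matches the `reporthook` callback of `urllib.request.urlretrieve`, which is called with the number of blocks transferred so far, the block size in bytes and the total size. A minimal sketch of the arithmetic, using a hypothetical `ProgressRecorder` class as a stand-in for the tqdm bar so that it runs without tqdm or a network connection:

```python
class ProgressRecorder:
    """Hypothetical stand-in for tqdm that only counts bytes."""

    def __init__(self):
        self.total = None
        self.n = 0
        self.last_block = 0

    def update(self, amount):
        # tqdm.update advances the bar by `amount`; here we just add it up.
        self.n += amount

    def hook(self, block_num=1, block_size=1, total_size=None):
        # Same arithmetic as DLProgress.hook: advance by the number of
        # blocks received since the previous call.
        self.total = total_size
        self.update((block_num - self.last_block) * block_size)
        self.last_block = block_num


recorder = ProgressRecorder()
# Simulate the callbacks for a 3000-byte download in 1024-byte blocks:
# one call when the connection opens (block 0), then one per block read.
for block_num in range(0, 4):
    recorder.hook(block_num, 1024, 3000)
```

Because the last block is usually only partly filled, the counted bytes can slightly exceed the reported total, as they do here.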