- Docker is the recommended way to deploy OpenVINO Model Server. Pre-built container images are available on Docker Hub and Red Hat Ecosystem Catalog.
- Host Model Server on bare metal.
- Deploy OpenVINO Model Server in Kubernetes via helm chart, Kubernetes Operator or OpenShift Operator.
This is a step-by-step guide on how to deploy OpenVINO™ Model Server on Linux, using a pre-built Docker container.
Before you start, make sure you have:
- Docker Engine installed
- Intel® Core™ processor (6-13th gen.) or Intel® Xeon® processor (1st to 4th gen.)
- Linux, macOS or Windows via WSL
- (optional) AI accelerators supported by OpenVINO. Accelerators are tested only on bare-metal Linux hosts.
This example shows how to launch the model server with a ResNet50 image classification model downloaded from cloud storage:
Pull an image from Docker Hub or the Red Hat Ecosystem Catalog:
docker pull openvino/model_server:latest
docker pull registry.connect.redhat.com/intel/openvino-model-server:latest
wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1
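After the download, the model files land in a numbered version directory, which is the repository layout the model server expects (one numbered subdirectory per model version):

```text
models/
└── resnet50/
    └── 1/
        ├── resnet50-binary-0001.xml
        └── resnet50-binary-0001.bin
```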
docker run -u $(id -u) -v $(pwd)/models:/models -p 9000:9000 openvino/model_server:latest \
--model_name resnet --model_path /models/resnet50 \
--layout NHWC:NCHW --port 9000
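The `--layout NHWC:NCHW` flag tells the server to accept NHWC tensors from clients and transpose them to the NCHW layout the ResNet50 IR expects. A small NumPy sketch (no server needed) illustrates what that transposition does to the tensor shape:

```python
import numpy as np

# Client-side image tensor in NHWC layout: (batch, height, width, channels).
nhwc = np.zeros((1, 224, 224, 3), dtype=np.float32)

# The server transposes it to NCHW, (batch, channels, height, width),
# which is the layout the model was converted with.
nchw = np.transpose(nhwc, (0, 3, 1, 2))
print(nchw.shape)  # (1, 3, 224, 224)
```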
wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/static/images/zebra.jpeg
wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/python/classes.py
pip3 install ovmsclient
echo 'import numpy as np
from classes import imagenet_classes
from ovmsclient import make_grpc_client
client = make_grpc_client("localhost:9000")
with open("zebra.jpeg", "rb") as f:
img = f.read()
output = client.predict({"0": img}, "resnet")
result_index = np.argmax(output[0])
print(imagenet_classes[result_index])' > predict.py
python3 predict.py
zebra
If everything is set up correctly, you will see the 'zebra' prediction in the output.
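The script prints only the top class name. If you also want a confidence score, the server's raw output can be post-processed client-side; the sketch below uses mock scores (in the real script they come from `client.predict`) and applies a numerically stable softmax:

```python
import numpy as np

# Mock stand-in for the predict() result of this model: a (1, 1000) array
# of class scores, one per ImageNet class.
scores = np.random.default_rng(0).normal(size=(1, 1000)).astype(np.float32)

# Numerically stable softmax: subtract the max before exponentiating.
probs = np.exp(scores - scores.max())
probs /= probs.sum()

top = int(np.argmax(probs))          # index of the most likely class
confidence = float(probs[0, top])    # its softmax probability
print(top, confidence)
```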
It is possible to deploy Model Server outside of a container. To deploy Model Server on bare metal, use the pre-compiled binaries for Ubuntu 20.04, Ubuntu 22.04, Ubuntu 24.04, RHEL 8.10, or RHEL 9.4.
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20-04
Build the binary:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=ubuntu20 PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu20/ovms.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libxml2 curl
Set the path to the libraries:
export LD_LIBRARY_PATH=$(pwd)/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=$(pwd)/ovms/lib/python
sudo apt -y install libpython3.8
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22-04
Download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_ubuntu22.tar.gz
tar -xzvf ovms_ubuntu22.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu22/ovms.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libxml2 curl
Set the path to the libraries:
export LD_LIBRARY_PATH=$(pwd)/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=$(pwd)/ovms/lib/python
sudo apt -y install libpython3.10
:::
:::{tab-item} Ubuntu 24.04
:sync: ubuntu-24-04
Download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_ubuntu22.tar.gz
tar -xzvf ovms_ubuntu22.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/ubuntu22/ovms.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libxml2 curl
Set the path to the libraries:
export LD_LIBRARY_PATH=$(pwd)/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=$(pwd)/ovms/lib/python
sudo apt -y install libpython3.10
:::
:::{tab-item} RHEL 8.10
:sync: rhel-8-10
Download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz
Set the path to the libraries:
export LD_LIBRARY_PATH=$(pwd)/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=$(pwd)/ovms/lib/python
sudo yum install -y python39-libs
:::
:::{tab-item} RHEL 9.4
:sync: rhel-9-4
Download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2024.4/ovms_redhat.tar.gz
tar -xzvf ovms_redhat.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat PYTHON_DISABLE=1 RUN_TESTS=0
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz
Install required libraries:
sudo yum install -y compat-openssl11.x86_64
Set the path to the libraries:
export LD_LIBRARY_PATH=$(pwd)/ovms/lib
If the package was built with Python calculators for MediaPipe graphs (PYTHON_DISABLE=0), also run:
export PYTHONPATH=$(pwd)/ovms/lib/python
sudo yum install -y python39-libs
:::
::::
Start the server:
wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1
./ovms/bin/ovms --model_name resnet --model_path models/resnet50
or start it as a background process or a daemon managed by systemctl/init.d, depending on the Linux distribution and specific hosting requirements.
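For example, a minimal systemd unit might look like the sketch below. The installation path `/opt/ovms`, the `ovms` service user, and the model path are assumptions to adapt to your setup:

```ini
# /etc/systemd/system/ovms.service -- illustrative sketch; paths, user,
# and ports are placeholders, not taken from the Model Server docs.
[Unit]
Description=OpenVINO Model Server
After=network.target

[Service]
Type=simple
User=ovms
Environment=LD_LIBRARY_PATH=/opt/ovms/lib
ExecStart=/opt/ovms/bin/ovms --model_name resnet --model_path /opt/models/resnet50 --port 9000
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

The service would then be enabled with `sudo systemctl enable --now ovms`.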
Most of the Model Server documentation demonstrates container usage, but the same results can be achieved with just the binary package.
Learn more about model server starting parameters.
NOTE: When serving models on AI accelerators, some additional steps may be required to install device drivers and dependencies. Learn more in the Additional Configurations for Hardware documentation.
There are three recommended methods for deploying OpenVINO Model Server in Kubernetes:
- helm chart - deploys Model Server instances using the helm package manager for Kubernetes
- Kubernetes Operator - manages Model Server using a Kubernetes Operator
- OpenShift Operator - manages Model Server instances in Red Hat OpenShift
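With the helm chart, deployment options are passed as chart values. Below is a hedged sketch of an override file; the `model_name` and `model_path` keys are assumptions here, so check the chart's `values.yaml` for the authoritative names:

```yaml
# values-resnet.yaml -- illustrative only; key names and the bucket path
# are placeholders, not taken from the chart itself.
model_name: resnet
model_path: gs://<your-bucket>/models/resnet50
```

It would then be applied with something like `helm install ovms-app <path-to-chart> -f values-resnet.yaml`.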
For the Kubernetes and OpenShift Operators, see the description of the deployment process.
- Start the server
- Try the model server features
- Explore the model server demos