
OpenVINO™ Model Server

Model Server hosts models and makes them accessible to software components over standard network protocols: a client sends a request to the model server, which performs model inference and sends a response back to the client. Model Server offers many advantages for efficient model deployment:

  • Remote inference enables using lightweight clients with only the necessary functions to perform API calls to edge or cloud deployments.
  • Applications are independent of the model framework, hardware device, and infrastructure.
  • Client applications in any programming language that supports REST or gRPC calls can be used to run inference remotely on the model server.
  • Clients require fewer updates since client libraries change very rarely.
  • Model topology and weights are not exposed directly to client applications, making it easier to control access to the model.
  • Ideal architecture for microservices-based applications and deployments in cloud environments – including Kubernetes and OpenShift clusters.
  • Efficient resource utilization with horizontal and vertical inference scaling.

(Figure: OVMS deployment diagram)

OpenVINO™ Model Server (OVMS) is a high-performance system for serving models. It is implemented in C++ for scalability and optimized for deployment on Intel architectures. It exposes the same APIs as TensorFlow Serving and KServe while using OpenVINO for inference execution. The inference service is provided over gRPC or REST, making it easy to deploy new algorithms and AI experiments.
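
As a rough illustration of what a client looks like, the sketch below sends a KServe v2 REST inference request with a plain HTTP library. The host, port, model name, input tensor name, and shape are assumptions that depend on how the server was started and which model it serves.

```python
# Minimal sketch of a KServe v2 REST inference call against a running
# model server; endpoint, model name, input name, and shape are assumed.
import requests

payload = {
    "inputs": [
        {
            "name": "input",             # input tensor name exposed by the model
            "shape": [1, 3, 224, 224],   # batch of one 224x224 RGB image
            "datatype": "FP32",
            "data": [0.0] * (3 * 224 * 224),  # placeholder pixel data
        }
    ]
}

response = requests.post(
    "http://localhost:8000/v2/models/resnet/infer",  # assumed REST port and model name
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json()["outputs"][0]["shape"])
```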

In addition, it includes endpoints for generative use cases compatible with the OpenAI API and the Cohere API.
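
Because these endpoints follow the OpenAI API, existing OpenAI client libraries can typically be pointed at the model server. The snippet below is a sketch under assumptions: the base URL prefix and the served model name depend on the actual deployment configuration.

```python
# Sketch of calling an OpenAI-compatible chat completions endpoint served
# by the model server; base_url and model name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v3",  # assumed server address and API prefix
    api_key="unused",                     # a local deployment does not require a key
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # name of the served LLM (assumed)
    messages=[{"role": "user", "content": "What is OpenVINO Model Server?"}],
    max_tokens=128,
)
print(completion.choices[0].message.content)
```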

The models used by the server need to be stored locally or hosted remotely by object storage services. For more details, refer to the Preparing Model Repository documentation. Model Server runs inside Docker containers, on bare metal, and in Kubernetes environments. Start using OpenVINO Model Server with a fast-forward serving example from the QuickStart guide or the LLM QuickStart guide.
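
As a rough sketch of what the model repository looks like on disk (the authoritative layout and supported formats are described in the Preparing Model Repository documentation; the directory and model names below are made up), each model lives in its own directory with numbered version subdirectories:

```python
# Illustrative check of a local model repository layout; names are assumptions.
#
# models/
# └── resnet/            <- model name used by clients
#     └── 1/             <- numeric version directory
#         ├── model.xml  <- OpenVINO IR topology
#         └── model.bin  <- OpenVINO IR weights
from pathlib import Path

version_dir = Path("models/resnet/1")
for required in ("model.xml", "model.bin"):
    if not (version_dir / required).is_file():
        raise FileNotFoundError(f"expected {required} in {version_dir}")
print("model repository layout looks OK")
```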

Read the release notes to find out what’s new.

Key features:

Check the full list of features in the documentation.

Note: OVMS has been tested on Red Hat, Ubuntu, and Windows. Public Docker images are available in public container registries.

Run OpenVINO Model Server

A demonstration of how to use OpenVINO Model Server can be found in our quick-start guides for the vision use case and LLM text generation.

Also check these instructions:

Preparing model repository

Deployment

Writing client code

Demos

References

Contact

If you have a question, a feature request, or a bug report, feel free to submit a GitHub issue.


* Other names and brands may be claimed as the property of others.