Code that makes running model inference easy across different backends: you don't have to worry about input/output names, dtypes, or batch size. ONNX and Triton Inference Server backends are currently available.
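For context, this is the kind of boilerplate the wrapper is meant to remove. With plain ONNX Runtime you have to look up the input name, dtype, and shape yourself before every call (the model path and shapes below are illustrative, not taken from this repo):

```python
import numpy as np
import onnxruntime as ort

# Without a wrapper, the input name, dtype and shape must be discovered manually.
sess = ort.InferenceSession("resnet18.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
print(inp.name, inp.type, inp.shape)  # e.g. "input", "tensor(float)", [batch, 3, 224, 224]

batch = np.random.rand(4, 3, 224, 224).astype(np.float32)
outputs = sess.run(None, {inp.name: batch})
print(outputs[0].shape)
```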
- Run `run_env.sh` to build the docker image and start the docker container.
- Inside the docker environment, from the `example` directory, run `python3.8 prepare_model.py` to export resnet18 to ONNX with dynamic and static batch sizes (see the export sketch after this list).
- Outside the docker environment, from the `example` directory, run `run_triton.sh` to start Triton Inference Server with the two exported models (a readiness-check sketch also follows).
- Inside the docker environment, from the working directory, run `python3.8 main.py` to send a batch of size 12 to models whose maximum batch size is 8, in both Triton and ONNX formats (a minimal client sketch closes this section).
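A minimal sketch of what an export like `prepare_model.py` does with standard `torch.onnx.export`; the file names, tensor names, and opset version are assumptions, not taken from the script:

```python
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()  # pretrained=False on older torchvision
dummy = torch.randn(1, 3, 224, 224)

# Static batch size: the exported graph is fixed to the dummy input's batch dimension.
torch.onnx.export(
    model, dummy, "resnet18_static.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=13,
)

# Dynamic batch size: mark axis 0 of the input and output as variable.
torch.onnx.export(
    model, dummy, "resnet18_dynamic.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=13,
)
```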
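Once `run_triton.sh` has started the server, you can check that it is up and that both models loaded with the official `tritonclient` package; the model names below are assumptions and should match whatever the model repository defines:

```python
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
assert client.is_server_ready()

# Model names are assumptions; use the names from run_triton.sh / the model repository.
for model_name in ("resnet18_static", "resnet18_dynamic"):
    print(model_name, "ready:", client.is_model_ready(model_name))
```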
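Since the request batch (12) is larger than the models' maximum batch size (8), a client like `main.py` has to split it into chunks. A sketch of that pattern against the Triton HTTP endpoint, assuming hypothetical `input`/`output` tensor names and a `resnet18_dynamic` model name:

```python
import numpy as np
import tritonclient.http as httpclient

MAX_BATCH = 8  # maximum batch size of the deployed models

client = httpclient.InferenceServerClient(url="localhost:8000")
batch = np.random.rand(12, 3, 224, 224).astype(np.float32)  # the batch of 12

chunks = []
for start in range(0, batch.shape[0], MAX_BATCH):
    chunk = batch[start:start + MAX_BATCH]
    # Tensor and model names are assumptions; adapt them to the exported models.
    infer_input = httpclient.InferInput("input", list(chunk.shape), "FP32")
    infer_input.set_data_from_numpy(chunk)
    result = client.infer("resnet18_dynamic", inputs=[infer_input])
    chunks.append(result.as_numpy("output"))

logits = np.concatenate(chunks, axis=0)  # back to a single (12, num_classes) array
```

The same chunking applies to the ONNX Runtime path; only the call that runs the chunk changes.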