# dynamic_batcher

`dynamic_batcher` is designed for serving DL model inference on GPU: it batches incoming requests and enforces a limit on the model's concurrency.
Install the package (plus the test requirements if you want to run the e2e demo below):

```sh
pip install dynamic_batcher
pip install -r requirements-test.txt
```
To run the end-to-end test, start each component:

- redis

  ```sh
  docker run --rm -p 6379:6379 -e ALLOW_EMPTY_PASSWORD=yes bitnami/redis:latest
  ```
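  Optionally, verify the container is reachable before starting the other components. This is only a convenience check with the `redis-py` client (installing `redis` separately is an assumption here, not a documented step):

  ```python
  import redis

  # Host/port match the docker run command above
  r = redis.Redis(host="localhost", port=6379)
  assert r.ping()  # returns True when the container is up; raises ConnectionError otherwise
  ```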
- app (ENV: `REDIS__HOST=localhost`, `REDIS__PORT=6379`)

  ```sh
  gunicorn e2e.app.main:app \
    -k uvicorn.workers.UvicornWorker \
    --workers=4
  ```
- batcher (ENV: `REDIS__HOST=localhost`, `REDIS__PORT=6379`, `DYNAMIC_BATCHER__BATCH_SIZE=64`, `DYNAMIC_BATCHER__BATCH_TIME=2`)

  ```sh
  python e2e/batcher/run.py
  ```
-
locust
RUN:locust -f e2e/locust/locustfile.py
- swagger: http://localhost:8000
- POST `/items/batch/{item_id}`

  ```sh
  curl -X POST http://localhost:8000/items/batch/1 \
    -H 'Content-Type: application/json' \
    -d '{
      "content": "string",
      "nested": {
        "key": "string",
        "values": [1, 5, 2]
      }
    }'
  ```

  Response:

  ```json
  {
    "data": {
      "content": "string",
      "nested": {
        "key": "string",
        "values": [1, 5, 2],
        "result": [2, 6, 3]
      },
      "name": "b0878740-47a8-4dd7-bfe8-9c5ed1fee4ea"
    },
    "elapsed_time": 2.551218032836914
  }
  ```
- locust: http://localhost:8089
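The repository's `e2e/locust/locustfile.py` is not reproduced here; a minimal equivalent that drives the batch endpoint could look like the following (the class and task names are illustrative, and the payload mirrors the curl example):

```python
from locust import HttpUser, task

class BatchUser(HttpUser):
    host = "http://localhost:8000"

    @task
    def post_batch_item(self):
        # Same payload shape as the curl example above
        self.client.post(
            "/items/batch/1",
            json={"content": "string", "nested": {"key": "string", "values": [1, 5, 2]}},
        )
```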
When `DYNAMIC_BATCHER__BATCH_SIZE=64` and `DYNAMIC_BATCHER__BATCH_TIME=2` are set, a running `BatchProcessor` delays each batch until it has collected the configured number of requests (64), waiting at most `batch_time` (2 seconds). If the time runs out first, the partial batch collected so far is processed.
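The snippet below is only an illustrative sketch of this size-or-timeout rule, not `dynamic_batcher`'s actual implementation (which coordinates requests through Redis):

```python
import queue
import time

def collect_batch(q: queue.Queue, batch_size: int = 64, batch_time: float = 2.0) -> list:
    """Gather up to batch_size items, waiting at most batch_time seconds overall."""
    batch = []
    deadline = time.monotonic() + batch_time
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # time is up: run with the partial batch collected so far
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break  # nothing arrived before the deadline
    return batch
```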
- Launch

  Starting the test daemon logs its configuration:

  ```
  BatchProcessor start: delay=0.001, batch_size=64, batch_time=2
  ```

  Each `batch start` line below reads `<seconds waited>/<batch_time>, <requests collected>/<batch_size>`.
- Single request (concurrency=1)

  ```
  batch start: 2.001/2, 1/64
  batch start: 2.001/2, 1/64
  ...
  ```

  Each batch waits out the full 2-second window and runs with a single request.
- Concurrent requests (concurrency=100)

  ```
  batch start: 1.653/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  batch start: 0.064/2, 64/64
  ...
  batch start: 2.001/2, 36/64
  ```

  Under load, batches fill long before the 2-second window expires; the final partial batch (36/64) is flushed when the timer runs out.