Calls to the Google Cloud Vision API through the `google-cloud-vision` Python module do not work within certain distributed task setups. The issue is not limited to Celery workers, but this repository demonstrates how a Celery project cannot make calls to the GCV API using the `google-cloud-vision` module.
On first launching this Celery worker, it should make a successful call to Google Cloud Vision and log a list of returned labels for `cat.jpg`.
Then, whenever the Enqueuer is launched, the Celery worker should execute `add(3,4)` and make another API request to Google Cloud Vision for the labels of `cat.jpg`.
The first call to Google Cloud Vision from within the startup script succeeds.
Once the Enqueuer is launched, the worker's first call to `add(3,4)` succeeds as well. However, as soon as the Celery worker calls Google Cloud Vision's API from within a Celery task, the entire process hangs permanently without producing any log output.
If you don't already have one, create a Google Cloud Platform project at https://console.cloud.google.com/. Then, navigate to the project's dashboard -- you can change projects using the dropdown navigation at the top of the window.
Next, create an API credentials file for the project that will allow it to connect to Google Cloud Vision:
- On the left navigate to APIs & Services > Credentials.
- Click "Create Credentials" at the top.
- Select "Service Account Key" and create a JSON key. This may require the creation of a service account that is capable of interacting with Google Cloud Vision.
- Download the JSON API credentials file and place it in this project's root directory as `google_creds.json`.
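
How the repository itself loads this key is not covered here; the snippet below is only a quick local sanity check (not part of this repo) that the downloaded key can authenticate a label-detection request. It assumes `google-cloud-vision` 2.x or later and that `cat.jpg` and `google_creds.json` are in the current directory.

```python
# Quick sanity check (not part of this repo) that the downloaded key works.
# Assumes google-cloud-vision >= 2.0 (older releases expose vision.types.Image
# instead of vision.Image).
from google.cloud import vision

client = vision.ImageAnnotatorClient.from_service_account_json("google_creds.json")

with open("cat.jpg", "rb") as f:
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, label.score)
```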
From the command line, you can build and start the Celery worker by running:

```
bash launch-worker.sh
```
(Optional) You can monitor the worker's logs using:

```
docker logs -f celery_worker
```
The `launch-worker.sh` script will:
- Create a Docker virtual network for this project.
- Download and launch a Docker container running RabbitMQ, which acts as the message broker that dispatches tasks to the Celery worker.
- Build and launch the Celery worker Docker container according to `docker/Dockerfile_worker`.
The worker itself is specified in `celery_project/worker.py`. When the worker is launched, it sets up the following functions:
- A local function called `gcv_label_cat()` that can only be called internally. The function uploads `cat.jpg` to Google Cloud Vision's image annotation service and prints the returned labels.
- A Celery task called `add(x,y)`, which will return x+y.
- A Celery task called `gcv_task()`, which simply calls `gcv_label_cat()` directly.
Launching the worker also runs an initial call to `gcv_label_cat()`, which works without issue, as can be seen in the logs (a list of cat-related labels is visible).
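
For reference, a minimal sketch of this layout is shown below. The real implementation lives in `celery_project/worker.py`; the broker URL, result backend, and the way the startup call is triggered are assumptions here, not taken from the repository.

```python
# Minimal sketch of the worker layout described above. Broker URL, result
# backend, and the startup-call mechanism are assumptions.
from celery import Celery
from google.cloud import vision

app = Celery(
    "worker",
    broker="amqp://guest:guest@rabbitmq:5672//",  # assumed RabbitMQ address
    backend="rpc://",                             # assumed result backend
)


def gcv_label_cat():
    """Plain helper (not a Celery task): label cat.jpg and print the results."""
    client = vision.ImageAnnotatorClient.from_service_account_json("google_creds.json")
    with open("cat.jpg", "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    print([label.description for label in response.label_annotations])


@app.task
def add(x, y):
    return x + y


@app.task
def gcv_task():
    # This is the call that hangs once it runs inside a Celery task.
    gcv_label_cat()


# The initial call that succeeds at worker startup; how the repo actually
# triggers it (module import, a signal handler, etc.) is an assumption.
gcv_label_cat()
```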
Running the following will enqueue two tasks:

```
bash enqueue.sh
```
First, a call to `add(3,4)` will be enqueued to RabbitMQ. The script will briefly wait until the Celery worker sends the result (the sum, 7) back via RabbitMQ.
Then, the enqueuer will enqueue a call to `gcv_task()`.
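
The enqueuer side might look roughly like the sketch below; the task names, broker URL, result backend, and timeout are assumptions, not taken from this repository.

```python
# Rough sketch of the enqueuer launched by enqueue.sh. Task names, broker
# URL, backend, and timeout are assumptions.
from celery import Celery

app = Celery(
    "enqueuer",
    broker="amqp://guest:guest@rabbitmq:5672//",
    backend="rpc://",
)

# Enqueue add(3, 4) by name and block until the worker returns the sum (7).
result = app.send_task("celery_project.worker.add", args=(3, 4))
print("add(3, 4) ->", result.get(timeout=30))

# Enqueue the Google Cloud Vision task; with the bug described above, the
# worker hangs while handling this one.
app.send_task("celery_project.worker.gcv_task")
```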
The enqueueing script's logs can be viewed by running the following, which will update as the enqueuer creates tasks and receives results:

```
docker logs -f celery_enqueue
```
All Docker containers, images, and networks pertaining to this project can be deleted by running:

```
bash cleanup.sh
```