
Merge branch 'master' into users/dan/batch_postprocessing
# Conflicts:
#	visualization/visualization_utils.py
agentmorris committed May 12, 2019
2 parents 375c472 + 6e4c2ff commit bcddfa1
Showing 13 changed files with 523 additions and 89 deletions.
32 changes: 22 additions & 10 deletions api/detector_batch_processing/README.md
@@ -57,23 +57,24 @@ Not yet supported. Meanwhile, once the shards of images are submitted for proces

### Inputs

-| Parameter | Is required | Explanation |
-|--------------------------|-------------|-------------------------------------------------------------------------------------------------------------------------------|
-| input_container_sas | Yes | SAS URL with list and read permissions to the Blob Storage container where the images are stored. |
-| images_required_json_sas | No | SAS URL with list and read permissions to a json file in Blob Storage. The json contains a list, where each item (a string) in the list is the full path to an image from the root of the container. An example of the content of this file: `["Season1/Location1/Camera1/image1.jpg", "Season1/Location1/Camera1/image2.jpg"]`. Only images whose paths are listed here will be processed. |
-| image_path_prefix | No | Only process images whose full path starts with `image_path_prefix`. Note that any image paths specified in `images_required_json_sas` will need to be the full path from the root of the container, regardless of `image_path_prefix`. |
-| first_n | No | Only process the first `first_n` images. Order of images is not guaranteed, but is likely to be alphabetical. Set this to a small number to avoid taking time to fully list all images in the blob (about 15 minutes for 1 million images) if you just want to try this API. |
-| sample_n (not yet implemented) | No | Randomly sample `sample_n` images to process. |
+| Parameter | Is required | Type | Explanation |
+|--------------------------|-------------|-------|-------------------------------------------------------------------------------------------------------------------------------|
+| input_container_sas | Yes | string | SAS URL with list and read permissions to the Blob Storage container where the images are stored. |
+| images_requested_json_sas | No | string | SAS URL with list and read permissions to a json file in Blob Storage. The json contains a list, where each item (a string) in the list is the full path to an image from the root of the container. An example of the content of this file: `["Season1/Location1/Camera1/image1.jpg", "Season1/Location1/Camera1/image2.jpg"]`. Only images whose paths are listed here will be processed. |
+| image_path_prefix | No | string | Only process images whose full path starts with `image_path_prefix` (case-_sensitive_). Note that any image paths specified in `images_requested_json_sas` will need to be the full path from the root of the container, regardless of whether `image_path_prefix` is provided. |
+| first_n | No | int | Only process the first `first_n` images. Order of images is not guaranteed, but is likely to be alphabetical. Set this to a small number to avoid taking time to fully list all images in the blob (about 15 minutes for 1 million images) if you just want to try this API. |
+| sample_n | No | int | Randomly select `sample_n` images to process. |


- We assume that all images you would like to process in this batch are uploaded to a container in Azure Blob Storage.
- Only images with file names ending in '.jpg' or '.jpeg' (case insensitive) will be processed, so please make sure the file names are compliant before you upload them to the container (you cannot rename a blob without copying it entirely once it is in Blob Storage).
- The path to the images in blob storage cannot contain commas (this would confuse the output CSV).

- By default we process all such images in the specified container. You can choose to process only a subset of them by specifying the other input parameters; the images will be filtered accordingly, in this order:
- `images_requested_json_sas`
- `image_path_prefix`
- `first_n`
-- `sample_n` (not yet implemented)
+- `sample_n`

- For example, if you specified both `images_requested_json_sas` and `first_n`, only images that are in your provided list at `images_requested_json_sas` will be considered, and then we process the `first_n` of those.

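The filtering order described above can be sketched in Python. This is an illustrative mock-up only — the function and parameter names are hypothetical, not the API's actual implementation:

```python
import random

def filter_image_paths(all_paths, images_requested=None, image_path_prefix=None,
                       first_n=None, sample_n=None):
    """Apply the four optional filters in the documented order (hypothetical helper)."""
    paths = list(all_paths)
    # 1. images_requested_json_sas: keep only explicitly listed full paths
    if images_requested is not None:
        requested = set(images_requested)
        paths = [p for p in paths if p in requested]
    # 2. image_path_prefix: case-sensitive match from the container root
    if image_path_prefix is not None:
        paths = [p for p in paths if p.startswith(image_path_prefix)]
    # 3. first_n: truncate; listing order is roughly alphabetical
    if first_n is not None:
        paths = paths[:first_n]
    # 4. sample_n: random subset of whatever survived the earlier filters
    if sample_n is not None:
        paths = sorted(random.sample(paths, min(sample_n, len(paths))))
    return paths
```

For instance, combining `image_path_prefix="Season1"` with `first_n=1` keeps only the first `Season1` image, reflecting the `images_requested_json_sas` → `image_path_prefix` → `first_n` → `sample_n` precedence.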
@@ -151,12 +152,23 @@ The second column is the confidence value of the most confident detection on the
The third column contains details of the detections so you can visualize them. It is a stringified json of a list of lists, representing the detections made on that image. Each detection list has the coordinates of the bounding box surrounding the detection, followed by its confidence:

```
-[ymin, xmin, ymax, xmax, confidence]
+[ymin, xmin, ymax, xmax, confidence, (class)]
```

where `(xmin, ymin)` is the upper-left corner of the detection bounding box. The coordinates are relative to the height and width of the image.

-When the detector model detects no animal, the confidence is shown as 0.0 (not confident that there is an animal) and the detection column is an empty list.
+An integer `class` comes after `confidence` in versions of the API that use MegaDetector version 3 or later. The `class` label corresponds to the following:
+
+```
+1: animal
+2: person
+4: vehicle
+```
+
+Note that the `vehicle` class (available in MegaDetector version 4 or later) is number 4. Class number 3 (group) was not included in training and should be ignored (as should any other class label not listed here) if it shows up in the results.
+
+When the detector model detects no animal (or person or vehicle), the confidence is shown as 0.0 (not confident that there is an object of interest) and the detection column is an empty list.
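A hedged sketch of consuming this output format — parsing the stringified JSON column and converting one box to pixel coordinates. The helper names here are illustrative, not part of the API:

```python
import json

def parse_detections(cell):
    """Parse the detections CSV column; an empty cell means no confident detections."""
    return json.loads(cell) if cell else []

def to_pixel_box(detection, image_width, image_height):
    """Convert relative [ymin, xmin, ymax, xmax, ...] to pixel (xmin, ymin, xmax, ymax)."""
    ymin, xmin, ymax, xmax = detection[:4]
    return (round(xmin * image_width), round(ymin * image_height),
            round(xmax * image_width), round(ymax * image_height))

for det in parse_detections('[[0.1, 0.2, 0.5, 0.6, 0.97, 1]]'):
    confidence = det[4]
    category = det[5] if len(det) > 5 else None  # class id present for MegaDetector v3+
    print(to_pixel_box(det, 1920, 1080), confidence, category)
```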


## Post-processing tools

6 changes: 4 additions & 2 deletions api/detector_batch_processing/api/Dockerfile
@@ -3,8 +3,10 @@ FROM ai4eregistry.azurecr.io/1.0-base-py-ubuntu16.04:latest
RUN echo "source activate ai4e_py_api" >> ~/.bashrc \
&& conda install -c conda-forge -n ai4e_py_api numpy pandas

+RUN pip install --upgrade pip
+
# Azure blob packages should already be installed in the base image. Just need to install Azure ML SDK
-RUN pip install --upgrade azureml-sdk
+RUN pip install azureml-sdk==1.0.33

# Note: supervisor.conf reflects the location and name of your api code.
# If the default (./my_api/runserver.py) is renamed, you must change supervisor.conf
@@ -35,7 +37,7 @@ ENV SERVICE_OWNER=AI4E_Test \
SERVICE_MODEL_FRAMEOWRK_VERSION=3.6.6 \
SERVICE_MODEL_VERSION=1.0

-ENV API_PREFIX=/v1/camera-trap/detection-batch
+ENV API_PREFIX=/v2/camera-trap/detection-batch

ENV AZUREML_PASSWORD=

10 changes: 5 additions & 5 deletions api/detector_batch_processing/api/README.md
@@ -36,7 +36,7 @@ You can do this at the command line (where Azure CLI is installed) of the VM whe
```
az acr login --name ai4eregistry
```
-You need to have the subscription where this registry is set as the default subscription.
+You need to have the subscription where this registry is set as the default subscription, and you may need to use `sudo` with this command.


### Step 3. Build the Docker container
@@ -45,7 +45,7 @@ Now you're all set to build the container.
Navigate to the current directory (`detector_batch_processing/api`) where the `Dockerfile` is.

```
-docker build . -t name.azurecr.io/camera-trap-detection-sync:2
+docker build . -t name.azurecr.io/camera-trap-detection-batch-v3:1
```

You can supply your own tag (`-t` option) and build number. You may need to use `sudo` with this command.
@@ -55,7 +55,7 @@ You can supply your own tag (`-t` option) and build number. You may need to use
To launch the service, in a `tmux` session, issue:

```
-docker run -p 6000:80 name.azurecr.io/camera-trap-detection-batch:2 |& tee -a camera-trap-api-async-log/log20190415.txt
+docker run -p 6000:80 name.azurecr.io/camera-trap-detection-batch-v3:1 |& tee -a camera-trap-api-async-log/log20190415.txt
```

Substitute the tag of the image you built in the last step (or that of a pre-built one), the port you'd like to expose the API at (6000 above), and specify the location to store the log messages (printed to console too).
@@ -65,6 +65,6 @@ You may need to use `sudo` with this command.

## Work items

-- [ ] Rename `aml_config_scripts` to `aml_scripts` now that the cluster config file is no longer used
+- [x] Rename `aml_config_scripts` to `aml_scripts` now that the cluster config file is no longer used

-- [ ] Make use of Key Vault to access crendentials
+- [ ] Make use of Key Vault to access credentials
---
@@ -1,4 +1,4 @@
-print('score.py, beginning - NEW')
+print('score.py, beginning - megadetector_v3')

import argparse
import csv
---
@@ -124,10 +124,11 @@ def generate_detections_batch(self, images, image_ids, batch_size, detection_thr
boxes, scores, classes = b_box[i], b_score[i], b_class[i]
detections_cur_image = [] # will be empty for an image with no confident detections

-for b, s in zip(boxes, scores):
+for b, s, c in zip(boxes, scores, classes):
    if s > detection_threshold:
        li = TFDetector.convert_numpy_floats(b)
        li.append(float(s))
+       li.append(int(c))
detections_cur_image.append(li)

detections.append(detections_cur_image)
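The per-detection thresholding in this hunk can be illustrated standalone, with plain lists in place of the detector's numpy outputs (a sketch, not the repository's code):

```python
def filter_detections(boxes, scores, classes, detection_threshold):
    # Empty result for an image with no confident detections
    detections_cur_image = []
    for b, s, c in zip(boxes, scores, classes):
        if s > detection_threshold:
            li = [float(x) for x in b]  # [ymin, xmin, ymax, xmax]
            li.append(float(s))         # confidence
            li.append(int(c))           # class id (1 animal, 2 person, 4 vehicle)
            detections_cur_image.append(li)
    return detections_cur_image
```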
14 changes: 7 additions & 7 deletions api/detector_batch_processing/api/orchestrator_api/api_config.py
@@ -1,15 +1,15 @@
# version of the detector model in use
-MODEL_VERSION = 'models/object_detection/faster_rcnn_inception_resnet_v2_atrous/megadetector'
+MODEL_VERSION = 'models/object_detection/faster_rcnn_inception_resnet_v2_atrous/megadetector_v3/step_686872'

# name of the container in the internal storage account to store user facing files:
# image list, detection results and failed images list.
-INTERNAL_CONTAINER = 'async-api-v2'
+INTERNAL_CONTAINER = 'async-api-v3-2'

# name of the container in the internal storage account to store outputs of each AML job
-AML_CONTAINER = 'aml-out'
+AML_CONTAINER = 'aml-out-2'

# how often does the checking thread wake up to check if all jobs are done
-MONITOR_PERIOD_MINUTES = 30
+MONITOR_PERIOD_MINUTES = 15

# if this number of times the thread wakes up to check is exceeded, stop the monitoring thread
MAX_MONITOR_CYCLES = 14 * 48 # 2 weeks, 30-minute interval
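A quick sanity check on the two settings above: with the period halved to 15 minutes but the cycle count left at `14 * 48`, the total monitored window works out to one week, not the two weeks the inline comment describes (worth double-checking if two weeks is still the intent):

```python
MONITOR_PERIOD_MINUTES = 15   # new value in this diff
MAX_MONITOR_CYCLES = 14 * 48  # unchanged; comment still says "2 weeks, 30-minute interval"

total_days = MAX_MONITOR_CYCLES * MONITOR_PERIOD_MINUTES / (60 * 24)
print(total_days)  # 7.0
```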
@@ -31,12 +31,12 @@
'subscription_id': '74d91980-e5b4-4fd9-adb6-263b8f90ec5b',
'workspace_region': 'eastus',
'resource_group': 'camera_trap_api_rg',
-'workspace_name': 'camera_trap_aml_workspace',
+'workspace_name': 'camera_trap_aml_workspace_2',
'aml_compute_name': 'camera-trap-com',

-'model_name': 'megadetector',
+'model_name': 'megadetector_v3_tf19',

-'source_dir': '/app/orchestrator_api/aml_config_scripts',
+'source_dir': '/app/orchestrator_api/aml_scripts',
'script_name': 'score.py',

'param_batch_size': 8,
---
@@ -1,9 +1,11 @@
import copy
import io
import os
import pickle
from collections import defaultdict
from datetime import datetime, timedelta

import azureml.core
import pandas as pd
from azure.storage.blob import BlockBlobService, BlobPermissions
from azureml.core import Workspace, Experiment
@@ -19,6 +21,9 @@
import api_config
from sas_blob_utils import SasBlob

print('Version of AML: {}'.format(azureml.core.__version__))


# Service principle authentication for AML
svc_pr_password = os.environ.get('AZUREML_PASSWORD')
svc_pr = ServicePrincipalAuthentication(
@@ -97,6 +102,7 @@ def __init__(self, request_id, input_container_sas, internal_datastore):

batch_score_step = PythonScriptStep(aml_config['script_name'],
source_directory=aml_config['source_dir'],
hash_paths= ['.'], # include all contents of source_directory
name='batch_scoring',
arguments=['--job_id', param_job_id,
'--model_name', aml_config['model_name'],
@@ -272,7 +278,7 @@ def _generate_urls_for_outputs(self):
sas = self.internal_storage_service.generate_blob_shared_access_signature(
self.internal_container, blob_path, permission=BlobPermissions.READ, expiry=expiry
)
-url = self.internal_storage_service.make_blob_url('async-api-v2', blob_path, sas_token=sas)
+url = self.internal_storage_service.make_blob_url(self.internal_container, blob_path, sas_token=sas)
urls[output] = url
return urls
except Exception as e:
39 changes: 31 additions & 8 deletions api/detector_batch_processing/api/orchestrator_api/runserver.py
@@ -7,6 +7,7 @@
import os
import time
from datetime import datetime
from random import shuffle

from ai4e_app_insights import AppInsights
from ai4e_app_insights_wrapper import AI4EAppInsights
@@ -23,6 +24,7 @@
print('Creating application')

api_prefix = os.getenv('API_PREFIX')
print('API prefix: ', api_prefix)
app = Flask(__name__)
api = Api(app)

@@ -105,7 +107,7 @@ def _request_detections(**kwargs):

input_container_sas = body['input_container_sas']
images_requested_json_sas = body.get('images_requested_json_sas', None)
-image_path_prefix = body.get('image_path_prefix', '')
+image_path_prefix = body.get('image_path_prefix', None)

first_n = body.get('first_n', None)
first_n = int(first_n) if first_n else None
@@ -120,30 +122,48 @@
print('runserver.py, running - listing all images to process.')

# list all images to process
blob_prefix = None if image_path_prefix is None else image_path_prefix
image_paths = SasBlob.list_blobs_in_container(api_config.MAX_NUMBER_IMAGES_ACCEPTED,
sas_uri=input_container_sas,
-blob_prefix=image_path_prefix, blob_suffix='.jpg')
+blob_prefix=blob_prefix, blob_suffix='.jpg')
else:
print('runserver.py, running - using provided list of images.')
image_paths_text = SasBlob.download_blob_to_text(images_requested_json_sas)
image_paths = json.loads(image_paths_text)
print('runserver.py, length of image_paths provided by the user: {}'.format(len(image_paths)))

image_paths = [i for i in image_paths if str(i).lower().endswith(api_config.ACCEPTED_IMAGE_FILE_ENDINGS)]
-print('runserver.py, length of image_paths provided by the user, after filtering to jpg: {}'.format(len(image_paths)))
+print('runserver.py, length of image_paths provided by the user, after filtering to jpg: {}'.format(
+    len(image_paths)))

if image_path_prefix is not None:
image_paths =[i for i in image_paths if str(i).startswith(image_path_prefix)]
print('runserver.py, length of image_paths provided by the user, after filtering for image_path_prefix: {}'.format(
len(image_paths)))

res = orchestrator.spot_check_blob_paths_exist(image_paths, input_container_sas)
if res is not None:
-raise LookupError('failed - path {} provided in list of images to process does not exist in the container pointed to by data_container_sas.'.format(res))
+raise LookupError('path {} provided in list of images to process does not exist in the container pointed to by data_container_sas.'.format(res))

# apply the first_n and sample_n filters
if first_n is not None:
-assert first_n > 0, 'parameter first_n is zero.'
+assert first_n > 0, 'parameter first_n is 0.'
image_paths = image_paths[:first_n] # will not error if first_n > total number of images

-# TODO implement sample_n - need to check that sample_n <= len(image_paths)
+if sample_n is not None:
assert sample_n > 0, 'parameter sample_n is 0.'
if sample_n > len(image_paths):
raise ValueError('parameter sample_n specifies more images than available (after filtering by other provided params).')

# we sample by just shuffling the image paths and take the first sample_n images
print('First path before shuffling:', image_paths[0])
shuffle(image_paths)
print('First path after shuffling:', image_paths[0])
image_paths = image_paths[:sample_n]
image_paths = sorted(image_paths)

num_images = len(image_paths)
-print('runserver.py, num_images: {}'.format(num_images))
+print('runserver.py, num_images after applying all filters: {}'.format(num_images))
if num_images < 1:
api_task_manager.UpdateTaskStatus(request_id, 'completed - zero images found in container or in provided list of images after filtering with the provided parameters.')
return
@@ -173,7 +193,7 @@ def _request_detections(**kwargs):
begin, end = job_index * num_images_per_job, (job_index + 1) * num_images_per_job
job_id = 'request{}_jobindex{}_total{}'.format(request_id, job_index, num_jobs)
list_jobs[job_id] = { 'begin': begin, 'end': end }
# TODO send list_jobs_submitted in a pickle to intermediate storage as a record / for restarting the monitoring thread

list_jobs_submitted = aml_compute.submit_jobs(request_id, list_jobs, api_task_manager, num_images)
api_task_manager.UpdateTaskStatus(request_id,
'running - all {} images submitted to cluster for processing.'.format(num_images))
@@ -186,6 +206,7 @@

try:
aml_monitor = orchestrator.AMLMonitor(request_id, list_jobs_submitted)

# start another thread to monitor the jobs and consolidate the results when they finish
ai4e_wrapper.wrap_async_endpoint(_monitor_detections_request, 'post:_monitor_detections_request',
request_id=request_id,
@@ -205,6 +226,8 @@
max_num_checks = api_config.MAX_MONITOR_CYCLES
num_checks = 0

print('Monitoring thread with _monitor_detections_request started.')

# time.sleep() blocks the current thread only
while True:
time.sleep(api_config.MONITOR_PERIOD_MINUTES * 60)
---
@@ -181,7 +181,7 @@ def list_blobs_in_container(max_number_to_list, sas_uri=None, datastore=None,
sas_uri: Azure blob storage SAS token
datastore: dict with fields account_name (of the Blob storage account), account_key and container_name
blob_prefix: Optional, a string as the prefix to blob names/paths to filter the results to those
-    with this prefix
+    with this prefix. Case-sensitive!
blob_suffix: Optional, an all lower case string or a tuple of strings, to filter the results to
those with this/these suffix(s).
The blob names will be lowercased first before comparing with the suffix(es).
