This folder contains the scripts needed to estimate a Pareto front for machine learning models. Currently, the scripts support target devices running Tizen, as well as the Odroid-XU4.
The contents of the folder can be categorized into the following groups:
- Generator scripts to map decision variables to nnpackage_run parameters
- Estimator scripts to compute the Pareto front
The following subsections describe the role of each script in detail.
The generator script gen_oplist.py is located in the generator folder, and encodes nnpackage backend assignments as large integers. Effectively, it maps each suitable backend assignment to an integer value. For example, a graph with only three operations and two backends has 2^3 = 8 possible assignments, so its integer representation lies in the range 0 to 7 (inclusive). Thus a value of 0 might imply that all operations run on the cpu backend, while 7 might imply that all operations run on the acl_cl backend. As described below, the integer representation of nnpackage parameters serves as a convenient decision space for Pareto estimation.
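As a rough illustration of this encoding, the sketch below decodes an integer into one backend per operation by reading it as a number in base len(backends). The function name decode_assignment and the digit order (the lowest digit maps to the first operation) are assumptions made for illustration, not necessarily what gen_oplist.py does.

```python
from typing import List


def decode_assignment(value: int, num_ops: int, backends: List[str]) -> List[str]:
    """Decode an integer into one backend per operation (digit i -> operation i)."""
    base = len(backends)
    if not 0 <= value < base ** num_ops:
        raise ValueError("value lies outside the decision space")
    assignment = []
    for _ in range(num_ops):
        # Peel off the lowest digit; it selects the backend for the next operation.
        value, digit = divmod(value, base)
        assignment.append(backends[digit])
    return assignment


backends = ["cpu", "acl_cl"]
for value in (0, 5, 7):
    print(value, decode_assignment(value, 3, backends))
```

With three operations and two backends, 0 decodes to cpu for every operation and 7 decodes to acl_cl for every operation, matching the example in the text.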
Setting up parameters for nnpackage_run requires knowledge of model-specific operations. To this end, the gen_oplist.py script generates, for each model, an oplist of unique operations. If an exhaustive mapping of backends to operation sequences is preferred, then gen_oplist.py also generates a so-called opmap list of uniquely observed <operation name, data size> pairs.
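To make the difference between the two lists concrete, here is a minimal sketch that groups graph nodes first by operation name and then by <operation name, data size> pair. The node list, the sizes, and the function names build_oplist and build_opmap are all hypothetical; the actual output format of gen_oplist.py may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical graph nodes: (node index, operation name, data size).
nodes: List[Tuple[int, str, int]] = [
    (0, "Conv2D", 150528),
    (1, "DepthwiseConv2D", 401408),
    (2, "Conv2D", 401408),
    (3, "Conv2D", 150528),
    (4, "Reshape", 1001),
]


def build_oplist(nodes: List[Tuple[int, str, int]]) -> Dict[str, List[int]]:
    """Map each unique operation name to the node indexes where it occurs."""
    oplist: Dict[str, List[int]] = {}
    for index, name, _size in nodes:
        oplist.setdefault(name, []).append(index)
    return oplist


def build_opmap(nodes: List[Tuple[int, str, int]]) -> Dict[Tuple[str, int], List[int]]:
    """Map each unique (operation name, data size) pair to its node indexes."""
    opmap: Dict[Tuple[str, int], List[int]] = {}
    for index, name, size in nodes:
        opmap.setdefault((name, size), []).append(index)
    return opmap


print(build_oplist(nodes))
print(build_opmap(nodes))
```

In this toy graph the name-based grouping yields three decision variables, while the pair-based grouping yields four, so the finer mapping enlarges the decision space.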
gen_oplist.py is run in the development environment (read: desktop PC) as shown below:
python3 gen_oplist.py <tflite model> <target>
The list of model operations and their mapping to graph node indexes are stored in an oplist.json file, which is then transferred to the target device. For further details about usage, type python3 gen_oplist.py --help.
Scripts under the estimator folder fall into two categories, namely exhaustive, brute-force profiling, and an on-device version of Pareto estimation. These are described in detail below.
For the sake of testing several Pareto estimation algorithms offline on common lookup data, the estimator folder includes brute_force_profiler.py, which records all solutions in the decision (or assignment) space. brute_force_profiler.py is typically run on the target device, with the following syntax:
python brute_force_profiler.py <model> <target> <run_folder> [--dumpfile=<filename>]
For details, type python brute_force_profiler.py --help. Below is an example of the dump generated by the brute-force profiler:
{"oplist": ["Pool2D", "BinaryArithmetic", "DepthwiseConv2D", "Conv2D", "Reshape"],
"solutions": [
{"memory": 56388, "id": 0, "time": 72.525},
{"memory": 63624, "id": 1, "time": 86.532},
{"memory": 64320, "id": 2, "time": 69.352},
{"memory": 65376, "id": 3, "time": 76.436},
{"memory": 73016, "id": 4, "time": 69.634},
{"memory": 73492, "id": 5, "time": 47.013},
{"memory": 74488, "id": 6, "time": 95.01},
{"memory": 74844, "id": 7, "time": 111.329},
{"memory": 393324, "id": 8, "time": 98.956},
{"memory": 395088, "id": 9, "time": 103.24},
{"memory": 396180, "id": 10, "time": 68.107},
{"memory": 395932, "id": 11, "time": 86.109},
{"memory": 402468, "id": 12, "time": 25.477},
{"memory": 402800, "id": 13, "time": 25.42},
{"memory": 403904, "id": 14, "time": 9.168},
{"memory": 404476, "id": 15, "time": 7.801},
....
{"memory": 403940, "id": 30, "time": 9.145},
{"memory": 403568, "id": 31, "time": 8.034}]}
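A dump like the one above can serve as lookup data for offline experiments. As a minimal sketch, the following filters a list of solutions down to its Pareto front, assuming that both "memory" and "time" are to be minimized. The function name pareto_front is hypothetical, and the sample list contains only a handful of entries copied from the dump.

```python
from typing import Dict, List, Tuple


def pareto_front(solutions: List[Dict], keys: Tuple[str, str] = ("memory", "time")) -> List[Dict]:
    """Return the solutions that no other solution dominates (lower is better)."""
    first, second = keys
    front = []
    best_second = float("inf")
    # After sorting by the first objective, a solution is non-dominated only if
    # it strictly improves on the best second objective seen so far.
    for sol in sorted(solutions, key=lambda s: (s[first], s[second])):
        if sol[second] < best_second:
            front.append(sol)
            best_second = sol[second]
    return front


# A subset of the brute-force dump shown above.
solutions = [
    {"memory": 56388, "id": 0, "time": 72.525},
    {"memory": 63624, "id": 1, "time": 86.532},
    {"memory": 64320, "id": 2, "time": 69.352},
    {"memory": 73492, "id": 5, "time": 47.013},
    {"memory": 403904, "id": 14, "time": 9.168},
    {"memory": 404476, "id": 15, "time": 7.801},
    {"memory": 403568, "id": 31, "time": 8.034},
]

print([sol["id"] for sol in pareto_front(solutions)])  # [0, 2, 5, 31, 15]
```

Solutions 1 and 14 are dropped because solutions 0 and 31, respectively, are better in both memory and time.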
Note: At present, the Pareto estimation algorithms run on-device; support for an offline mode is planned for the near future.
Currently, the estimator folder includes only random_sampler.py; in the future, it will feature a set of Pareto estimation algorithms. Regardless of the algorithm, the following steps must be carried out in sequence:
- Generate the oplist using gen_oplist.py, and transfer the JSON file to the target device. This step is performed in the development environment.
- Copy the contents of the estimator folder to the target (scp for Odroid, sdb push for Tizen), at a preferred location.
- On the target device, run the Pareto estimation algorithm. The following example shows how to run random_sampler.py (see python random_sampler.py --help for details):
python random_sampler.py /root/img_model/mobilenetv2/ /opt/usr/nnfw-test/Product/out/bin --mode=name --dumpfile=/tmp/mobilenetv2_opname_profile.json --iterations=20
After profiling, the results can be viewed in the file given by the --dumpfile argument. Below is an illustrative example for the same model that was brute-forced above:
{"configs": {
"4": "BACKENDS=\"acl_cl;cpu\" OP_BACKEND_Pool2D=cpu OP_BACKEND_DepthwiseConv2D=cpu OP_BACKEND_Reshape=acl_cl OP_BACKEND_Conv2D=cpu OP_BACKEND_BinaryArithmetic=cpu ",
"10": "BACKENDS=\"acl_cl;cpu\" OP_BACKEND_Pool2D=cpu OP_BACKEND_DepthwiseConv2D=acl_cl OP_BACKEND_Reshape=cpu OP_BACKEND_Conv2D=acl_cl OP_BACKEND_BinaryArithmetic=cpu ",
"14": "BACKENDS=\"acl_cl;cpu\" OP_BACKEND_Pool2D=cpu OP_BACKEND_DepthwiseConv2D=acl_cl OP_BACKEND_Reshape=acl_cl OP_BACKEND_Conv2D=acl_cl OP_BACKEND_BinaryArithmetic=cpu ",
"16": "BACKENDS=\"acl_cl;cpu\" OP_BACKEND_Pool2D=cpu OP_BACKEND_DepthwiseConv2D=cpu OP_BACKEND_Reshape=cpu OP_BACKEND_Conv2D=cpu OP_BACKEND_BinaryArithmetic=acl_cl ",
"20": "BACKENDS=\"acl_cl;cpu\" OP_BACKEND_Pool2D=cpu OP_BACKEND_DepthwiseConv2D=cpu OP_BACKEND_Reshape=acl_cl OP_BACKEND_Conv2D=cpu OP_BACKEND_BinaryArithmetic=acl_cl ",
"21": "BACKENDS=\"acl_cl;cpu\" OP_BACKEND_Pool2D=acl_cl OP_BACKEND_DepthwiseConv2D=cpu OP_BACKEND_Reshape=acl_cl OP_BACKEND_Conv2D=cpu OP_BACKEND_BinaryArithmetic=acl_cl ",
"31": "BACKENDS=\"acl_cl;cpu\" OP_BACKEND_Pool2D=acl_cl OP_BACKEND_DepthwiseConv2D=acl_cl OP_BACKEND_Reshape=acl_cl OP_BACKEND_Conv2D=acl_cl OP_BACKEND_BinaryArithmetic=acl_cl "},
"oplist": ["Pool2D", "DepthwiseConv2D", "Reshape", "Conv2D", "BinaryArithmetic"],
"solutions": [
{"exec_time": 76.138, "max_rss": 62712, "id": 4},
{"exec_time": 72.719, "max_rss": 65272, "id": 16},
{"exec_time": 22.409, "max_rss": 403120, "id": 14},
{"exec_time": 28.138, "max_rss": 403064, "id": 10},
{"exec_time": 70.656, "max_rss": 65536, "id": 20},
{"exec_time": 68.805, "max_rss": 66076, "id": 21},
{"exec_time": 8.201, "max_rss": 404656, "id": 31}], "mode": "name"}
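One way to use such a dump is to pick a backend configuration that fits a memory budget. The sketch below assumes the dump has already been loaded into a dictionary (for example with json.load) and returns the fastest sampled solution whose "max_rss" does not exceed the budget, together with its entry from "configs". The function name fastest_within_budget and the budget values are hypothetical; the sample data is a subset of the dump above.

```python
from typing import Dict, Optional


def fastest_within_budget(dump: Dict, max_rss_budget: int) -> Optional[Dict]:
    """Return the fastest solution whose max_rss fits the budget, plus its config."""
    feasible = [s for s in dump["solutions"] if s["max_rss"] <= max_rss_budget]
    if not feasible:
        return None
    best = min(feasible, key=lambda s: s["exec_time"])
    # Keys of "configs" are strings in the JSON dump, while solution ids are integers.
    return {**best, "config": dump["configs"][str(best["id"])]}


# A subset of the random-sampler dump shown above.
dump = {
    "configs": {
        "4": 'BACKENDS="acl_cl;cpu" OP_BACKEND_Pool2D=cpu OP_BACKEND_DepthwiseConv2D=cpu '
             'OP_BACKEND_Reshape=acl_cl OP_BACKEND_Conv2D=cpu OP_BACKEND_BinaryArithmetic=cpu ',
        "16": 'BACKENDS="acl_cl;cpu" OP_BACKEND_Pool2D=cpu OP_BACKEND_DepthwiseConv2D=cpu '
              'OP_BACKEND_Reshape=cpu OP_BACKEND_Conv2D=cpu OP_BACKEND_BinaryArithmetic=acl_cl ',
        "31": 'BACKENDS="acl_cl;cpu" OP_BACKEND_Pool2D=acl_cl OP_BACKEND_DepthwiseConv2D=acl_cl '
              'OP_BACKEND_Reshape=acl_cl OP_BACKEND_Conv2D=acl_cl OP_BACKEND_BinaryArithmetic=acl_cl ',
    },
    "solutions": [
        {"exec_time": 76.138, "max_rss": 62712, "id": 4},
        {"exec_time": 72.719, "max_rss": 65272, "id": 16},
        {"exec_time": 8.201, "max_rss": 404656, "id": 31},
    ],
    "mode": "name",
}

choice = fastest_within_budget(dump, max_rss_budget=100000)
print(choice["id"], choice["config"])
```

With a budget of 100000 (in the same unit as max_rss), the sketch selects solution 16; raising the budget above 404656 makes the all-acl_cl solution 31 eligible.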
Note: The Pareto estimation algorithms require the Python numpy package, so make sure to install it beforehand.