Hi, I have an issue with Transfer Learning using SparseML, following the instructions in https://github.com/neuralmagic/sparseml/blob/main/integrations/ultralytics-yolov8/tutorials/sparse-transfer-learning.md.
More specifically, I trained:
sparseml.ultralytics.train \
  --model "zoo:cv/detection/yolov8-m/pytorch/ultralytics/coco/pruned80-none" \
  --recipe "zoo:cv/detection/yolov8-m/pytorch/ultralytics/voc/pruned80_quant-none" \
  --data "coco128.yaml" \
  --batch 2
and then exported the trained model:
sparseml.ultralytics.export_onnx \
  --model ./runs/detect/train/weights/last.pt \
  --save_dir yolov8-m
Then I ran a benchmark using DeepSparse:
>> deepsparse.benchmark /home/ubuntu/code/models/trained_model.onnx
2025-03-03 03:23:56 deepsparse.benchmark.helpers INFO Thread pinning to cores enabled
DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.8.0 COMMUNITY | (e3778e93) (release) (optimized) (system=avx512_vnni, binary=avx512)
2025-03-03 03:23:56 deepsparse.benchmark.benchmark_model INFO deepsparse.engine.Engine:
    onnx_file_path: /home/ubuntu/code/models/trained_model.onnx
    batch_size: 1
    num_cores: 4
    num_streams: 1
    scheduler: Scheduler.default
    fraction_of_supported_ops: 0.0
    cpu_avx_type: avx512
    cpu_vnni: True
2025-03-03 03:23:56 deepsparse.utils.onnx INFO Generating input 'images', type = uint8, shape = [1, 3, 640, 640]
2025-03-03 03:23:56 deepsparse.benchmark.benchmark_model INFO Starting 'singlestream' performance measurements for 10 seconds
Original Model Path: /home/ubuntu/code/models/trained_model.onnx
Batch Size: 1
Scenario: sync
Throughput (items/sec): 4.1084
Latency Mean (ms/batch): 243.3896
Latency Median (ms/batch): 240.5514
Latency Std (ms/batch): 10.9256
Iterations: 42
And here are the related dependencies and training environment.
Libraries:
Training Environment:
It is quite slow. I suspect the fraction_of_supported_ops: 0.0 in the output is related to this benchmark result, because when I ran the benchmark on the pretrained weights used in the training command above (downloaded from https://sparsezoo.neuralmagic.com/models/yolov8-m-coco-pruned80_quantized?hardware=deepsparse-c6i.12xlarge&comparison=yolov8-m-coco-base), I got a much better result:
>> deepsparse.benchmark /home/ubuntu/code/models/pretrained_model.onnx
2025-03-03 03:52:06 deepsparse.benchmark.helpers INFO Thread pinning to cores enabled
DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.8.0 COMMUNITY | (e3778e93) (release) (optimized) (system=avx512_vnni, binary=avx512)
2025-03-03 03:52:07 deepsparse.benchmark.benchmark_model INFO deepsparse.engine.Engine:
    onnx_file_path: /home/ubuntu/code/models/pretrained_model.onnx
    batch_size: 1
    num_cores: 4
    num_streams: 1
    scheduler: Scheduler.default
    fraction_of_supported_ops: 1.0
    cpu_avx_type: avx512
    cpu_vnni: True
2025-03-03 03:52:08 deepsparse.utils.onnx INFO Generating input 'images', type = uint8, shape = [1, 3, 640, 640]
2025-03-03 03:52:08 deepsparse.benchmark.benchmark_model INFO Starting 'singlestream' performance measurements for 10 seconds
Original Model Path: /home/ubuntu/code/models/pretrained_model.onnx
Batch Size: 1
Scenario: sync
Throughput (items/sec): 25.9231
Latency Mean (ms/batch): 38.5548
Latency Median (ms/batch): 38.2803
Latency Std (ms/batch): 1.4339
Iterations: 260
I found that fraction_of_supported_ops is 1.0 for the pretrained model.
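For reference, the operator types in the two ONNX graphs can be compared with a quick diagnostic sketch like this (not from the tutorial; it only assumes the onnx package is installed, and the paths are the ones from the benchmarks above):

# Diagnostic sketch: count the operator types in each exported ONNX graph.
# Quantized graphs typically contain nodes such as QuantizeLinear / QLinearConv.
from collections import Counter
import onnx

for path in (
    "/home/ubuntu/code/models/trained_model.onnx",
    "/home/ubuntu/code/models/pretrained_model.onnx",
):
    graph = onnx.load(path).graph
    op_counts = Counter(node.op_type for node in graph.node)
    print(path)
    print(dict(op_counts))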
Then I searched for this and found that it refers to the fraction of the model run by the optimized runtime, as described in https://github.com/neuralmagic/deepsparse/blob/36b92eeb730a74a787cea467c9132eaa1b78167f/src/deepsparse/engine.py#L417, but that's all I could find.
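The value can also be checked programmatically by compiling the model directly, roughly like the sketch below (this assumes the Engine property behaves as described in the engine.py linked above; I have not verified it beyond those docs):

# Minimal sketch: compile the exported model with the DeepSparse Engine and
# read fraction_of_supported_ops directly.
from deepsparse import Engine

engine = Engine(
    model="/home/ubuntu/code/models/trained_model.onnx",  # path from the benchmark above
    batch_size=1,
)
# 0.0 -> the graph fell back to the unoptimized execution path,
# 1.0 -> the whole graph runs in the optimized DeepSparse runtime.
print(engine.fraction_of_supported_ops)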
I have some questions:
1. What exactly does fraction_of_supported_ops mean?
2. Why is fraction_of_supported_ops 0.0 for my exported model but 1.0 for the pretrained one?
3. How does fraction_of_supported_ops affect the benchmark result?