Commit
[PyTorch] Optimize no input NVTX collection (pytorch#70133)
Summary:
Pull Request resolved: pytorch#70133

Previously, `getNvtxStr` created a `stringstream` and performed string concatenations even when there were no inputs, wasting time on every call. This diff avoids the `stringstream` when there is no input. In the benchmark below, mean overhead drops from 0.970494 to 0.384333, a reduction of about 60%.

Test Plan:
Before
```
I1214 22:48:07.964118 2971180 bench.cpp:154] Mean 0.970494
I1214 22:48:07.964139 2971180 bench.cpp:155] Median 0.969054
I1214 22:48:07.964144 2971180 bench.cpp:156] Min 0.962247
I1214 22:48:07.964148 2971180 bench.cpp:157] stddev 0.00774841
I1214 22:48:07.964154 2971180 bench.cpp:158] stddev / mean 0.00798398
```
After
```
I1214 22:59:00.039872 3437853 bench.cpp:154] Mean 0.384333
I1214 22:59:00.039896 3437853 bench.cpp:155] Median 0.384886
I1214 22:59:00.039899 3437853 bench.cpp:156] Min 0.370235
I1214 22:59:00.039902 3437853 bench.cpp:157] stddev 0.00435907
I1214 22:59:00.039907 3437853 bench.cpp:158] stddev / mean 0.0113419
```

Reviewed By: aaronenyeshi, robieta

Differential Revision: D33137501

fbshipit-source-id: ce0e8cf9aef7ea22fd8aed927e76be4ca375efc3
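
The idea described in the summary can be illustrated with a minimal sketch. This is not the PyTorch code: `makeRangeLabel`, its signature, and the label format are hypothetical and chosen only for illustration. The sketch assumes a label is built from an op name plus optional input shapes, and shows an early return that skips `std::ostringstream` when there is nothing to format.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the PyTorch implementation): builds a label for a
// profiling range from an op name and optional input shapes.
std::string makeRangeLabel(
    const char* name,
    const std::vector<std::vector<int64_t>>& shapes) {
  // Fast path: with no inputs there is nothing to format, so skip the
  // stream entirely and return the name as-is.
  if (shapes.empty()) {
    return std::string(name);
  }

  // Slow path: format "name, sizes = [[a, b], [c]]" through a stream.
  std::ostringstream s;
  s << name << ", sizes = [";
  for (std::size_t i = 0; i < shapes.size(); ++i) {
    if (i > 0) {
      s << ", ";
    }
    s << "[";
    for (std::size_t j = 0; j < shapes[i].size(); ++j) {
      if (j > 0) {
        s << ", ";
      }
      s << shapes[i][j];
    }
    s << "]";
  }
  s << "]";
  return s.str();
}
```

Constructing a stream typically costs more than a plain `std::string` construction, so an early return like this only pays off when the no-input case is common, as the summary suggests it is here.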