21.06 - 2021-06-23
- Update to Polygraphy v0.29.2
- Update to ONNX-GraphSurgeon v0.3.9
- Add missing model.py in uff_custom_plugin sample
- Fix numerical errors for float type in NMS/batchedNMS plugins
- Update demoBERT input dimensions to match Triton requirement #1051
- Optimize TLT MaskRCNN plugins:
  - Enable FP16 precision in multilevelCropAndResizePlugin and multilevelProposeROIPlugin
  - Optimize algorithms for NMS kernels and ROIAlign kernel
  - Fix invalid CUDA config issue when batch size is larger than 32
  - Fix issues found on Jetson Nano
- Add switch for batch-agnostic mode in NMS plugin
- Removed fcplugin from demoBERT to improve latency
21.05 - 2021-05-20
- Extended support for ONNX operator InstanceNormalization to 5D tensors
- Support negative indices in ONNX Gather operator
- Add support for importing ONNX double-typed weights as float
- ONNX-GraphSurgeon (v0.3.7) support for models with externally stored weights
- Update ONNX-TensorRT to 21.05
- Relicense ONNX-TensorRT under Apache2
- demoBERT builder fixes for multi-batch
- Speed up demoBERT build by using a global timing cache and disabling cuDNN tactics
- Standardize python package versions across OSS samples
- Bugfixes in multilevelProposeROI and bertQKV plugin
- Fix memory leaks in samples logger
21.04 - 2021-04-12
- SM86 kernels for BERT MHA plugin
- Added opset13 support for SoftMax, LogSoftmax, Squeeze, and Unsqueeze.
- Added support for the EyeLike and GatherElements operators.
- Updated TensorRT version to v7.2.3.4.
- Update to ONNX-TensorRT 21.03
- ONNX-GraphSurgeon (v0.3.4) - updates fold_constants to correctly exit early.
- Set default CUDA_INSTALL_DIR #798
- Plugin bugfixes, qkv kernels for sm86
- Fixed GroupNorm CMakeFile for cu sources #1083
- Permit groupadd with non-unique GID in build containers #1091
- Avoid reinterpret_cast #146
- Clang-format plugins and samples
- Avoid arithmetic on void pointer in multilevelProposeROIPlugin.cpp #1028
- Update BERT plugin documentation.
- Remove extra terminate call in InstanceNorm
21.03 - 2021-03-09
- Optimized FP16 NMS/batchedNMS plugins with n-bit radix sort and based on IPluginV2DynamicExt
- ProposalDynamic and CropAndResizeDynamic plugins based on IPluginV2DynamicExt
- ONNX-TensorRT v21.03 update
- ONNX-GraphSurgeon v0.3.3 update
- Bugfix for scaledSoftmax kernel
- N/A
21.02 - 2021-02-01
- TensorRT Python API bindings
- TensorRT Python samples
- FP16 support to batchedNMSPlugin #1002
- Configurable input size for TLT MaskRCNN Plugin #986
- TensorRT version updated to 7.2.2.3
- ONNX-TensorRT v21.02 update
- Polygraphy v0.21.1 update
- PyTorch-Quantization Toolkit v2.1.0 update
- Documentation update, ONNX opset 13 support, ResNet example
- ONNX-GraphSurgeon v0.2.8 update
- demoBERT builder updated to work with Tensorflow2 (in compatibility mode)
- Refactor Dockerfiles for OSS container
- N/A
20.12 - 2020-12-18
- Add configurable input size for TLT MaskRCNN Plugin
- Update symbol export map for plugins
- Correctly use channel dimension when creating Prelu node
- Fix Jetson cross compilation CMakefile
- N/A
20.11 - 2020-11-20
- API documentation for ONNX-GraphSurgeon
- N/A
20.10 - 2020-10-22
- Polygraphy v0.20.13 - Deep Learning Inference Prototyping and Debugging Toolkit
- PyTorch-Quantization Toolkit v2.0.0
- Updated BERT plugins for variable sequence length inputs
- Optimized kernels for sequence lengths of 64 and 96 added
- Added Tacotron2 + Waveglow TTS demo #677
- Re-enable GridAnchorRect_TRT plugin with rectangular feature maps #679
- Update batchedNMS plugin to IPluginV2DynamicExt interface #738
- Support 3D inputs in InstanceNormalization plugin #745
- Added this CHANGELOG.md
- ONNX GraphSurgeon - v0.2.7 with bugfixes, new examples.
- demo/BERT bugfixes for Jetson Xavier
- Updated build Dockerfile to cuda-11.1
- Updated ClangFormat style specification according to TensorRT coding guidelines
- N/A
7.2.1 - 2020-10-20
- Polygraphy v0.20.13 - Deep Learning Inference Prototyping and Debugging Toolkit
- PyTorch-Quantization Toolkit v2.0.0
- Updated BERT plugins for variable sequence length inputs
- Optimized kernels for sequence lengths of 64 and 96 added
- Added Tacotron2 + Waveglow TTS demo #677
- Re-enable GridAnchorRect_TRT plugin with rectangular feature maps #679
- Update batchedNMS plugin to IPluginV2DynamicExt interface #738
- Support 3D inputs in InstanceNormalization plugin #745
- Added this CHANGELOG.md
- ONNX GraphSurgeon - v0.2.7 with bugfixes, new examples.
- demo/BERT bugfixes for Jetson Xavier
- Updated build Dockerfile to cuda-11.1
- Updated ClangFormat style specification according to TensorRT coding guidelines
- N/A