Tags · zcyKTH/cudnn-frontend

v0.5.1

V0.5.1 patch (NVIDIA#24)

* Update the timing code in the cudnn find plan to include the stream-ID on which it was launched.

* Fix a typo in CMakeLists.txt.

* Fix compilation warnings in multiple files with latest GCC.

Co-authored-by: Anerudhan Gopal <[email protected]>

Jan 25, 2022
7b83dba

v0.5

cudnn_frontend Version 0.5

Nov 12, 2021
704a61f

v0.4.1

Patch 0.4.1 (NVIDIA#7)

* Release 0.4.1

[Bug Fix] : Fixed an issue where the vector count was not copied over during the move construction phase.
[Samples]: New sample added for IMMA. Added an errata filter which blocks non-TensorCore engines from running it.
[CleanUp]: Changed all move constructors and fixed the move assignment operator.

* Rename getDimension in Convolution to spatial dimension for clarity

Co-authored-by: agopal <[email protected]>

Aug 11, 2021
8360d4a

v0.4

[New API] : Added a new function get_heuristics_list which accepts a list of heuristic modes and returns a concatenated list of the engine heuristics.

[New Feature]: New heuristic mode (HEUR_MODE_FALLBACK) added to the backend. The sample is updated to use it and provides a generic way to access the fallback engines. FallbackEngineList is retained as a way to add custom engines in the frontend.
[New Feature]: Added support to set vectorization dimension and vectorization count attributes in the tensor descriptor.
[Rename]: setDataType in OperationBuilder is deprecated and replaced with the clearer setComputePrecision().
[CleanUp] : cudnnFindPlan and cudnnGetPlan now take an L-value operationGraph rather than an R-value.
[CleanUp] : cudnnFindPlan and time_sorted_plan return executionPlans_t (a vector of plans) instead of executionOptions_t (a vector of structs, each containing a plan and its time). This is to achieve compatibility with the cudnnGet.
[Samples]: New sample added for DP4A.
[Samples]: ConvBiasScaleRelu sample
[Bug fix]: Errata filter was erroneously filtering out unspecified engines.

Jul 1, 2021
73210a9

v0.3.1

a) Adding status check on the cudnnBackendExecute during warm up. b) Adding status check on json_handle when loading from a file (NVIDIA#5)

Co-authored-by: agopal <[email protected]>

Jun 8, 2021
949f2ac

v0.3

Update the documentation

May 16, 2021
51e60d8

v0.2

Merge pull request NVIDIA#2 from NVIDIA/staging

Changes in pull request:

    Fix compilation warnings reported with -Wall and -Wextra flags
    Support for backward activations dx = f(dy, X).
    Support for lower_clip, upper_clip, lower_clip_slope, alpha and beta parameters for relu, elu, softplus and swish.
    Added additional checks during the build phase, such as for bDesc being nullptr.
    Improved error checking for xDesc, yDesc depending on whether the operation is convolution or pointwise.
    Add matmul descriptor
    Add conv_scale_bias_add_relu and matmul_bias_gelu sample
    Comparison between frontend and backend
    Fix compilation issue in samples for gcc-5
    New sample for HEUR_B

Mar 29, 2021
b4e1ad9

v8.1.0-beta

Update README.md

Jan 28, 2021
360d6e7

v0.1

Update README.md

Jan 28, 2021
360d6e7
