Tags: zcyKTH/cudnn-frontend

v0.5.1

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
V0.5.1 patch (NVIDIA#24)

* Update the timing code in the cudnn find plan to include the stream-ID on which it was launched.

* Fix a typo in CMakeLists.txt.

* Fix compilation warnings in multiple files with latest GCC.

Co-authored-by: Anerudhan Gopal <[email protected]>

v0.5

cudnn_frontend Version 0.5

v0.4.1

Verified

Patch 0.4.1 (NVIDIA#7)

* Release 0.4.1

[Bug Fix] : Fixed an issue where the vector count was not copied over during move construction.
[Samples]: New sample added for IMMA. Added an errata filter which blocks non-TensorCore engines from running it.
[CleanUp]: Changed all move constructors and fixed the move assignment operator.
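The move-construction fix above can be illustrated with a minimal sketch. `TensorSketch` and its members are hypothetical names chosen for illustration, not the cudnn_frontend classes; the point is only that a hand-written move constructor has to transfer scalar members such as a vector count explicitly, or the new object silently keeps its default.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a descriptor wrapper; not the library's class.
class TensorSketch {
public:
    TensorSketch(std::vector<int64_t> dims, int64_t vector_count)
        : dims_(std::move(dims)), vector_count_(vector_count) {}

    // Move constructor: every member must be listed. Leaving out
    // vector_count_ here would leave it at its default value of 1.
    TensorSketch(TensorSketch&& other) noexcept
        : dims_(std::move(other.dims_)),
          vector_count_(other.vector_count_) {}

    // Move assignment mirrors the constructor and guards self-assignment.
    TensorSketch& operator=(TensorSketch&& other) noexcept {
        if (this != &other) {
            dims_ = std::move(other.dims_);
            vector_count_ = other.vector_count_;
        }
        return *this;
    }

    // Move-only type: copying is disabled.
    TensorSketch(const TensorSketch&) = delete;
    TensorSketch& operator=(const TensorSketch&) = delete;

    int64_t vectorCount() const { return vector_count_; }
    const std::vector<int64_t>& dims() const { return dims_; }

private:
    std::vector<int64_t> dims_;
    int64_t vector_count_ = 1;
};
```

Constructing `TensorSketch b(std::move(a))` from an object built with a vector count of 4 should leave `b.vectorCount()` at 4 as well.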

* Rename getDimension in Convolution to spatial dimension for clarity

Co-authored-by: agopal <[email protected]>

v0.4

[New API] : Added a new function get_heuristics_list which accepts a list of heuristics modes and returns a concatenated list of the engine heuristics.

[New Feature]: New heuristic mode (HEUR_MODE_FALLBACK) added to the backend. The sample is updated to use it, providing a generic way to access the fallback engines. FallbackEngineList is retained as a way to add custom engines in the frontend.
[New Feature]: Added support to set vectorization dimension and vectorization count attributes in the tensor descriptor.
[Rename]: setDataType in OperationBuilder is deprecated and replaced with the clearer setComputePrecision().
[CleanUp] : cudnnFindPlan and cudnnGetPlan take an L-value operationGraph rather than an R-value as before.
[CleanUp] : cudnnFindPlan and time_sorted_plan return executionPlans_t (a vector of plans) instead of executionOptions_t (a vector of structs, each containing a plan and its time). This achieves compatibility with cudnnGet.
[Samples]: New sample added for DP4A.
[Samples]: ConvBiasScaleRelu sample.
[Bug fix]: Errata filter was erroneously filtering out unspecified engines.
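The "concatenated list" behaviour described for get_heuristics_list can be sketched in isolation. Everything below is a hypothetical stand-in: `concat_heuristics`, `EngineConfigSketch` and the callback type are illustrative names, and the actual signature of get_heuristics_list is not given in these notes. The sketch only shows the idea of querying each mode in order and appending the results, so that fallback entries come after the entries from earlier modes.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins: a real engine configuration is an opaque
// descriptor; a string is used here purely for illustration.
using EngineConfigSketch = std::string;
using HeuristicQuery =
    std::function<std::vector<EngineConfigSketch>(const std::string& mode)>;

// Query each heuristic mode in the order given and concatenate the results.
std::vector<EngineConfigSketch> concat_heuristics(
    const std::vector<std::string>& modes, const HeuristicQuery& query) {
    std::vector<EngineConfigSketch> result;
    for (const auto& mode : modes) {
        std::vector<EngineConfigSketch> configs = query(mode);
        result.insert(result.end(), configs.begin(), configs.end());
    }
    return result;
}
```

Placing a fallback mode last in `modes` keeps its engines at the tail of the list, which matches the usual intent of trying them only after the preferred ones.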

v0.3.1

Verified

a) Adding status check on the cudnnBackendExecute during warm up. b) Adding status check on json_handle when loading from a file (NVIDIA#5)

Co-authored-by: agopal <[email protected]>

v0.3

Update the documentation

v0.2

Verified

Merge pull request NVIDIA#2 from NVIDIA/staging

Changes in pull request:

    Fix compilation warnings reported with -Wall and -Wextra flags
    Support for backward activations dx = f(dy, X).
    Support for lower_clip, upper_clip, lower_clip_slope, alpha and beta parameters for relu, elu, softplus and swish.
    Added additional checks during the build phase, such as for bDesc being nullptr.
    Improved error checking for xDesc, yDesc depending on whether the operation is convolution or pointwise.
    Add matmul descriptor
    Add conv_scale_bias_add_relu and matmul_bias_gelu sample
    Comparison between frontend and backend
    Fix compilation issue in samples for gcc-5
    New sample for HEUR_B
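To make the backward-activation form dx = f(dy, X) and the relu clip parameters concrete, here is a small scalar sketch. The semantics are an assumption: the notes above do not define how lower_clip, upper_clip and lower_clip_slope interact, so this shows one plausible reading (a linear slope below the lower clip, saturation at the upper clip). `ReluParamsSketch`, `relu_forward` and `relu_backward` are hypothetical names, not library functions.

```cpp
#include <limits>

// Hypothetical parameter bundle; the defaults reproduce a plain ReLU.
struct ReluParamsSketch {
    double lower_clip = 0.0;
    double upper_clip = std::numeric_limits<double>::infinity();
    double lower_clip_slope = 0.0;
};

// Forward pass under the assumed semantics:
//   x <= lower_clip : lower_clip + lower_clip_slope * (x - lower_clip)
//   x >= upper_clip : upper_clip
//   otherwise       : x
double relu_forward(double x, const ReluParamsSketch& p) {
    if (x <= p.lower_clip) {
        return p.lower_clip + p.lower_clip_slope * (x - p.lower_clip);
    }
    if (x >= p.upper_clip) {
        return p.upper_clip;
    }
    return x;
}

// Backward pass dx = f(dy, x): scale the incoming gradient by the local
// derivative of the forward function at the saved input x.
double relu_backward(double dy, double x, const ReluParamsSketch& p) {
    if (x <= p.lower_clip) {
        return dy * p.lower_clip_slope;
    }
    if (x >= p.upper_clip) {
        return 0.0;
    }
    return dy;
}
```

The backward function needs the forward input X rather than the output, which is why the operation is written as a function of both dy and X.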

v8.1.0-beta

Verified

Update README.md

v0.1

Verified

Update README.md