Tags: mcabbott/CUDA.jl
## CUDA v2.2.0

[Diff since v2.1.0](JuliaGPU/CUDA.jl@v2.1.0...v2.2.0)

**Closed issues:**

- cudnn missing after downloading artifact (JuliaGPU#521)
- Downloading artifact: CUDA110 when using DiffEqFlux (JuliaGPU#542)

**Merged pull requests:**

- Update manifest (JuliaGPU#520) (@github-actions[bot])
- Try out Buildkite. (JuliaGPU#522) (@maleadt)
- Update manifest (JuliaGPU#529) (@github-actions[bot])
- Support for / Upgrade to CUDA 11.1 update 1. (JuliaGPU#530) (@maleadt)
- Fix and test svd! (JuliaGPU#531) (@maleadt)
- Move more CI to Buildkite. (JuliaGPU#532) (@maleadt)
- Use type symbols to generate wrapper methods (JuliaGPU#534) (@cqql)
- Fully move to Buildkite. (JuliaGPU#537) (@maleadt)
- Add unit_diag option for sv2! functions (JuliaGPU#540) (@amontoison)
- Documentation fixes (JuliaGPU#543) (@maleadt)
## CUDA v2.1.0

[Diff since v2.0.2](JuliaGPU/CUDA.jl@v2.0.2...v2.1.0)

**Closed issues:**

- CUDNN convolution with Float16 always returns zeros (JuliaGPU#92)
- axp(b)y! and mul! (scalar multiplication) with mixed argument types (JuliaGPU#144)
- Dispatching to generic matmul instead of CUBLAS (JuliaGPU#164)
- Support for Ints and Float16? (JuliaGPU#165)
- Subarrays/views support (JuliaGPU#172)
- Easy way to pick among multiple GPUs (JuliaGPU#174)
- More prominently document JULIA_CUDA_USE_BINARYBUILDER (JuliaGPU#204)
- ERROR_COOPERATIVE_LAUNCH_TOO_LARGE during tests (JuliaGPU#247)
- Pkg.test error for cutensor test on Windows (JuliaGPU#422)
- Runtime build improvements (JuliaGPU#456)
- Fusing Wrappers (JuliaGPU#467)
- Could not find nvToolsExt (libnvToolsExt.dylib.1.0 or libnvToolsExt.dylib.1) in /Users/imac/.julia/artifacts/b502baf54095dff4a69fd6aba8667124583f6929/lib (JuliaGPU#482)
- mapreduce assumes commutative op (JuliaGPU#484)
- SubArray Broadcast Bug in 2.0 (JuliaGPU#488)
- Nested SubArray Scalar Indexing (JuliaGPU#490)
- Sparse matrix * view(vector) regression in 2.0 (JuliaGPU#493)
- Error transforming a reshaped 0-dimentional GPU array to a CPU array (JuliaGPU#494)
- test cuda FAILURE (JuliaGPU#496)
- Reshaped CuArray is not DenseCuArray (JuliaGPU#511)
- assignment failure when using array slicing. (JuliaGPU#516)

**Merged pull requests:**

- Use the correct CUDNN scaling parameter type. (JuliaGPU#454) (@maleadt)
- Fix versioned dylib discovery. (JuliaGPU#486) (@maleadt)
- Move inv from GPUArrays. (JuliaGPU#487) (@maleadt)
- Use dense array types in sparse wrappers. (JuliaGPU#495) (@maleadt)
- Update manifest (JuliaGPU#497) (@github-actions[bot])
- Revert array wrapper union changes (JuliaGPU#498) (@maleadt)
- Clean-up pointer field. (JuliaGPU#499) (@maleadt)
- mapreduce: change iteration for compatibility with non-commutative operators. (JuliaGPU#500) (@maleadt)
- Use versioned libcuda (JuliaGPU#502) (@maleadt)
- Dynamically choose versioned libcuda (JuliaGPU#503) (@mustafaquraish)
- Update multigpu.md (JuliaGPU#504) (@efmanu)
- Upgrade artifacts for CUDA 11 compatibility. (JuliaGPU#506) (@maleadt)
- Update dependencies. (JuliaGPU#507) (@maleadt)
- Convert unsigned short ints to Cint for printf. (JuliaGPU#508) (@maleadt)
- Update manifest (JuliaGPU#510) (@github-actions[bot])
- Fix reshape with missing dimensions. (JuliaGPU#512) (@maleadt)
- Don't return a pointer from 'alias'. (JuliaGPU#513) (@maleadt)
- Add some docs (JuliaGPU#514) (@maleadt)
- Fix CUDNN-optimized activation broadcasts (JuliaGPU#515) (@maleadt)
- Fix cooperative launch test. (JuliaGPU#517) (@maleadt)
- Fixes for Windows (JuliaGPU#518) (@maleadt)
- CUTENSOR fixes on Windows (JuliaGPU#519) (@maleadt)
## CUDA v2.0.2

[Diff since v2.0.1](JuliaGPU/CUDA.jl@v2.0.1...v2.0.2)

**Closed issues:**

- cu() behavior for complex floating point numbers (JuliaGPU#91)
- Error when following example on using multiple GPUs on multiple processes (JuliaGPU#468)
- MacOS without nvidia GPU is trying to download CUDA111 on julia nightly (JuliaGPU#469)
- Drop BinaryProvider? (JuliaGPU#474)
- Latest version of master doesn't work on Windows (JuliaGPU#477)
- `sum(CUDA.rand(3,3))` broken (JuliaGPU#480)
- copyto!() between cpu and gpu with subarrays (JuliaGPU#491)

**Merged pull requests:**

- Adapt to GPUCompiler changes. (JuliaGPU#458) (@maleadt)
- Fix initialization of global state (JuliaGPU#471) (@maleadt)
- Remove 'view' implementation. (JuliaGPU#472) (@maleadt)
- Workaround new artifact"" eagerness that prevents loading on unsupported platforms (JuliaGPU#473) (@ianshmean)
- Remove BinaryProvider dep. (JuliaGPU#475) (@maleadt)
- typo: libcuda.dll -> libcuda.so on Linux (JuliaGPU#476) (@Alexander-Barth)
- NFC array simplifications. (JuliaGPU#481) (@maleadt)
- Update manifest (JuliaGPU#485) (@github-actions[bot])
- Convert AbstractArray{ComplexF64} to CuArray{ComplexF32} by default (JuliaGPU#489) (@pabloferz)
## CUDA v2.0.1

[Diff since v2.0.0](JuliaGPU/CUDA.jl@v2.0.0...v2.0.1)

**Closed issues:**

- Can't update (JuliaGPU#462)

**Merged pull requests:**

- Remove duplicate comment (JuliaGPU#464) (@blegat)
- Add functionality to precompile the runtime library. (JuliaGPU#465) (@maleadt)
- Update manifest (JuliaGPU#470) (@github-actions[bot])
## CUDA v2.0.0

[Diff since v1.3.3](JuliaGPU/CUDA.jl@v1.3.3...v2.0.0)

**Closed issues:**

- Test failure during threading tests (JuliaGPU#15)
- Bad allocations in memory pool after device_reset! (JuliaGPU#16)
- CuArrays can lose Blas on reshaped views (JuliaGPU#78)
- allowscalar performance (JuliaGPU#87)
- Indexing with a CuArrays causes a 'scalar indexing disallowed' error from checkbounds (JuliaGPU#90)
- 5-arg mul! for CUSPARSE (JuliaGPU#98)
- copyto!(Device, Host) uses scalar iteration in case of type mismatch (JuliaGPU#105)
- Array primitives broken for CUSPARSE arrays (JuliaGPU#113)
- SplittingPool: CPU allocations (JuliaGPU#117)
- error while concatenating to an empty CuArray (JuliaGPU#139)
- Showing sparse arrays goes wrong (JuliaGPU#146)
- Improve test coverage (JuliaGPU#147)
- CuArrays allocates a lot of memory on the default GPU (JuliaGPU#153)
- [Feature Request] Indexing CuArray with CuArray (JuliaGPU#155)
- Reshaping CuArray throws error during backpropagation (JuliaGPU#162)
- Match syntax and APIs against Julia 1.0 standard libraries (JuliaGPU#163)
- CURAND_STATUS_PREEXISTING_FAILURE when setting seed multiple times. (JuliaGPU#212)
- RFC: converts `SparseMatrixCSC` to `CuSparseMatrixCSR` via `cu` by default (JuliaGPU#216)
- Add a CuSparseMatrixCOO type (JuliaGPU#220)
- Test runner stumbles over path separators (JuliaGPU#236)
- Error: Invalid bitcode signature when loading CUDA.jl after precompilation (JuliaGPU#293)
- Atomic operations only work on global memory (JuliaGPU#311)
- Performance: cudnn algorithm selection (JuliaGPU#318)
- CUSPARSE is broken in CUDA.jl 1.2 (JuliaGPU#322)
- Device-side broadcast regression on 1.5 (JuliaGPU#350)
- API for fast math-like mode (JuliaGPU#354)
- CUDA 11.0 Update 1: cublasSetWorkspace (JuliaGPU#365)
- Can't precompile CUDA.jl on Kubuntu 20.04 (JuliaGPU#396)
- CuPtr should be Ptr in cudnnGetDropoutDescriptor (JuliaGPU#397)
- CUDA throws OOM error when initializing API on multiple devices (JuliaGPU#398)
- Cannot launch kernel with > 5 args using Dynamic Parallelism (JuliaGPU#401)
- Reverse performance regression (JuliaGPU#410)
- Tag for LLVM 3? (JuliaGPU#412)
- CUDA not working (JuliaGPU#415)
- `StatsBase.transform` fails on `CuArray` (JuliaGPU#426)
- Further unification of `CUBLAS.axpy!` and `LinearAlgebra.BLAS.axpy!` (JuliaGPU#432)
- size(range), length(range) and range[end] fail inside CUDA kernels (JuliaGPU#434)
- InitError: Cannot use memory pool 'binned' when CUDA.jl was precompiled for memory pool 'split'. (JuliaGPU#446)
- Missing dispatch for matrix multiplication with views? (JuliaGPU#448)
- New version not available yet? (JuliaGPU#452)
- using CUDA or CUArray, output: UndefVarError: AddrSpacePtr not defined (JuliaGPU#457)
- Unable to upgrade to the latest version (JuliaGPU#459)

**Merged pull requests:**

- Performance improvements by calling cuDNN API (JuliaGPU#321) (@gartangh)
- Use ccall wrapper for correct pointer type conversions (JuliaGPU#392) (@maleadt)
- Simplify Statistics.var and fix dims=tuple. (JuliaGPU#393) (@maleadt)
- Adapt to GPUArrays test change. (JuliaGPU#394) (@maleadt)
- Default to per-thread stream semantics (JuliaGPU#395) (@maleadt)
- Add a missing context argument for stateless codegen. (JuliaGPU#399) (@maleadt)
- Keep track of package latency timings. (JuliaGPU#400) (@maleadt)
- Update manifest (JuliaGPU#402) (@github-actions[bot])
- Latency improvements (JuliaGPU#403) (@maleadt)
- Fix bounds checking with GPU views. (JuliaGPU#404) (@maleadt)
- Force specialization for dynamic_cudacall to support more arguments. (JuliaGPU#407) (@maleadt)
- Fix some wrong pointer types in the CUDNN headers. (JuliaGPU#408) (@maleadt)
- Refactor CUSPARSE (JuliaGPU#409) (@maleadt)
- Fix typo (JuliaGPU#411) (@yixingfu)
- Update manifest (JuliaGPU#413) (@github-actions[bot])
- Simplify library wrappers by introducing a CUDA Ref (JuliaGPU#414) (@maleadt)
- Simplify and update wrappers (JuliaGPU#416) (@maleadt)
- GEMM improvements (JuliaGPU#417) (@maleadt)
- CompatHelper: add new compat entry for "BFloat16s" at version "0.1" (JuliaGPU#418) (@github-actions[bot])
- add CuSparseMatrixCOO (JuliaGPU#421) (@marius311)
- Update manifest (JuliaGPU#423) (@github-actions[bot])
- Global math mode for easy use of lower-precision functionality (JuliaGPU#424) (@maleadt)
- Improve init error message (JuliaGPU#425) (@maleadt)
- CUBLAS: wrap rot! to implement rotate! and reflect! (JuliaGPU#427) (@maleadt)
- CUFFT-related optimizations (JuliaGPU#428) (@maleadt)
- Fix reverse/view regression (JuliaGPU#429) (@maleadt)
- Update packages (JuliaGPU#433) (@maleadt)
- Introduce StridedCuArray (JuliaGPU#435) (@maleadt)
- Retry curandGenerateSeeds when OOM. (JuliaGPU#436) (@maleadt)
- Introduce DenseCuArray union (JuliaGPU#437) (@maleadt)
- Array simplifications (JuliaGPU#438) (@maleadt)
- Fix and test reverse on wrapped array. (JuliaGPU#439) (@maleadt)
- Fixes after recent array wrapper changes (JuliaGPU#441) (@maleadt)
- Adapt to GPUArrays changes. (JuliaGPU#442) (@maleadt)
- Provide CUBLAS with a pool-backed workspace. (JuliaGPU#443) (@maleadt)
- Fix finalization of copied arrays. (JuliaGPU#444) (@maleadt)
- Support for/Add CUDA 11.1 (JuliaGPU#445) (@maleadt)
- Update manifest (JuliaGPU#449) (@github-actions[bot])
- Allow use of strided vectors with mul! (gemv! and gemm!) (JuliaGPU#450) (@maleadt)
- Have convert call CuSparseArray's constructors. (JuliaGPU#451) (@maleadt)
## CUDA v1.3.3

[Diff since v1.3.2](JuliaGPU/CUDA.jl@v1.3.2...v1.3.3)

**Closed issues:**

- Type changing Array conversions give error when allowscalar(false) (JuliaGPU#344)
- getindex(::CuArray, ::Adjoint, ::Colon) fails (JuliaGPU#345)
- View with array indices causes memory copy before broadcast (JuliaGPU#384)
- Regression with Julia 1.5 (JuliaGPU#390)

**Merged pull requests:**

- Replace DevicePtr with Core.LLVMPtr. (JuliaGPU#199) (@maleadt)
- Make sure view indices reside on the GPU too. (JuliaGPU#388) (@maleadt)
- CompatHelper: Update DataStructures to v0.18 (JuliaGPU#389) (@ChrisRackauckas)
## CUDA v1.3.2

[Diff since v1.3.1](JuliaGPU/CUDA.jl@v1.3.1...v1.3.2)

**Closed issues:**

- LLVM WMMA errors (JuliaGPU#380)

**Merged pull requests:**

- Fix handling of tests to skip. (JuliaGPU#386) (@maleadt)
- Update manifest (JuliaGPU#387) (@github-actions[bot])
## CUDA v1.3.1

[Diff since v1.3.0](JuliaGPU/CUDA.jl@v1.3.0...v1.3.1)

**Closed issues:**

- Element-wise conversion fails (JuliaGPU#378)
- atomic_min fails for Int32 in global CuDeviceArrays (JuliaGPU#379)
- Segmentation fault from @cuprint on char (JuliaGPU#381)
- error in versioninfo(), name not defined (JuliaGPU#385)

**Merged pull requests:**

- Fix docs (JuliaGPU#330) (@maleadt)
- Wrap cusparseSpMV (JuliaGPU#351) (@marius311)
- specify Cchar rather than char in the doc for @cuprint (JuliaGPU#382) (@MasonProtter)
- Adapt to LLVM.jl changes for stateless codegen. (JuliaGPU#383) (@maleadt)
## CUDA v1.3.0

[Diff since v1.2.1](JuliaGPU/CUDA.jl@v1.2.1...v1.3.0)

**Closed issues:**

- Trouble with the @. macro (JuliaGPU#346)
- NVMLError: Not Supported (code 3) (JuliaGPU#348)
- Nvidia Xavier devices: exception thrown during kernel execution on device Xavier (JuliaGPU#349)
- Could not load CUTENSOR artifact dll on Windows 10 (JuliaGPU#355)
- CuTextureArray for 3D array (JuliaGPU#357)
- Bug in julia 1.5.0 I have CUDA 11.0 installed in Ubuntu 18.04 (JuliaGPU#360)
- Callback-based logging (JuliaGPU#366)
- Artifact download timeout (JuliaGPU#369)
- `sum!` accumulates when called multiple times (JuliaGPU#370)
- nvprof does not detect kernel launches (JuliaGPU#371)
- KernelError: passing and using non-bitstype argument (JuliaGPU#372)
- CUDA.jl fails to find libcudadevrt.a due on a cluster install with multi-arch target (JuliaGPU#376)

**Merged pull requests:**

- Make the memory allocator context-aware (JuliaGPU#253) (@maleadt)
- Update manifest (JuliaGPU#347) (@github-actions[bot])
- Guard against unsupported NVML usage in the test runner. (JuliaGPU#352) (@maleadt)
- Bump CUDNN to v8.0.2 (JuliaGPU#353) (@maleadt)
- Rework thread state management (JuliaGPU#356) (@maleadt)
- Update manifest (JuliaGPU#358) (@github-actions[bot])
- Memory allocator simplifications (JuliaGPU#361) (@maleadt)
- Deduplicate code from memory pools (JuliaGPU#362) (@maleadt)
- Fix show of ArrayBuffer. (JuliaGPU#363) (@maleadt)
- Clean-up the Buffer interface. (JuliaGPU#364) (@maleadt)
- Use callback APIs to get library debug logs. (JuliaGPU#367) (@maleadt)
- Allow selecting the memcheck tool. (JuliaGPU#368) (@maleadt)
- Update GPUArrays. (JuliaGPU#373) (@maleadt)
- Update to CUDA 11.0 update 1 (JuliaGPU#374) (@maleadt)
- Number and iterate devices in versioninfo() following CUDA. (JuliaGPU#375) (@maleadt)
- Reinstate support for Julia 1.3 (JuliaGPU#377) (@maleadt)