Tags: frapac/CUDA.jl
## CUDA v3.11.0

[Diff since v3.10.1](JuliaGPU/CUDA.jl@v3.10.1...v3.11.0)

**Closed issues:**

- CUSPARSE: Diagonal + CSC/CSR gives dense array (JuliaGPU#1469)
- CUBLAS: Multiplication of `UpperTriangular`/`LowerTriangular` not supported (JuliaGPU#1486)
- CUTENSOR tests consume lots of memory, breaking other tests (JuliaGPU#1501)
- CUFFT doesn't work for ComplexF64 C2C in-place (JuliaGPU#1519)
- Inconsistency of `==` and `isequal` for `CuArray` (JuliaGPU#1524)
- Setting CUDA seed the first time changes Random's RNG non-deterministically (JuliaGPU#1526)
- Undefined exported symbols (JuliaGPU#1527)
- Could not load library libLLVMExtra-14.dll (JuliaGPU#1535)
- Add an `rrule` for `cholesky` to `CUDA.jl` (JuliaGPU#1541)

**Merged pull requests:**

- specialize +/- op for sparse diag (JuliaGPU#1514) (@Roger-luo)
- Make sure instantiating RNGs doesn't affect the global CPU RNG. (JuliaGPU#1530) (@maleadt)
- Update manifest (JuliaGPU#1531) (@github-actions[bot])
- `ldiv!` for LU Decomposition (JuliaGPU#1532) (@SBuercklin)
- Lower dmax for contraction tests (JuliaGPU#1534) (@kshyatt)
- Fix convolution algorithm search (JuliaGPU#1536) (@maxfreu)
- Update manifest (JuliaGPU#1537) (@github-actions[bot])
- add specializations for some triangular-triangular multiplications (JuliaGPU#1538) (@Red-Portal)
- Add a utility to download artifacts without a functional driver. (JuliaGPU#1539) (@maleadt)
- Update manifest (JuliaGPU#1543) (@github-actions[bot])
- Explicit tests for type conversion (JuliaGPU#1544) (@kshyatt)
- Remove unused exports. (JuliaGPU#1545) (@maleadt)
## CUDA v3.10.1

[Diff since v3.10.0](JuliaGPU/CUDA.jl@v3.10.0...v3.10.1)

**Closed issues:**

- Overflow in `randn` using CUDA.jl's native RNG (JuliaGPU#1464)
- Segmentation fault with pre-compiled library importing CUDA (JuliaGPU#1465)
- Julia freezes when using Polynomials with CuArray (JuliaGPU#1497)
- Launch overhead regression (JuliaGPU#1503)
- CUSOLVER: Matrix division requires identical types (JuliaGPU#1512)
- Incorrect distribution for complex standard normals when using `CUDA.default_rng()` (JuliaGPU#1515)
- loggamma (JuliaGPU#1528)

**Merged pull requests:**

- CUSPARSE: Support mixed type mv (JuliaGPU#1475) (@Roger-luo)
- Add method for LinearAlgebra.opnorm2 (JuliaGPU#1516) (@danielwe)
- Promote to common eltype in matrix division (JuliaGPU#1517) (@danielwe)
- Fix Box-Muller transformation for complex eltypes (JuliaGPU#1518) (@danielwe)
- Update manifest (JuliaGPU#1521) (@github-actions[bot])
- Use at-dispose for LLVM.jl resource cleanup. (JuliaGPU#1523) (@maleadt)
- loggamma (JuliaGPU#1529) (@cossio)
## CUDA v3.10.0

[Diff since v3.9.1](JuliaGPU/CUDA.jl@v3.9.1...v3.10.0)

**Closed issues:**

- `Error while freeing DeviceBuffer`-warning when using multiple GPUs (JuliaGPU#1454)
- CUDNN cache locking prevents finalizers resulting in OOMs (JuliaGPU#1461)
- EOFError from pool_cleanup when closing REPL (JuliaGPU#1495)
- TypeError in compiler with custom kernel (JuliaGPU#1496)

**Merged pull requests:**

- expose sparse mv/mm algo selection (JuliaGPU#1201) (@Roger-luo)
- Always inspect the task-local context when verifying before freeing. (JuliaGPU#1462) (@maleadt)
- support sparse opnorm (JuliaGPU#1466) (@Roger-luo)
- Move CUSTATEVEC and CUTENSORNET into lib/ (JuliaGPU#1478) (@vchuravy)
- Adapt to GPUCompiler 0.15 changes (JuliaGPU#1488) (@maleadt)
- Limit time held by CUDNN locks. (JuliaGPU#1491) (@maleadt)
- Docstring for `cu` (JuliaGPU#1493) (@mcabbott)
- Update manifest (JuliaGPU#1499) (@github-actions[bot])
- Silence EOFError in pool_cleanup (JuliaGPU#1502) (@Octogonapus)
- Adapt to GPUCompiler changes (JuliaGPU#1504) (@maleadt)
- Fixes for CUSPARSE 11.7.1. (JuliaGPU#1505) (@maleadt)
- Update artifacts (JuliaGPU#1507) (@maleadt)
- Update manifest (JuliaGPU#1509) (@github-actions[bot])
- Add a new cache for HostKernel objects. (JuliaGPU#1510) (@maleadt)
## CUDA v3.9.1

[Diff since v3.9.0](JuliaGPU/CUDA.jl@v3.9.0...v3.9.1)

**Closed issues:**

- Issue with copy_cublasfloat (JuliaGPU#1476)
- Errors when broadcasting random number generators (JuliaGPU#1480)
- CPU version of linear algebra routine is dispatched when using `Zygote.gradient` (JuliaGPU#1481)
- `scan!` fails on vectors of structs (JuliaGPU#1482)
- InexactError when getting CUDA version info (JuliaGPU#1489)

**Merged pull requests:**

- Allow more integer argument types for byte_perm (JuliaGPU#1420) (@eschnett)
- support CuSparseMatrix(::Diagonal) (JuliaGPU#1470) (@Roger-luo)
- Don't emit debug info until the next CUDA version. (JuliaGPU#1473) (@maleadt)
- Update manifest (JuliaGPU#1474) (@github-actions[bot])
- Update manifest (JuliaGPU#1479) (@github-actions[bot])
- fix unsafe_wrap docstring and widen signature (JuliaGPU#1483) (@piever)
- Update manifest (JuliaGPU#1484) (@github-actions[bot])
- Check whether cudaRuntimeGetVersion succeeded. (JuliaGPU#1490) (@maleadt)
- Update manifest (JuliaGPU#1494) (@github-actions[bot])
- Fix JuliaGPU#1476: Allow any container in copy_cublasfloat (JuliaGPU#1498) (@danielwe)
## CUDA v3.9.0

[Diff since v3.8.5](JuliaGPU/CUDA.jl@v3.8.5...v3.9.0)

**Closed issues:**

- Tests for showing (JuliaGPU#35)
- Support LU factorizations (JuliaGPU#1193)
- Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR. Add more unit tests? (JuliaGPU#1442)
- Optional CPU cpu kernel call with @cuda (JuliaGPU#1443)
- Add library/artifact management for NCCL (JuliaGPU#1446)
- permutedims returns a lowertriangular matrix (JuliaGPU#1451)
- New broadcast corrupts memory? (JuliaGPU#1457)
- norm does not dispatch on CuSparseMatrixCSC (JuliaGPU#1460)
- scalar * sparse multiplication (JuliaGPU#1468)

**Merged pull requests:**

- CUTENSOR: axpy! and axpby! not mutating fixed (JuliaGPU#1416) (@yapanuwan)
- Initial wrap of cuquantum (JuliaGPU#1437) (@kshyatt)
- CompatHelper: bump compat for "GPUCompiler" to "0.14" (JuliaGPU#1441) (@github-actions[bot])
- Fix return type of nrm2 for ComplexF16 (JuliaGPU#1444) (@danielwe)
- Use a build matrix. (JuliaGPU#1445) (@maleadt)
- Update manifest (JuliaGPU#1447) (@github-actions[bot])
- Rework factorizations (JuliaGPU#1449) (@maleadt)
- Add NCCL binaries. (JuliaGPU#1450) (@maleadt)
- Support general eltypes in matrix division and SVD (JuliaGPU#1453) (@danielwe)
- Update manifest (JuliaGPU#1456) (@github-actions[bot])
- Look at more environment variables to find nsys. (JuliaGPU#1459) (@maleadt)
- Fixes for 1.8 (JuliaGPU#1463) (@maleadt)
## CUDA v3.8.5

[Diff since v3.8.4](JuliaGPU/CUDA.jl@v3.8.4...v3.8.5)

**Merged pull requests:**

- Update manifest (JuliaGPU#1440) (@github-actions[bot])
## CUDA v3.8.4

[Diff since v3.8.3](JuliaGPU/CUDA.jl@v3.8.3...v3.8.4)

**Closed issues:**

- sparse-sparse and sparse-constant multiplication lose sparsity (output dense matrix) (JuliaGPU#1264)
- LLVMExtra fails to load on Julia 1.8 and PPC (JuliaGPU#1387)
- compute-sanitizer CUDA_ERROR_INVALID_VALUE on CUDA.jl 3.0+ (JuliaGPU#1415)
- `@cudnnDescriptor` is not threadsafe (JuliaGPU#1421)
- Precomplication of CUDA 3.8.3 broken on 1.7.1 due to changes in Random123.jl (JuliaGPU#1422)
- OOM error should include memory status (JuliaGPU#1427)
- WMMA kernel works with Julia 1.7.2 but fails with `illegal memory access` for Julia 1.8.0-beta1 (JuliaGPU#1431)
- Non Int64 local memory size leads to dynamic function invocation (JuliaGPU#1434)
- "initialization" test failing (JuliaGPU#1435)
- cuda with julia 1.8 not working on windows (working fine(?) on wsl2) (JuliaGPU#1436)

**Merged pull requests:**

- Add Int8 WMMA Support (JuliaGPU#1119) (@max-Hawkins)
- Wrap generic sparse-sparse GEMM (JuliaGPU#1285) (@kshyatt)
- Fix sparse COO to CSR conversion. (JuliaGPU#1412) (@maleadt)
- Drop support for CUDA 10.1 and below (JuliaGPU#1414) (@maleadt)
- Update manifest (JuliaGPU#1417) (@github-actions[bot])
- Report the OOM memory status at the time of the error. (JuliaGPU#1428) (@maleadt)
- Lock CUDNN descriptor cache lookups. (JuliaGPU#1430) (@maleadt)
- Switch to new LLVM context management for 1.9 compatibility. (JuliaGPU#1432) (@maleadt)
- Update manifest (JuliaGPU#1433) (@github-actions[bot])
- Backports for 3.8.4 (JuliaGPU#1438) (@maleadt)
## CUDA v3.8.3

[Diff since v3.8.2](JuliaGPU/CUDA.jl@v3.8.2...v3.8.3)

**Closed issues:**

- Sparse matrix addition not working (JuliaGPU#528)
- Native implementation of sparse arrays (JuliaGPU#829)
- CUSPARSE: Adding a value to the diagonal (JuliaGPU#1372)
- Conversion by `cu` casts Float64 to Float32 but not Int64 to Int32 (JuliaGPU#1388)
- `CUDA.math_mode!(...; precision)` option not working (JuliaGPU#1392)
- `cuIpcGetMemHandle` failure resulting in CUDA-aware MPI to fail (JuliaGPU#1398)
- axpby! support for BFloat16 (JuliaGPU#1399)
- CUSPARSE does not support integer matrices, breaks printing (JuliaGPU#1402)
- `sparse(I, J, V)` doesn't support unsorted inputs (JuliaGPU#1407)

**Merged pull requests:**

- General purpose broadcast for sparse CSR matrices. (JuliaGPU#1380) (@maleadt)
- Update manifest (JuliaGPU#1389) (@github-actions[bot])
- Implement sparse operations with UniformScaling using broadcast. (JuliaGPU#1390) (@maleadt)
- Prevent toplevel compilation. (JuliaGPU#1391) (@maleadt)
- Fix and test math precision. (JuliaGPU#1394) (@maleadt)
- Bump artifacts (JuliaGPU#1397) (@maleadt)
- support BFloat16 for atomic_cas (JuliaGPU#1400) (@bjarthur)
- Implement sparse broadcasting with CSC matrices. (JuliaGPU#1401) (@maleadt)
- Always report issues with discovering CUDA. (JuliaGPU#1404) (@maleadt)
- Fix sparse 1-argument broadcast output type. (JuliaGPU#1405) (@maleadt)
- CUSPARSE BSR improvements (JuliaGPU#1409) (@maleadt)
- Support limited sparse integer arrays by bitcasting to floating point. (JuliaGPU#1410) (@maleadt)
- Support using sparse with unsorted inputs. (JuliaGPU#1411) (@maleadt)
- Backports for 3.8.3 (JuliaGPU#1413) (@maleadt)
## CUDA v3.8.2

[Diff since v3.8.1](JuliaGPU/CUDA.jl@v3.8.1...v3.8.2)

**Closed issues:**

- CuSparseMatrixCSC missing lu and interactions with UniformScaling (JuliaGPU#79)
- CUSPARSE typo (JuliaGPU#1231)
- similar(A::CuSparse,eltype) returns an Array (JuliaGPU#1316)
- "errormonitor" undefined in julia1.6 (JuliaGPU#1375)
- Pool free can switch tasks (JuliaGPU#1384)

**Merged pull requests:**

- Define a compatibility shim for errormonitor (JuliaGPU#1378) (@vchuravy)
- Backport JuliaGPU#1361 to 3.8 (JuliaGPU#1379) (@vchuravy)
- Backports for 3.8.2 (JuliaGPU#1381) (@maleadt)
- Remove broken errormonitor implementation, just don't use it on 1.6. (JuliaGPU#1382) (@maleadt)
- Memory pool improvements (JuliaGPU#1383) (@maleadt)
## CUDA v3.8.1

[Diff since v3.8.0](JuliaGPU/CUDA.jl@v3.8.0...v3.8.1)

**Closed issues:**

- `one(::CuMatrix)` result on cpu (JuliaGPU#142)
- Broadcasted setindex! triggers scalar setindex! (JuliaGPU#101)
- OutOfGPUMemoryError With Available Memory (JuliaGPU#1346)
- Distributions.jl with CuArrays (JuliaGPU#1347)
- Views of Flux OneHotArrays (JuliaGPU#1349)
- synchronize(blocking = false) hangs in julia 1.7 eventually (JuliaGPU#1350)
- unsupported call through a literal pointer (call to log1pf) on Julia 1.6.5 (JuliaGPU#1352)
- SpecialFunctions ^1.8 compat entry? (JuliaGPU#1354)
- Performance deprecation using `^` on Float32 (JuliaGPU#1358)
- Method definition setindex!(LinearAlgebra.Diagonal{T, V} ... overwritten in module CUDA (JuliaGPU#1364)
- [PackageCompiler] Segmentation fault with CUDA.jl in multiversioning (JuliaGPU#1365)
- Vectors in customary structs make julia stuck (JuliaGPU#1366)
- sparseCSC-dense matrix multiplication yields unstable results (JuliaGPU#1368)
- UndefVarError: parameters not defined on Windows10 (JuliaGPU#1371)

**Merged pull requests:**

- Optimize memoization helpers. (JuliaGPU#1345) (@maleadt)
- Update manifest (JuliaGPU#1348) (@github-actions[bot])
- Update manifest (JuliaGPU#1355) (@github-actions[bot])
- Fastmath improvements (JuliaGPU#1356) (@maleadt)
- Make the default pool visible when doing P2P (JuliaGPU#1357) (@maleadt)
- Fix resize of empty arrays. (JuliaGPU#1359) (@maleadt)
- CUSPARSE: add COO ctors and similar with eltype. (JuliaGPU#1360) (@maleadt)
- Add device_override for SpecialFunctions.gamma (JuliaGPU#1361) (@vchuravy)
- Implement (limited) broadcast of sparse arrays (JuliaGPU#1367) (@maleadt)
- Make nonblocking synchronization robust to errors. (JuliaGPU#1369) (@maleadt)
- Update manifest (JuliaGPU#1370) (@github-actions[bot])
- Backports for 3.8.1 (JuliaGPU#1374) (@maleadt)