Tags · frapac/CUDA.jl

v3.11.0

## CUDA v3.11.0

[Diff since v3.10.1](JuliaGPU/CUDA.jl@v3.10.1...v3.11.0)


**Closed issues:**
- CUSPARSE: Diagonal + CSC/CSR gives dense array (JuliaGPU#1469)
- CUBLAS: Multiplication of `UpperTriangular`/`LowerTriangular` not supported (JuliaGPU#1486)
- CUTENSOR tests consume lots of memory, breaking other tests (JuliaGPU#1501)
- CUFFT doesn't work for ComplexF64 C2C in-place (JuliaGPU#1519)
- Inconsistency of `==` and `isequal` for `CuArray` (JuliaGPU#1524)
- Setting CUDA seed the first time changes Random's RNG non-deterministically (JuliaGPU#1526)
- Undefined exported symbols (JuliaGPU#1527)
- Could not load library libLLVMExtra-14.dll (JuliaGPU#1535)
- Add an `rrule` for `cholesky` to `CUDA.jl` (JuliaGPU#1541)

**Merged pull requests:**
- specialize +/- op for sparse diag (JuliaGPU#1514) (@Roger-luo)
- Make sure instantiating RNGs doesn't affect the global CPU RNG. (JuliaGPU#1530) (@maleadt)
- Update manifest (JuliaGPU#1531) (@github-actions[bot])
- `ldiv!` for LU Decomposition (JuliaGPU#1532) (@SBuercklin)
- Lower dmax for contraction tests (JuliaGPU#1534) (@kshyatt)
- Fix convolution algorithm search (JuliaGPU#1536) (@maxfreu)
- Update manifest (JuliaGPU#1537) (@github-actions[bot])
- add specializations for some triangular-triangular multiplications (JuliaGPU#1538) (@Red-Portal)
- Add a utility to download artifacts without a functional driver. (JuliaGPU#1539) (@maleadt)
- Update manifest (JuliaGPU#1543) (@github-actions[bot])
- Explicit tests for type conversion (JuliaGPU#1544) (@kshyatt)
- Remove unused exports. (JuliaGPU#1545) (@maleadt)

Jun 15, 2022
15a0e1d

v3.10.1

## CUDA v3.10.1

[Diff since v3.10.0](JuliaGPU/CUDA.jl@v3.10.0...v3.10.1)


**Closed issues:**
- Overflow in `randn` using CUDA.jl's native RNG (JuliaGPU#1464)
- Segmentation fault with pre-compiled library importing CUDA (JuliaGPU#1465)
- Julia freezes when using Polynomials with CuArray (JuliaGPU#1497)
- Launch overhead regression (JuliaGPU#1503)
- CUSOLVER: Matrix division requires identical types (JuliaGPU#1512)
- Incorrect distribution for complex standard normals when using `CUDA.default_rng()` (JuliaGPU#1515)
- loggamma (JuliaGPU#1528)

**Merged pull requests:**
- CUSPARSE: Support mixed type mv (JuliaGPU#1475) (@Roger-luo)
- Add method for LinearAlgebra.opnorm2 (JuliaGPU#1516) (@danielwe)
- Promote to common eltype in matrix division (JuliaGPU#1517) (@danielwe)
- Fix Box-Muller transformation for complex eltypes (JuliaGPU#1518) (@danielwe)
- Update manifest (JuliaGPU#1521) (@github-actions[bot])
- Use at-dispose for LLVM.jl resource cleanup. (JuliaGPU#1523) (@maleadt)
- loggamma (JuliaGPU#1529) (@cossio)

May 27, 2022
49902d8

v3.10.0

## CUDA v3.10.0

[Diff since v3.9.1](JuliaGPU/CUDA.jl@v3.9.1...v3.10.0)


**Closed issues:**
- `Error while freeing DeviceBuffer`-warning when using multiple GPUs (JuliaGPU#1454)
- CUDNN cache locking prevents finalizers resulting in OOMs (JuliaGPU#1461)
- EOFError from pool_cleanup when closing REPL (JuliaGPU#1495)
- TypeError in compiler with custom kernel (JuliaGPU#1496)

**Merged pull requests:**
- expose sparse mv/mm algo selection (JuliaGPU#1201) (@Roger-luo)
- Always inspect the task-local context when verifying before freeing. (JuliaGPU#1462) (@maleadt)
- support sparse opnorm (JuliaGPU#1466) (@Roger-luo)
- Move CUSTATEVEC and CUTENSORNET into lib/ (JuliaGPU#1478) (@vchuravy)
- Adapt to GPUCompiler 0.15 changes (JuliaGPU#1488) (@maleadt)
- Limit time held by CUDNN locks. (JuliaGPU#1491) (@maleadt)
- Docstring for `cu` (JuliaGPU#1493) (@mcabbott)
- Update manifest (JuliaGPU#1499) (@github-actions[bot])
- Silence EOFError in pool_cleanup (JuliaGPU#1502) (@Octogonapus)
- Adapt to GPUCompiler changes (JuliaGPU#1504) (@maleadt)
- Fixes for CUSPARSE 11.7.1. (JuliaGPU#1505) (@maleadt)
- Update artifacts (JuliaGPU#1507) (@maleadt)
- Update manifest (JuliaGPU#1509) (@github-actions[bot])
- Add a new cache for HostKernel objects. (JuliaGPU#1510) (@maleadt)

May 16, 2022
044bd98

v3.9.1

## CUDA v3.9.1

[Diff since v3.9.0](JuliaGPU/CUDA.jl@v3.9.0...v3.9.1)


**Closed issues:**
- Issue with copy_cublasfloat (JuliaGPU#1476)
- Errors when broadcasting random number generators (JuliaGPU#1480)
- CPU version of linear algebra routine is dispatched when using `Zygote.gradient` (JuliaGPU#1481)
- `scan!` fails on vectors of structs (JuliaGPU#1482)
- InexactError when getting CUDA version info (JuliaGPU#1489)

**Merged pull requests:**
- Allow more integer argument types for byte_perm (JuliaGPU#1420) (@eschnett)
- support CuSparseMatrix(::Diagonal) (JuliaGPU#1470) (@Roger-luo)
- Don't emit debug info until the next CUDA version. (JuliaGPU#1473) (@maleadt)
- Update manifest (JuliaGPU#1474) (@github-actions[bot])
- Update manifest (JuliaGPU#1479) (@github-actions[bot])
- fix unsafe_wrap docstring and widen signature (JuliaGPU#1483) (@piever)
- Update manifest (JuliaGPU#1484) (@github-actions[bot])
- Check whether cudaRuntimeGetVersion succeeded. (JuliaGPU#1490) (@maleadt)
- Update manifest (JuliaGPU#1494) (@github-actions[bot])
- Fix JuliaGPU#1476: Allow any container in copy_cublasfloat (JuliaGPU#1498) (@danielwe)

May 8, 2022
b7e60f5

v3.9.0

## CUDA v3.9.0

[Diff since v3.8.5](JuliaGPU/CUDA.jl@v3.8.5...v3.9.0)


**Closed issues:**
- Tests for showing (JuliaGPU#35)
- Support LU factorizations (JuliaGPU#1193)
- Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR. Add more unit tests? (JuliaGPU#1442)
- Optional CPU kernel call with @cuda (JuliaGPU#1443)
- Add library/artifact management for NCCL (JuliaGPU#1446)
- permutedims returns a lowertriangular matrix (JuliaGPU#1451)
- New broadcast corrupts memory? (JuliaGPU#1457)
- norm does not dispatch on CuSparseMatrixCSC (JuliaGPU#1460)
- scalar * sparse multiplication (JuliaGPU#1468)

**Merged pull requests:**
- CUTENSOR: axpy! and axpby! not mutating fixed (JuliaGPU#1416) (@yapanuwan)
- Initial wrap of cuquantum (JuliaGPU#1437) (@kshyatt)
- CompatHelper: bump compat for "GPUCompiler" to "0.14" (JuliaGPU#1441) (@github-actions[bot])
- Fix return type of nrm2 for ComplexF16 (JuliaGPU#1444) (@danielwe)
- Use a build matrix. (JuliaGPU#1445) (@maleadt)
- Update manifest (JuliaGPU#1447) (@github-actions[bot])
- Rework factorizations (JuliaGPU#1449) (@maleadt)
- Add NCCL binaries. (JuliaGPU#1450) (@maleadt)
- Support general eltypes in matrix division and SVD (JuliaGPU#1453) (@danielwe)
- Update manifest (JuliaGPU#1456) (@github-actions[bot])
- Look at more environment variables to find nsys. (JuliaGPU#1459) (@maleadt)
- Fixes for 1.8 (JuliaGPU#1463) (@maleadt)

Apr 9, 2022
5c40438

v3.8.5

## CUDA v3.8.5

[Diff since v3.8.4](JuliaGPU/CUDA.jl@v3.8.4...v3.8.5)



**Merged pull requests:**
- Update manifest (JuliaGPU#1440) (@github-actions[bot])

Mar 14, 2022
a57345a

v3.8.4

## CUDA v3.8.4

[Diff since v3.8.3](JuliaGPU/CUDA.jl@v3.8.3...v3.8.4)


**Closed issues:**
- sparse-sparse and sparse-constant multiplication lose sparsity (output dense matrix) (JuliaGPU#1264)
- LLVMExtra fails to load on Julia 1.8 and PPC (JuliaGPU#1387)
- compute-sanitizer CUDA_ERROR_INVALID_VALUE on CUDA.jl 3.0+ (JuliaGPU#1415)
- `@cudnnDescriptor` is not threadsafe (JuliaGPU#1421)
- Precompilation of CUDA 3.8.3 broken on 1.7.1 due to changes in Random123.jl (JuliaGPU#1422)
- OOM error should include memory status (JuliaGPU#1427)
- WMMA kernel works with Julia 1.7.2 but fails with `illegal memory access` for Julia 1.8.0-beta1 (JuliaGPU#1431)
- Non Int64 local memory size leads to dynamic function invocation (JuliaGPU#1434)
- "initialization" test failing (JuliaGPU#1435)
- cuda with julia 1.8 not working on windows (working fine(?) on wsl2) (JuliaGPU#1436)

**Merged pull requests:**
- Add Int8 WMMA Support (JuliaGPU#1119) (@max-Hawkins)
- Wrap generic sparse-sparse GEMM (JuliaGPU#1285) (@kshyatt)
- Fix sparse COO to CSR conversion. (JuliaGPU#1412) (@maleadt)
- Drop support for CUDA 10.1 and below (JuliaGPU#1414) (@maleadt)
- Update manifest (JuliaGPU#1417) (@github-actions[bot])
- Report the OOM memory status at the time of the error. (JuliaGPU#1428) (@maleadt)
- Lock CUDNN descriptor cache lookups. (JuliaGPU#1430) (@maleadt)
- Switch to new LLVM context management for 1.9 compatibility. (JuliaGPU#1432) (@maleadt)
- Update manifest (JuliaGPU#1433) (@github-actions[bot])
- Backports for 3.8.4 (JuliaGPU#1438) (@maleadt)

Mar 11, 2022
1526aad

v3.8.3

## CUDA v3.8.3

[Diff since v3.8.2](JuliaGPU/CUDA.jl@v3.8.2...v3.8.3)


**Closed issues:**
- Sparse matrix addition not working (JuliaGPU#528)
- Native implementation of sparse arrays (JuliaGPU#829)
- CUSPARSE: Adding a value to the diagonal (JuliaGPU#1372)
- Conversion by `cu` casts Float64 to Float32 but not Int64 to Int32 (JuliaGPU#1388)
- `CUDA.math_mode!(...; precision)` option not working (JuliaGPU#1392)
- `cuIpcGetMemHandle` failure resulting in CUDA-aware MPI to fail (JuliaGPU#1398)
- axpby! support for BFloat16 (JuliaGPU#1399)
- CUSPARSE does not support integer matrices, breaks printing (JuliaGPU#1402)
- `sparse(I, J, V)` doesn't support unsorted inputs (JuliaGPU#1407)

**Merged pull requests:**
- General purpose broadcast for sparse CSR matrices. (JuliaGPU#1380) (@maleadt)
- Update manifest (JuliaGPU#1389) (@github-actions[bot])
- Implement sparse operations with UniformScaling using broadcast. (JuliaGPU#1390) (@maleadt)
- Prevent toplevel compilation. (JuliaGPU#1391) (@maleadt)
- Fix and test math precision. (JuliaGPU#1394) (@maleadt)
- Bump artifacts (JuliaGPU#1397) (@maleadt)
- support BFloat16 for atomic_cas (JuliaGPU#1400) (@bjarthur)
- Implement sparse broadcasting with CSC matrices. (JuliaGPU#1401) (@maleadt)
- Always report issues with discovering CUDA. (JuliaGPU#1404) (@maleadt)
- Fix sparse 1-argument broadcast output type. (JuliaGPU#1405) (@maleadt)
- CUSPARSE BSR improvements (JuliaGPU#1409) (@maleadt)
- Support limited sparse integer arrays by bitcasting to floating point. (JuliaGPU#1410) (@maleadt)
- Support using sparse with unsorted inputs. (JuliaGPU#1411) (@maleadt)
- Backports for 3.8.3 (JuliaGPU#1413) (@maleadt)

Feb 25, 2022
2319b89

v3.8.2

## CUDA v3.8.2

[Diff since v3.8.1](JuliaGPU/CUDA.jl@v3.8.1...v3.8.2)


**Closed issues:**
- CuSparseMatrixCSC missing lu and interactions with UniformScaling (JuliaGPU#79)
- CUSPARSE typo (JuliaGPU#1231)
- similar(A::CuSparse,eltype) returns an Array (JuliaGPU#1316)
- "errormonitor" undefined in julia1.6 (JuliaGPU#1375)
- Pool free can switch tasks (JuliaGPU#1384)

**Merged pull requests:**
- Define a compatibility shim for errormonitor (JuliaGPU#1378) (@vchuravy)
- Backport JuliaGPU#1361 to 3.8 (JuliaGPU#1379) (@vchuravy)
- Backports for 3.8.2 (JuliaGPU#1381) (@maleadt)
- Remove broken errormonitor implementation, just don't use it on 1.6. (JuliaGPU#1382) (@maleadt)
- Memory pool improvements (JuliaGPU#1383) (@maleadt)

Feb 18, 2022
46db50d

v3.8.1

## CUDA v3.8.1

[Diff since v3.8.0](JuliaGPU/CUDA.jl@v3.8.0...v3.8.1)


**Closed issues:**
- `one(::CuMatrix)` result on cpu (JuliaGPU#142)
- Broadcasted setindex! triggers scalar setindex! (JuliaGPU#101)
- OutOfGPUMemoryError With Available Memory (JuliaGPU#1346)
- Distributions.jl with CuArrays (JuliaGPU#1347)
- Views of Flux OneHotArrays (JuliaGPU#1349)
- synchronize(blocking = false) hangs in julia 1.7 eventually (JuliaGPU#1350)
- unsupported call through a literal pointer (call to log1pf) on Julia 1.6.5 (JuliaGPU#1352)
- SpecialFunctions ^1.8 compat entry? (JuliaGPU#1354)
- Performance deprecation using `^` on Float32 (JuliaGPU#1358)
- Method definition setindex!(LinearAlgebra.Diagonal{T, V} ... overwritten in module CUDA (JuliaGPU#1364)
- [PackageCompiler] Segmentation fault with CUDA.jl in multiversioning (JuliaGPU#1365)
- Vectors in customary structs make julia stuck (JuliaGPU#1366)
- sparseCSC-dense matrix multiplication yields unstable results (JuliaGPU#1368)
- UndefVarError: parameters not defined on Windows10 (JuliaGPU#1371)

**Merged pull requests:**
- Optimize memoization helpers. (JuliaGPU#1345) (@maleadt)
- Update manifest (JuliaGPU#1348) (@github-actions[bot])
- Update manifest (JuliaGPU#1355) (@github-actions[bot])
- Fastmath improvements (JuliaGPU#1356) (@maleadt)
- Make the default pool visible when doing P2P (JuliaGPU#1357) (@maleadt)
- Fix resize of empty arrays. (JuliaGPU#1359) (@maleadt)
- CUSPARSE: add COO ctors and similar with eltype. (JuliaGPU#1360) (@maleadt)
- Add device_override for SpecialFunctions.gamma (JuliaGPU#1361) (@vchuravy)
- Implement (limited) broadcast of sparse arrays (JuliaGPU#1367) (@maleadt)
- Make nonblocking synchronization robust to errors. (JuliaGPU#1369) (@maleadt)
- Update manifest (JuliaGPU#1370) (@github-actions[bot])
- Backports for 3.8.1 (JuliaGPU#1374) (@maleadt)

Feb 15, 2022
9d04926
