Please see https://github.com/rapidsai/raft/releases/tag/v21.12.00a for the latest changes to this development branch.
- Miscellaneous tech debts/cleanups (#286) @viclafargue
- Accounting for rmm::cuda_stream_pool not having a constructor for 0 streams (#329) @divyegala
- Fix wrong lda parameter in gemv (#327) @achirkin
- Fix `matrixVectorOp` to verify promoted pointer type is still aligned to vectorized load boundary (#325) @viclafargue
- Pin rmm to branch-21.10 and remove warnings from kmeans.hpp (#322) @dantegd
- Temporarily pin RMM while refactor removes deprecated calls (#315) @dantegd
- Fix more warnings (#311) @harrism
- Add Hamming, Jensen-Shannon, KL-Divergence, Russell rao and Correlation distance metrics support (#306) @mdoijade
- Pin max `dask` and `distributed` versions to `2021.09.1` (#334) @galipremsagar
- Make sure we keep the rapids-cmake and raft cal version in sync (#331) @robertmaynard
- Add broadcast with const input iterator (#328) @seunghwak
- Fused L2 (unexpanded) kNN kernel for NN <= 64, without using temporary gmem to store intermediate distances (#324) @mdoijade
- Update with rapids cmake new features (#320) @robertmaynard
- Update to UCX-Py 0.22 (#319) @pentschev
- Fix Forward-Merge Conflicts (#318) @ajschmidt8
- Enable CUDA device code warnings as errors (#307) @harrism
- Remove max version pin for dask & distributed on development branch (#303) @galipremsagar
- Warnings are errors (#299) @harrism
- Use the new RAPIDS.cmake to fetch rapids-cmake (#298) @robertmaynard
- ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#295) @dillon-cullinan
- Miscellaneous tech debts/cleanups (#286) @viclafargue
- Random Ball Cover Algorithm for 2D Haversine/Euclidean (#213) @cjnolet
- expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
- Fix support for different input and output types in linalg::reduce (#296) @Nyrio
- Const raft handle in sparse bfknn (#280) @cjnolet
- Add `cuco::cuco` to list of linked libraries (#279) @trxcllnt
- Use nested include in destination of install headers to avoid docker permission issues (#263) @dantegd
- Update UCX-Py version to 0.21 (#255) @pentschev
- Fix mst knn test build failure due to RMM device_buffer change (#253) @mdoijade
- Add chebyshev, canberra, minkowski and hellinger distance metrics (#276) @mdoijade
- Move FAISS ANN wrappers to RAFT (#265) @cjnolet
- Remaining sparse semiring distances (#261) @cjnolet
- removing divye from codeowners (#257) @divyegala
- Pinning cuco to a specific commit hash for release (#304) @rlratzel
- Pin max `dask` & `distributed` versions (#301) @galipremsagar
- Overlap epilog compute with ldg of next grid stride in pairwise distance & fusedL2NN kernels (#292) @mdoijade
- Always add faiss library alias if it's missing (#287) @trxcllnt
- Use `NVIDIA/cuCollections` repo again (#284) @trxcllnt
- Use the 21.08 branch of rapids-cmake as rmm requires it (#278) @robertmaynard
- expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
- Fix `21.08` forward-merge conflicts (#274) @ajschmidt8
- Add lds and sts inline ptx instructions to force vector instruction generation (#273) @mdoijade
- Move ANN to RAFT (additional updates) (#270) @cjnolet
- Sparse semirings cleanup + hash table & batching strategies (#269) @divyegala
- Revert "pin dask versions in CI (#260)" (#264) @ajschmidt8
- Pass stream to device_scalar::value() calls. (#259) @harrism
- Update get_rmm.cmake to better support CalVer (#258) @harrism
- Add Grid stride pairwise dist and fused L2 NN kernels (#250) @mdoijade
- Fix merge conflicts (#236) @ajschmidt8
- Update UCX-Py version to 0.20 (#254) @pentschev
- cuco git tag update (again) (#248) @seunghwak
- Revert PR #232 for 21.06 release (#246) @dantegd
- Python comms to hold onto server endpoints (#241) @cjnolet
- Fix Thrust 1.12 compile errors (#231) @trxcllnt
- Make sure we use CalVer when checking out rapids-cmake (#230) @robertmaynard
- Loss of Precision in MST weight alteration (#223) @divyegala
- cuco git tag update (#243) @seunghwak
- Update `CHANGELOG.md` links for calver (#233) @ajschmidt8
- Add Grid stride pairwise dist and fused L2 NN kernels (#232) @mdoijade
- Updates to enable HDBSCAN (#208) @cjnolet
- Exposing spectral random seed property (#193) @cjnolet
- Fix pointer arithmetic in spmv smem kernel (#183) @lowener
- Modify default value for rowMajorIndex and rowMajorQuery in bf-knn (#173) @viclafargue
- Remove setCudaMallocWarning() call for libfaiss@v1.7.0 (#167) @trxcllnt
- Add const to KNN handle (#157) @hlinsen
- Fixing codeowners (#194) @cjnolet
- Adjust Hellinger pairwise distance to avoid NaNs (#189) @lowener
- Add column major input support in contractions_nt kernels with new kernel policy for it (#188) @mdoijade
- Dice formula correction (#186) @lowener
- Scaling knn graph fix connectivities algorithm (#181) @cjnolet
- Fixing RAFT CI & a few small updates for SLHC Python wrapper (#178) @cjnolet
- Add Precomputed to the DistanceType enum (for cuML DBSCAN) (#177) @Nyrio
- Enable matrix::copyRows for row major input (#176) @tfeher
- Add Dice distance to distancetype enum (#174) @lowener
- Porting over recent updates to distance prim from cuml (#172) @cjnolet
- Update KNN (#171) @viclafargue
- Adding translations parameter to brute_force_knn (#170) @viclafargue
- Update Changelog Link (#169) @ajschmidt8
- Map operation (#168) @viclafargue
- Updating sparse prims based on recent changes (#166) @cjnolet
- Prepare Changelog for Automation (#164) @ajschmidt8
- Update 0.18 changelog entry (#163) @ajschmidt8
- MST symmetric/non-symmetric output for SLHC (#162) @divyegala
- Pass pre-computed colors to MST (#154) @divyegala
- Streams upgrade in RAFT handle (RMM backend + create handle from parent's pool) (#148) @afender
- Merge branch-0.18 into 0.19 (#146) @dantegd
- Add device_send, device_recv, device_sendrecv, device_multicast_sendrecv (#144) @seunghwak
- Adding SLHC prims. (#140) @cjnolet
- Moving cuml sparse prims to raft (#139) @cjnolet
- Make NCCL root initialization configurable. (#120) @drobison00
- Add idx_t template parameter to matrix helper routines (#131) @tfeher
- Eliminate CUDA 10.2 as valid for large svd solving (#129) @wphicks
- Update check to allow svd solver on CUDA>=10.2 (#125) @wphicks
- Updating gpu build.sh and debugging threads CI issue (#123) @dantegd
- Adding additional distances (#116) @cjnolet
- Update stale GHA with exemptions & new labels (#152) @mike-wendt
- Add GHA to mark issues/prs as stale/rotten (#150) @Ethyling
- Prepare Changelog for Automation (#135) @ajschmidt8
- Adding Jensen-Shannon and BrayCurtis to DistanceType for Nearest Neighbors (#132) @lowener
- Add brute force KNN (#126) @hlinsen
- Make NCCL root initialization configurable. (#120) @drobison00
- Auto-label PRs based on their content (#117) @jolorunyomi
- Add gather & gatherv to raft::comms::comms_t (#114) @seunghwak
- Adding canberra and chebyshev to distance types (#99) @cjnolet
- Gpuciscripts clean and update (#92) @msadang
- PR #65: Adding cuml prims that break circular dependency between cuml and cumlprims projects
- PR #101: MST core solver
- PR #93: Incorporate Date/Nagi implementation of Hungarian Algorithm
- PR #94: Allow generic reductions for the map then reduce op
- PR #95: Cholesky rank one update prim
- PR #108: Remove unused old-gpubuild.sh
- PR #73: Move DistanceType enum from cuML to RAFT
- PR #92: Cleanup gpuCI scripts
- PR #98: Adding InnerProduct to DistanceType
- PR #103: Epsilon parameter for Cholesky rank one update
- PR #100: Add divyegala as codeowner
- PR #111: Cleanup gpuCI scripts
- PR #120: Update NCCL init process to support root node placement.
- PR #106: Specify dependency branches to avoid pip resolver failure
- PR #77: Fixing CUB include for CUDA < 11
- PR #86: Missing headers for newly moved prims
- PR #102: Check alignment before binaryOp dispatch
- PR #104: Fix update-version.sh
- PR #109: Fixing Incorrect Deallocation Size and Count Bugs
- PR #63: Adding MPI comms implementation
- PR #70: Adding CUB to RAFT cmake
- PR #59: Adding csrgemm2 to cusparse_wrappers.h
- PR #61: Add cusparsecsr2dense to cusparse_wrappers.h
- PR #62: Adding `get_device_allocator` to `handle.pxd`
- PR #67: Remove dependence on run-time type info
- PR #56: Fix compiler warnings.
- PR #64: Remove `cublas_try` from `cusolver_wrappers.h`
- PR #66: Fixing typo `get_stream` to `getStream` in `handle.pyx`
- PR #68: Change the type of recvcounts & displs in allgatherv from size_t[] to size_t* and int[] to size_t*, respectively.
- PR #69: Updates for RMM being header only
- PR #74: Fix std_comms::comm_split bug
- PR #79: remove debug print statements
- PR #81: temporarily expose internal NCCL communicator
- PR #12: Spectral clustering.
- PR #7: Migrating cuml comms -> raft comms_t
- PR #18: Adding commsplit to cuml communicator
- PR #15: add exception based error handling macros
- PR #29: Add ceildiv functionality
- PR #44: Add get_subcomm and set_subcomm to handle_t
- PR #13: Add RMM_INCLUDE and RMM_LIBRARY options to allow linking to non-conda RMM
- PR #22: Preserve order in comms workers for rank initialization
- PR #38: Remove #include <cudar_utils.h> from `raft/mr/`
- PR #39: Adding a virtual destructor to `raft::handle_t` and `raft::comms::comms_t`
- PR #37: Clean-up CUDA related utilities
- PR #41: Upgrade to `cusparseSpMV()`, alg selection, and rectangular matrices.
- PR #45: Add Ampere target to cuda11 cmake
- PR #47: Use gtest conda package in CMake/build.sh by default
- PR #17: Make destructor inline to avoid redeclaration error
- PR #25: Fix bug in handle_t::get_internal_streams
- PR #26: Fix bug in RAFT_EXPECTS (add parentheses surrounding cond)
- PR #34: Fix issue with incorrect docker image being used in local build script
- PR #35: Remove #include <nccl.h> from `raft/error.hpp`
- PR #40: Preemptively fixed future CUDA 11 related errors.
- PR #43: Fixed CUDA version selection mechanism for SpMV.
- PR #46: Fix for cpp file extension issue (nvcc-enforced).
- PR #48: Fix gtest target names in cmake build gtest option.
- PR #49: Skip raft comms test if raft module doesn't exist
- Initial RAFT version
- PR #3: defining raft::handle_t, device_buffer, host_buffer, allocator classes
- PR #5: Small build.sh fixes