Please see https://github.com/rapidsai/raft/releases/tag/v22.10.00a for the latest changes to this development branch.
- Update `mdspan` to account for changes to `extents` (#751) @divyegala
- Replace csr_adj_graph functions with faster equivalent (#746) @ahendriksen
- Integrate KNN implementation: ivf-flat (#652) @achirkin
- Moving kmeans from cuml to Raft (#605) @lowener
- Relax ivf-flat test recall thresholds (#766) @achirkin
- Restrict the use of `]` to CXX 20 only. (#764) @trivialfis
- Update rapids-cmake version for pyraft in update-version.sh (#749) @vyasr
- Use documented header template for doxygen (#773) @galipremsagar
- Switch `language` from `None` to `"en"` in docs build (#721) @galipremsagar
- Update `mdspan` to account for changes to `extents` (#751) @divyegala
- Implement matrix transpose with mdspan. (#739) @trivialfis
- Implement unravel_index for row-major array. (#723) @trivialfis
- Integrate KNN implementation: ivf-flat (#652) @achirkin
- Use common `js` and `css` code (#779) @galipremsagar
- Pin `dask` & `distributed` for release (#772) @galipremsagar
- Move cmake to the build section. (#763) @vyasr
- Adding old kmeans impl back in (as kmeans_deprecated) (#761) @cjnolet
- Fix for KMeans raw pointers API (#758) @lowener
- Fix KMeans (#756) @divyegala
- Add inline to nccl_sync_stream() (#750) @seunghwak
- Replace csr_adj_graph functions with faster equivalent (#746) @ahendriksen
- Add wrapper functions for ncclGroupStart() and ncclGroupEnd() (#742) @seunghwak
- Fix variadic template type check for mdarrays (#741) @hlinsen
- RMAT rectangular graph generator (#738) @teju85
- Update conda recipes to UCX 1.13.0 (#736) @pentschev
- Add warp-aggregated atomic increment (#735) @ahendriksen
- fix logic bug in include_checker.py utility (#734) @grlee77
- Support 32bit and unsigned indices in bruteforce KNN (#730) @achirkin
- Ability to use ccache to speedup local builds (#729) @teju85
- Pin max version of `cuda-python` to `11.7.0` (#728) @Ethyling
- Always add `raft::raft_nn_lib` and `raft::raft_distance_lib` aliases (#727) @trxcllnt
- Add several type aliases and helpers for creating mdarrays (#726) @achirkin
- fix nans in naive kl divergence kernel introduced by div by 0. (#724) @mdoijade
- Use rapids-cmake for cuco (#722) @vyasr
- Update Python classifiers. (#719) @bdice
- Fix sccache (#718) @Ethyling
- Introducing raft::mdspan as an alias (#715) @divyegala
- Update cuco version (#714) @vyasr
- Update conda environment pinnings and update-versions.sh. (#713) @bdice
- Branch 22.08 merge branch 22.06 (#712) @cjnolet
- Testing conda compilers (#705) @cjnolet
- Unpin `dask` & `distributed` for development (#704) @galipremsagar
- Avoid shadowing CMAKE_ARGS variable in build.sh (#701) @vyasr
- Use unique ptr in `print_device_vector` (#695) @lowener
- Add missing Thrust includes (#678) @bdice
- Consolidate C++ conda recipes and add libraft-tests package (#641) @Ethyling
- Moving kmeans from cuml to Raft (#605) @lowener
- Rng: removed cyclic dependency creating hard-to-debug compiler errors (#639) @MatthiasKohl
- Allow enabling NVTX markers by downstream projects after install (#610) @achirkin
- Rng: expose host-rng-state in host-only API (#609) @MatthiasKohl
- For fixing the cuGraph test failures with PCG (#690) @vinaydes
- Fix excessive memory used in selection test (#689) @achirkin
- Revert print vector changes because of std::vector<bool> (#681) @lowener
- fix race in fusedL2knn smem read/write by adding a syncwarp (#679) @mdoijade
- gemm: fix parameter C mistakenly set as const (#664) @achirkin
- Fix SelectionTest: allow different indices when keys are equal. (#659) @achirkin
- Revert recent cmake updates (#657) @cjnolet
- Don't install component dependency files in raft-header only mode (#655) @robertmaynard
- Rng: removed cyclic dependency creating hard-to-debug compiler errors (#639) @MatthiasKohl
- Fixing raft compile bug w/ RNG changes (#634) @cjnolet
- Get `libcudacxx` from `cuco` (#632) @trxcllnt
- RNG API fixes (#630) @MatthiasKohl
- Fix mdspan accessor mixin offset policy. (#628) @trivialfis
- Branch 22.06 merge 22.04 (#625) @cjnolet
- fix issue in fusedL2knn which happens when rows are multiple of 256 (#604) @mdoijade
- Restore changes from #653 and #655 and correct cmake component dependencies (#686) @robertmaynard
- Adding handle and stream to pylibraft (#683) @cjnolet
- Map CMake install components to conda library packages (#653) @robertmaynard
- Rng: expose host-rng-state in host-only API (#609) @MatthiasKohl
- mdspan/mdarray template functions and utilities (#601) @divyegala
- Change build.sh to find C++ library by default (#697) @vyasr
- Pin `dask` and `distributed` for release (#693) @galipremsagar
- Pin `dask` & `distributed` for release (#680) @galipremsagar
- Improve logging (#673) @achirkin
- Fix minor errors in CMake configuration (#662) @vyasr
- Pulling mdspan fork (from official rapids repo) into raft to remove dependency (#649) @cjnolet
- Fixing the unit test issue(s) in RAFT (#646) @vinaydes
- Build pyraft with scikit-build (#644) @vyasr
- Some fixes to pairwise distances for cupy integration (#643) @cjnolet
- Require UCX 1.12.1+ (#638) @jakirkham
- Updating raft rng host public API and adding docs (#636) @cjnolet
- Build pylibraft with scikit-build (#633) @vyasr
- Add `cuda_lib_dir` to `library_dirs`, allow changing `UCX`/`RMM`/`Thrust`/`spdlog` locations via envvars in `setup.py` (#624) @trxcllnt
- Remove perf prints from MST (#623) @divyegala
- Enable components installation using CMake (#621) @Ethyling
- Allow nullptr as input-indices argument of select_k (#618) @achirkin
- Update CMake pinning to allow newer CMake versions (#617) @vyasr
- Unpin `dask` & `distributed` for development (#616) @galipremsagar
- Improve performance of select-top-k RADIX implementation (#615) @achirkin
- Moving more prims benchmarks to RAFT (#613) @cjnolet
- Allow enabling NVTX markers by downstream projects after install (#610) @achirkin
- Improve performance of select-top-k WARP_SORT implementation (#606) @achirkin
- Enable building static libs (#602) @trxcllnt
- Update `ucx-py` version (#596) @ajschmidt8
- Fix merge conflicts (#587) @ajschmidt8
- Making cuco, thrust, and mdspan optional dependencies. (#585) @cjnolet
- Some RBC3D fixes (#530) @cjnolet
- Moving some of the remaining linalg prims from cuml (#502) @cjnolet
- Fix badly merged cublas wrappers (#492) @achirkin
- Hiding implementation details for lap, clustering, spectral, and label (#477) @cjnolet
- Adding destructor for std comms and using nccl allreduce for barrier in mpi comms (#473) @cjnolet
- Cleaning up cusparse_wrappers (#441) @cjnolet
- Improvements to RNG (#434) @vinaydes
- Remove RAFT memory management (#400) @viclafargue
- LinAlg impl in detail (#383) @divyegala
- Pin cmake in conda recipe to <3.23 (#600) @dantegd
- Fix make_device_vector_view (#595) @lowener
- Update cuco version. (#592) @vyasr
- Fixing raft headers dir (#574) @cjnolet
- Update update-version.sh (#560) @raydouglass
- find_package(raft) can now be called multiple times safely (#532) @robertmaynard
- Allocate sufficient memory for Hungarian if number of batches > 1 (#531) @ChuckHastings
- Adding lap.hpp back (with deprecation) (#529) @cjnolet
- raft-config is idempotent no matter RAFT_COMPILE_LIBRARIES value (#516) @robertmaynard
- Call initialize() in mpi_comms_t constructor. (#506) @seunghwak
- Improve row-major meanvar kernel via minimizing atomicCAS locks (#489) @achirkin
- Adding destructor for std comms and using nccl allreduce for barrier in mpi comms (#473) @cjnolet
- Add benchmarks (#549) @achirkin
- Unify weighted mean code (#514) @lowener
- single-pass raft::stats::meanvar (#472) @achirkin
- Move `random` package of cuML to RAFT (#449) @divyegala
- mdspan integration. (#437) @trivialfis
- Interruptible execution (#433) @achirkin
- make raft sources compilable with clang (#424) @MatthiasKohl
- Span implementation. (#399) @trivialfis
- Adding build script for docs (#589) @cjnolet
- Temporarily disable new `ops-bot` functionality (#586) @ajschmidt8
- Fix commands to get conda output files (#584) @Ethyling
- Link to `cuco` and add faiss `EXCLUDE_FROM_ALL` option (#583) @trxcllnt
- exposing faiss::faiss (#582) @cjnolet
- Pin `dask` and `distributed` version (#581) @galipremsagar
- removing exclude_from_all from cuco (#580) @cjnolet
- Adding INSTALL_EXPORT_SET for cuco, rmm, thrust (#579) @cjnolet
- Thrust package name case (#576) @trxcllnt
- Add missing thrust includes to transpose.cuh (#575) @zbjornson
- Use unanchored clang-format version check (#573) @zbjornson
- Fixing accidental removal of thrust target from cmakelists (#571) @cjnolet
- Don't add gtest to build export set or generate a gtest-config.cmake (#565) @trxcllnt
- Set `main` label by default (#559) @galipremsagar
- Add local conda channel while looking for conda outputs (#558) @Ethyling
- Updated dask and distributed to >=2022.02.1 (#557) @rlratzel
- Upload packages using testing label for nightlies (#556) @Ethyling
- Add `.github/ops-bot.yaml` config file (#554) @ajschmidt8
- Disabling benchmarks building by default. (#553) @cjnolet
- KNN select-top-k variants (#551) @achirkin
- Adding logger (#550) @cjnolet
- clang-tidy support: improved clang run scripts with latest changes (see cugraph-ops) (#548) @MatthiasKohl
- Pylibraft for pairwise distances (#540) @cjnolet
- mdspan PoC for distance make_blobs (#538) @cjnolet
- Include thrust/sort.h in ball_cover.cuh (#526) @akifcorduk
- Increase parallelism in allgatherv (#525) @seunghwak
- Moving device functions to cuh files and deprecating hpp (#524) @cjnolet
- Use `dynamic_extent` from `stdex`. (#523) @trivialfis
- Updating some of the ci check scripts (#522) @cjnolet
- Use shfl_xor in warpReduce for broadcast (#521) @akifcorduk
- Fixing Python conda package and installation (#520) @cjnolet
- Adding instructions to install from conda and build using CPM (#519) @cjnolet
- Implement span storage optimization. (#515) @trivialfis
- RNG test fixes and improvements (#513) @vinaydes
- Moving scores and metrics over to raft::stats (#512) @cjnolet
- Random ball cover in 3d (#510) @cjnolet
- Initializing memory in RBC (#509) @cjnolet
- Adjusting conda packaging to remove duplicate dependencies (#508) @cjnolet
- Moving remaining stats prims from cuml (#507) @cjnolet
- Correcting the namespace (#505) @vinaydes
- Passing stream through commsplit (#503) @cjnolet
- Moving some of the remaining linalg prims from cuml (#502) @cjnolet
- Fixing spectral APIs (#496) @cjnolet
- Fix badly merged cublas wrappers (#492) @achirkin
- Fix integer overflow in distances (#490) @RAMitchell
- Reusing shared libs in gpu ci builds (#487) @cjnolet
- Adding fatbin to shared libs and fixing conda paths in cpu build (#485) @cjnolet
- Add CMake `install` rule for tests (#483) @ajschmidt8
- Adding cpu ci for conda build (#482) @cjnolet
- Updating codeowners to use new raft codeowners (#480) @cjnolet
- Hiding implementation details for lap, clustering, spectral, and label (#477) @cjnolet
- Define PTDS via `-D` to fix cache misses in sccache (#476) @trxcllnt
- Unpin dask and distributed (#474) @galipremsagar
- Replace `ccache` with `sccache` (#471) @ajschmidt8
- More README updates (#467) @cjnolet
- CUBLAS wrappers with switchable host/device pointer mode (#453) @achirkin
- Cleaning up cusparse_wrappers (#441) @cjnolet
- Adding conda packaging for libraft and pyraft (#439) @cjnolet
- Improvements to RNG (#434) @vinaydes
- Hiding implementation details for comms (#409) @cjnolet
- Remove RAFT memory management (#400) @viclafargue
- LinAlg impl in detail (#383) @divyegala
- Simplify raft component CMake logic, and allow compilation without FAISS (#428) @robertmaynard
- One cudaStream_t instance per raft::handle_t (#291) @divyegala
- Removing extra logging from faiss mr (#463) @cjnolet
- Pin `dask` & `distributed` versions (#455) @galipremsagar
- Replace RMM CUDA Python bindings with those provided by CUDA-Python (#451) @shwina
- Fix comms memory leak (#436) @seunghwak
- Fix C++ doxygen documentation (#426) @achirkin
- Fix clang-format style errors (#425) @achirkin
- Fix using incorrect macro RAFT_CHECK_CUDA in place of RAFT_CUDA_TRY (#415) @achirkin
- Fix CUDA_CHECK_NO_THROW compatibility define (#414) @zbjornson
- Disabling fused l2 knn from bfknn (#407) @cjnolet
- Disabling expanded fused l2 knn to unblock cuml CI (#404) @cjnolet
- Reverting default knn distance to L2Unexpanded for now. (#403) @cjnolet
- README and build fixes before release (#459) @cjnolet
- Updates to Python and C++ Docs (#442) @cjnolet
- error macros: determining buffer size instead of fixed 2048 chars (#420) @MatthiasKohl
- NVTX range helpers (#416) @achirkin
- Splitting fused l2 knn specializations (#461) @cjnolet
- Update cuCollection git tag (#447) @seunghwak
- Remove libcudacxx patch needed for nvcc 11.4 (#446) @robertmaynard
- Unpin `dask` and `distributed` (#440) @galipremsagar
- Public apis for remainder of matrix and stats (#438) @divyegala
- Fix bug in producer-consumer buffer exchange which occurs in UMAP test on GV100 (#429) @mdoijade
- Simplify raft component CMake logic, and allow compilation without FAISS (#428) @robertmaynard
- Update ucx-py version on release using rvc (#422) @Ethyling
- Disabling fused l2 knn again. Not sure how this got added back. (#421) @cjnolet
- Adding no throw macro variants (#417) @cjnolet
- Remove `IncludeCategories` from `.clang-format` (#412) @codereport
- fix nan issues in L2 expanded sqrt KNN distances (#411) @mdoijade
- Consistent renaming of CHECK_CUDA and *_TRY macros (#410) @cjnolet
- Faster matrix-vector-ops (#401) @achirkin
- Adding dev conda environment files. (#397) @cjnolet
- Update to UCX-Py 0.24 (#392) @pentschev
- Branch 21.12 merge 22.02 (#386) @cjnolet
- Hiding implementation details for sparse API (#381) @cjnolet
- Adding distance specializations (#376) @cjnolet
- Use FAISS with RMM (#363) @viclafargue
- Add Fused L2 Expanded KNN kernel (#339) @mdoijade
- Update `.clang-format` to be consistent with all other RAPIDS repos (#300) @codereport
- One cudaStream_t instance per raft::handle_t (#291) @divyegala
- Fixing bad host->device copy (#375) @cjnolet
- Fix coalesced access checks in matrix_vector_op (#372) @achirkin
- Port libcudacxx patch from cudf (#370) @dantegd
- Fixing overflow in expanded distances (#365) @cjnolet
- Upgrade `clang` to `11.1.0` (#394) @galipremsagar
- Fix Changelog Merge Conflicts for `branch-21.12` (#390) @ajschmidt8
- Pin max `dask` & `distributed` (#388) @galipremsagar
- Removing conflict w/ CUDA_CHECK (#378) @cjnolet
- Update RAFT test directory (#359) @viclafargue
- Update to UCX-Py 0.23 (#358) @pentschev
- Hiding implementation details for random, stats, and matrix (#356) @divyegala
- README updates (#351) @cjnolet
- Use 64 bit CuSolver API for Eigen decomposition (#349) @lowener
- Hiding implementation details for distance primitives (dense + sparse) (#344) @cjnolet
- Unpin `dask` & `distributed` in CI (#338) @galipremsagar
- Miscellaneous tech debts/cleanups (#286) @viclafargue
- Accounting for rmm::cuda_stream_pool not having a constructor for 0 streams (#329) @divyegala
- Fix wrong lda parameter in gemv (#327) @achirkin
- Fix `matrixVectorOp` to verify promoted pointer type is still aligned to vectorized load boundary (#325) @viclafargue
- Pin rmm to branch-21.10 and remove warnings from kmeans.hpp (#322) @dantegd
- Temporarily pin RMM while refactor removes deprecated calls (#315) @dantegd
- Fix more warnings (#311) @harrism
- Add Hamming, Jensen-Shannon, KL-Divergence, Russell rao and Correlation distance metrics support (#306) @mdoijade
- Pin max `dask` and `distributed` versions to `2021.09.1` (#334) @galipremsagar
- Make sure we keep the rapids-cmake and raft cal version in sync (#331) @robertmaynard
- Add broadcast with const input iterator (#328) @seunghwak
- Fused L2 (unexpanded) kNN kernel for NN <= 64, without using temporary gmem to store intermediate distances (#324) @mdoijade
- Update with rapids cmake new features (#320) @robertmaynard
- Update to UCX-Py 0.22 (#319) @pentschev
- Fix Forward-Merge Conflicts (#318) @ajschmidt8
- Enable CUDA device code warnings as errors (#307) @harrism
- Remove max version pin for dask & distributed on development branch (#303) @galipremsagar
- Warnings are errors (#299) @harrism
- Use the new RAPIDS.cmake to fetch rapids-cmake (#298) @robertmaynard
- ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#295) @dillon-cullinan
- Miscellaneous tech debts/cleanups (#286) @viclafargue
- Random Ball Cover Algorithm for 2D Haversine/Euclidean (#213) @cjnolet
- expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
- Fix support for different input and output types in linalg::reduce (#296) @Nyrio
- Const raft handle in sparse bfknn (#280) @cjnolet
- Add `cuco::cuco` to list of linked libraries (#279) @trxcllnt
- Use nested include in destination of install headers to avoid docker permission issues (#263) @dantegd
- Update UCX-Py version to 0.21 (#255) @pentschev
- Fix mst knn test build failure due to RMM device_buffer change (#253) @mdoijade
- Add chebyshev, canberra, minkowski and hellinger distance metrics (#276) @mdoijade
- Move FAISS ANN wrappers to RAFT (#265) @cjnolet
- Remaining sparse semiring distances (#261) @cjnolet
- removing divye from codeowners (#257) @divyegala
- Pinning cuco to a specific commit hash for release (#304) @rlratzel
- Pin max `dask` & `distributed` versions (#301) @galipremsagar
- Overlap epilog compute with ldg of next grid stride in pairwise distance & fusedL2NN kernels (#292) @mdoijade
- Always add faiss library alias if it's missing (#287) @trxcllnt
- Use `NVIDIA/cuCollections` repo again (#284) @trxcllnt
- Use the 21.08 branch of rapids-cmake as rmm requires it (#278) @robertmaynard
- expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
- Fix `21.08` forward-merge conflicts (#274) @ajschmidt8
- Add lds and sts inline ptx instructions to force vector instruction generation (#273) @mdoijade
- Move ANN to RAFT (additional updates) (#270) @cjnolet
- Sparse semirings cleanup + hash table & batching strategies (#269) @divyegala
- Revert "pin dask versions in CI (#260)" (#264) @ajschmidt8
- Pass stream to device_scalar::value() calls. (#259) @harrism
- Update get_rmm.cmake to better support CalVer (#258) @harrism
- Add Grid stride pairwise dist and fused L2 NN kernels (#250) @mdoijade
- Fix merge conflicts (#236) @ajschmidt8
- Update UCX-Py version to 0.20 (#254) @pentschev
- cuco git tag update (again) (#248) @seunghwak
- Revert PR #232 for 21.06 release (#246) @dantegd
- Python comms to hold onto server endpoints (#241) @cjnolet
- Fix Thrust 1.12 compile errors (#231) @trxcllnt
- Make sure we use CalVer when checking out rapids-cmake (#230) @robertmaynard
- Loss of Precision in MST weight alteration (#223) @divyegala
- cuco git tag update (#243) @seunghwak
- Update `CHANGELOG.md` links for calver (#233) @ajschmidt8
- Add Grid stride pairwise dist and fused L2 NN kernels (#232) @mdoijade
- Updates to enable HDBSCAN (#208) @cjnolet
- Exposing spectral random seed property (#193) @cjnolet
- Fix pointer arithmetic in spmv smem kernel (#183) @lowener
- Modify default value for rowMajorIndex and rowMajorQuery in bf-knn (#173) @viclafargue
- Remove setCudaMallocWarning() call for libfaiss@v1.7.0 (#167) @trxcllnt
- Add const to KNN handle (#157) @hlinsen
- Fixing codeowners (#194) @cjnolet
- Adjust Hellinger pairwise distance to avoid NaNs (#189) @lowener
- Add column major input support in contractions_nt kernels with new kernel policy for it (#188) @mdoijade
- Dice formula correction (#186) @lowener
- Scaling knn graph fix connectivities algorithm (#181) @cjnolet
- Fixing RAFT CI & a few small updates for SLHC Python wrapper (#178) @cjnolet
- Add Precomputed to the DistanceType enum (for cuML DBSCAN) (#177) @Nyrio
- Enable matrix::copyRows for row major input (#176) @tfeher
- Add Dice distance to distancetype enum (#174) @lowener
- Porting over recent updates to distance prim from cuml (#172) @cjnolet
- Update KNN (#171) @viclafargue
- Adding translations parameter to brute_force_knn (#170) @viclafargue
- Update Changelog Link (#169) @ajschmidt8
- Map operation (#168) @viclafargue
- Updating sparse prims based on recent changes (#166) @cjnolet
- Prepare Changelog for Automation (#164) @ajschmidt8
- Update 0.18 changelog entry (#163) @ajschmidt8
- MST symmetric/non-symmetric output for SLHC (#162) @divyegala
- Pass pre-computed colors to MST (#154) @divyegala
- Streams upgrade in RAFT handle (RMM backend + create handle from parent's pool) (#148) @afender
- Merge branch-0.18 into 0.19 (#146) @dantegd
- Add device_send, device_recv, device_sendrecv, device_multicast_sendrecv (#144) @seunghwak
- Adding SLHC prims. (#140) @cjnolet
- Moving cuml sparse prims to raft (#139) @cjnolet
- Make NCCL root initialization configurable. (#120) @drobison00
- Add idx_t template parameter to matrix helper routines (#131) @tfeher
- Eliminate CUDA 10.2 as valid for large svd solving (#129) @wphicks
- Update check to allow svd solver on CUDA>=10.2 (#125) @wphicks
- Updating gpu build.sh and debugging threads CI issue (#123) @dantegd
- Adding additional distances (#116) @cjnolet
- Update stale GHA with exemptions & new labels (#152) @mike-wendt
- Add GHA to mark issues/prs as stale/rotten (#150) @Ethyling
- Prepare Changelog for Automation (#135) @ajschmidt8
- Adding Jensen-Shannon and BrayCurtis to DistanceType for Nearest Neighbors (#132) @lowener
- Add brute force KNN (#126) @hlinsen
- Make NCCL root initialization configurable. (#120) @drobison00
- Auto-label PRs based on their content (#117) @jolorunyomi
- Add gather & gatherv to raft::comms::comms_t (#114) @seunghwak
- Adding canberra and chebyshev to distance types (#99) @cjnolet
- Gpuciscripts clean and update (#92) @msadang
- PR #65: Adding cuml prims that break circular dependency between cuml and cumlprims projects
- PR #101: MST core solver
- PR #93: Incorporate Date/Nagi implementation of Hungarian Algorithm
- PR #94: Allow generic reductions for the map then reduce op
- PR #95: Cholesky rank one update prim
- PR #108: Remove unused old-gpubuild.sh
- PR #73: Move DistanceType enum from cuML to RAFT
- PR #92: Cleanup gpuCI scripts
- PR #98: Adding InnerProduct to DistanceType
- PR #103: Epsilon parameter for Cholesky rank one update
- PR #100: Add divyegala as codeowner
- PR #111: Cleanup gpuCI scripts
- PR #120: Update NCCL init process to support root node placement.
- PR #106: Specify dependency branches to avoid pip resolver failure
- PR #77: Fixing CUB include for CUDA < 11
- PR #86: Missing headers for newly moved prims
- PR #102: Check alignment before binaryOp dispatch
- PR #104: Fix update-version.sh
- PR #109: Fixing Incorrect Deallocation Size and Count Bugs
- PR #63: Adding MPI comms implementation
- PR #70: Adding CUB to RAFT cmake
- PR #59: Adding csrgemm2 to cusparse_wrappers.h
- PR #61: Add cusparsecsr2dense to cusparse_wrappers.h
- PR #62: Adding `get_device_allocator` to `handle.pxd`
- PR #67: Remove dependence on run-time type info
- PR #56: Fix compiler warnings.
- PR #64: Remove `cublas_try` from `cusolver_wrappers.h`
- PR #66: Fixing typo `get_stream` to `getStream` in `handle.pyx`
- PR #68: Change the type of recvcounts & displs in allgatherv from size_t[] to size_t* and int[] to size_t*, respectively.
- PR #69: Updates for RMM being header only
- PR #74: Fix std_comms::comm_split bug
- PR #79: remove debug print statements
- PR #81: temporarily expose internal NCCL communicator
- PR #12: Spectral clustering.
- PR #7: Migrating cuml comms -> raft comms_t
- PR #18: Adding commsplit to cuml communicator
- PR #15: add exception based error handling macros
- PR #29: Add ceildiv functionality
- PR #44: Add get_subcomm and set_subcomm to handle_t
- PR #13: Add RMM_INCLUDE and RMM_LIBRARY options to allow linking to non-conda RMM
- PR #22: Preserve order in comms workers for rank initialization
- PR #38: Remove #include <cudar_utils.h> from `raft/mr/`
- PR #39: Adding a virtual destructor to `raft::handle_t` and `raft::comms::comms_t`
- PR #37: Clean-up CUDA related utilities
- PR #41: Upgrade to `cusparseSpMV()`, alg selection, and rectangular matrices.
- PR #45: Add Ampere target to cuda11 cmake
- PR #47: Use gtest conda package in CMake/build.sh by default
- PR #17: Make destructor inline to avoid redeclaration error
- PR #25: Fix bug in handle_t::get_internal_streams
- PR #26: Fix bug in RAFT_EXPECTS (add parentheses surrounding cond)
- PR #34: Fix issue with incorrect docker image being used in local build script
- PR #35: Remove #include <nccl.h> from `raft/error.hpp`
- PR #40: Preemptively fixed future CUDA 11 related errors.
- PR #43: Fixed CUDA version selection mechanism for SpMV.
- PR #46: Fix for cpp file extension issue (nvcc-enforced).
- PR #48: Fix gtest target names in cmake build gtest option.
- PR #49: Skip raft comms test if raft module doesn't exist
- Initial RAFT version
- PR #3: defining raft::handle_t, device_buffer, host_buffer, allocator classes
- PR #5: Small build.sh fixes