Releases: motiwari/BanditPAM
BanditPAM v4.0.4
BanditPAM v4.0.4 contains the following changes:
Organization and Functionality:
- Added a new GHA to build the package and run tests on Windows.
- Added a configuration file v2 for our Read the Docs documentation.
- Added a convenience script `retrieve_windows_python_files.sh` to retrieve `clang_rt.asan_dynamic-x86_64.dll` automatically for the Windows Python build.
Tests:
- Updated `tests/test_smaller.py` to use different data types on different platforms when loading `data/scrna_reformat.csv.gz` to adjust memory usage.
Documentation:
- Updated the Windows installation instructions to use the convenience script `retrieve_windows_python_files.sh`.
Full Changelog: v4.0.3...v4.0.4
BanditPAM v4.0.3
BanditPAM v4.0.3 contains the following changes:
Organization and Functionality:
- Ensure GHAs don't run on a tag push.
- Automatically upload wheels to TestPyPI on a PR update and PyPI on release in GHAs (fixes #256)
- Add `tests/test_initialization` to the GHAs that build and test the package.
- Loss issues are resolved in all versions of BanditPAM
- The clustering results are now identical to `scikit-learn`'s implementation. This was achieved by increasing the batch size for accurate estimation of the standard deviation of the arm parameters and by fixing the bug that drops arms whose lower confidence bounds equal the lowest upper confidence bounds (fixes #252).
- Add complexity to `scripts/comparison_utils.py`, such as printing cache writes.
- Update cache calculations in `src/algorithms/kmedoids_algorithm.cpp` and add an assertion for better error handling.
- Turn `VERSION_INFO` into a string in `src/python_bindings/kmedoids_pywrapper.cpp`.
- Update `build_confidence` in `src/python/kmedoids_pywrapper.cpp` to avoid a stochasticity issue in `tests/test_larger.py`.
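The tie-handling fix noted in this list can be illustrated with a small sketch. This is not BanditPAM's actual code: the function name `surviving_arms` and the `keep_ties` flag are hypothetical, and the example assumes that lower values are better (as for a loss) and that each arm carries a lower and an upper confidence bound.

```python
def surviving_arms(lcbs, ucbs, keep_ties=True):
    """Return indices of arms that could still be the best (lowest) arm.

    An arm is eliminated once its lower confidence bound exceeds the
    smallest upper confidence bound among all arms.
    """
    best_ucb = min(ucbs)
    if keep_ties:
        # Keep arms whose lower bound equals the smallest upper bound.
        return [i for i, lcb in enumerate(lcbs) if lcb <= best_ucb]
    # Strict comparison: arms that tie with the smallest upper bound are dropped.
    return [i for i, lcb in enumerate(lcbs) if lcb < best_ucb]


# Hypothetical bounds: arm 0 is known exactly, so its interval has zero width.
lcbs = [1.0, 1.5, 0.9]
ucbs = [1.0, 2.0, 1.2]

print(surviving_arms(lcbs, ucbs))                   # [0, 2]
print(surviving_arms(lcbs, ucbs, keep_ties=False))  # [2]
```

With the strict comparison, arm 0 is discarded even though its own upper bound is the smallest one, which is the kind of tie the fix is described as handling.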
Tests:
- Update assertion in `tests/test_initialization` to match the new `build_confidence` in `src/python/kmedoids_pywrapper.cpp`.
- Use NumPy and Armadillo seeds for reproducibility.
- Update some tests in `tests/test_larger.py` to count the proportion of passing cases instead of immediately failing.
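One way to count the proportion of passing cases, rather than failing on the first bad run, is sketched below. The helper `fraction_passing`, the stand-in check, and the 0.7 threshold are hypothetical illustrations and are not taken from `tests/test_larger.py`.

```python
def fraction_passing(check, num_trials):
    """Run check(seed) for seeds 0..num_trials-1 and return the pass rate."""
    passed = sum(1 for seed in range(num_trials) if check(seed))
    return passed / num_trials


def example_check(seed):
    # Hypothetical stand-in for a randomized clustering comparison:
    # in this toy version every fourth seed "fails".
    return seed % 4 != 0


rate = fraction_passing(example_check, num_trials=8)
# Assert on the overall pass rate instead of failing on the first bad case.
assert rate >= 0.7, f"only {rate:.0%} of cases passed"
```

A threshold like this tolerates occasional failures from a randomized algorithm while still catching a systematic regression.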
Style:
- Ran `black`, `docformatter`, `flake8`, and `clang-format`.
Documentation:
- Slightly updated the documentation on the CMake build on Windows to clarify how to use `retrieve_windows_cmake_files.sh`.
Full Changelog: v4.0.2...v4.0.3
BanditPAM v4.0.2
BanditPAM v4.0.2 contains the following changes:
Organization and Functionality:
- We include the files, edits, and instructions to succeed in the CMake and Python builds on Windows, Mac, and Linux.
- Note that `carma` has now been bumped to version v0.6.7.
- Does not require LLVM's clang when running `pip install .` (Fixes #242)
- Swaps are only performed in `banditpam.cpp`, `banditpam_orig.cpp`, `fastpam1.cpp`, and `pam.cpp` if k=1 (Fixes #227)
- Uses `bool` data type when necessary (Fixes #232)
- Implements the `buildLoss` functionality (Fixes #234)
- Modifies the FastPAM1 implementation such that the runtime is significantly reduced while resulting in the same medoids (Fixes #228)
- Ensures Linux GHA error does not appear anymore (Fixes #250)
- Update GHAs to use `python -m pip` throughout instead of just `pip`
- Mac GHAs now run on `macos-latest` as `macos-10.15` is now unsupported.
- Fixes failing GHAs that activated on PR.
Tests:
- Updated incorrect assertion in `tests/test_initialization.py`
Style:
- Ran `black`, `flake8`, a combination of `clang-format` and `cpplint`, and our own style definitions over the codebase (such as using 2 spaces instead of 4 in C++ code).
Documentation:
- Documented functional CMake and Python builds on Windows.
Full Changelog: v4.0.1...v4.0.2
BanditPAM v4.0.1
BanditPAM v4.0.1 contains a few minor changes:
Organization and Functionality:
- Makes `cibuildwheel` more verbose
- For CMake builds, checks out the current branch (not `main`) in GHA
- Changes the CMake build to `Release` mode to avoid running sanitizers in GHA
- Adds status badges (Fixes #224)
- Fixes the GHA builds by including the proper directory for `armadillo` (the locally installed version)
Tests:
No changes.
Style:
No changes.
Documentation:
- Updates `README.md` and other docs
BanditPAM v4.0.0
BanditPAM v4.0.0 contains many big changes:
Organization and Functionality:
- We now rely on C++17 for `std::optional`
- We implement a new version of BanditPAM that considers only $n$ non-medoids as arms in the SWAP step, each of which has $k$ values. This leads to a significant speedup. The old BanditPAM is still available as `BanditPAM_orig`
- Update the GHA to use newer machine images and update the dependencies on Ubuntu 22.04
- Split the cmake build into a separate GHA
- Added Python 3.11 to GHA where possible, removed Python 3.6
- Restores the ability to call OpenMP functions `omp_get_max_threads` and `omp_set_num_threads`, which should resolve the remaining issues on M1 Macs
- We allow users to pass in a distance matrix (Fixes #164)
- Now uses `buildConfidence` and `swapConfidence` in log (ln) space
- Allows users to retrieve the sample complexities and caching statistics
- Adds `scripts/cache_measurements.py`, `scripts/compare_banditpam_versions.py`, `scripts/comparison_utils.py`, `scripts/comparison_with_fasterpam.py`, `scripts/dist_mat_test.py`, `scripts/timing_dist_mat.py`, `scripts/sample_complexity_with_k.py`, and `scripts/scaling_with_k.py`
- Properly builds Mac wheels using Apple Clang and enabling OpenMP. Tested across both Intel and M1 Macs for `gcc`- and `clang`-compiled Python.
- Adding `parallelize` flag that only uses OpenMP parallelization if set to `true`, i.e., replacing all `#pragma omp parallel for` with `#pragma omp parallel for if (this->parallelize)`
- Fixing an issue with the Python bindings where variable orderings didn't match, so they were getting initialized with each other's values
- Changes `max_iter` to `100`
- Added `tests/test_initialization.py`, but this is not run by GHA
- Implements caching of distance computations. Also allows for retrieving cache statistics (hits, misses, writes)
- Allows setting cache parameters (`useCache`, `usePerm`, etc.) from Python
- Allows setting `parallelize` from Python
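The cache statistics mentioned in this list (hits, misses, writes) can be pictured with the following sketch. It is a simplified Python illustration, not the C++ implementation: the class name `CachedDistance`, the `max_entries` limit, and the dictionary-based storage are all assumptions made for the example.

```python
import math


class CachedDistance:
    """Euclidean distance between indexed points, with a bounded cache."""

    def __init__(self, points, max_entries=1000):
        self.points = points
        self.max_entries = max_entries  # hypothetical cache size limit
        self.cache = {}
        self.hits = 0
        self.misses = 0
        self.writes = 0

    def __call__(self, i, j):
        # Distance is symmetric, so store each pair under one ordered key.
        key = (i, j) if i <= j else (j, i)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        d = math.dist(self.points[i], self.points[j])
        # Only write while there is room; later misses are recomputed.
        if len(self.cache) < self.max_entries:
            self.cache[key] = d
            self.writes += 1
        return d


dist = CachedDistance([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)])
dist(0, 1)  # miss, then write
dist(1, 0)  # hit (same pair, reversed order)
dist(0, 2)  # miss, then write
print(dist.hits, dist.misses, dist.writes)  # 1 2 2
```

Tracking writes separately from misses shows how often the cache was full: a miss without a matching write means the value had to be recomputed without being stored.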
Tests:
- Added `tests/test_initialization.py`, but this is not run by GHA
Style:
- Ran `black` over the codebase.
Documentation:
- Documented new capabilities.
Full Changelog: v3.0.4...v4.0.0
BanditPAM v3.0.4
BanditPAM v3.0.4 contains a few hotfixes:
Organization and Functionality:
- Fixes the computation of cosine distance (Fixes #182)
- Removes the ability to call OpenMP functions `omp_get_max_threads` and `omp_set_num_threads`, which should resolve the remaining issues on M1 Macs (Fixes #167)
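For reference, cosine distance is conventionally defined as one minus the cosine similarity of two vectors. The sketch below shows that definition in plain Python. The function name is hypothetical, and these notes do not say what the original bug was, so this illustrates only the standard formula, not the actual fix.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # same direction
```

Under this definition, vectors pointing in the same direction have distance 0, orthogonal vectors have distance 1, and opposite vectors have distance 2.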
Tests:
No changes.
Style:
No changes.
Documentation:
No changes.
Full Changelog: v3.0.3...v3.0.4
BanditPAM v3.0.3
This contains BanditPAM v3.0.3. This update will be largely invisible to users, but allows for building the Linux and Mac (including Apple Silicon/M1) wheels to upload to PyPI.
Organization and Functionality:
- Building wheels automatically for Linux, Intel Mac, and M1 Mac and uploading them to PyPI via GitHub Actions
Tests:
- None, other than verifying that the changes in `Organization and Functionality` work via GitHub Actions
Style:
- Including newlines between steps of GitHub Actions
Documentation:
None
Full Changelog: v3.0.2...v3.0.3
BanditPAM v3.0.2
BanditPAM v3.0.2 contains several bugfixes:
Organization and Functionality:
- We now allow the user to set a `seed` for reproducible results (must be called with `banditpam.set_num_threads(1)` for deterministic reproducibility) (Fixes #176)
- We have added the `KMedoids.average_loss` attribute to contain the final average clustering loss after fitting (Fixes #174)
- We throw an `std::invalid_argument` error properly when specifying an invalid loss function (Fixes #173, Fixes #141)
Tests:
- We now also test `PAM` in `tests/test_smaller.py`
Style:
- We change `PAM` and `FastPAM1` to use `this->*lossFn` instead of `KMedoids::cachedLoss` to avoid resetting the cache for them; they do not benefit much from a cache anyway
- Nits
Documentation:
- Created documentation for new functions
Full Changelog: v3.0.1...v3.0.2
BanditPAM v3.0.1
BanditPAM v3.0.1 contains a hotfix to ensure it can be installed on Paperspace Gradient and Google Colab.
For Paperspace Gradient:
- Allows users to install `banditpam==3.0.1` on Paperspace Gradient instances by installing the necessary dependencies and armadillo 10.8 automatically in setup.py
- Builds a recent (>=10.8) armadillo from source
For Google Colab:
- Installs the necessary Ubuntu dependencies
- Fixes a missing space that was conjoining the repo name with the local installation path
- Replaces the MANIFEST.in so the headers are properly included in the source distribution
Full Changelog: v3.0.0...v3.0.1
BanditPAM v3.0.0
BanditPAM v3.0.0 contains several changes:
Organization and Functionality:
- doubles are changed to floats throughout
Tests:
- Python 3.10 has been added to the list of Python versions to check
- We now verify the package can be built on macOS
- We separate the different tests into different files
Style:
- We use the appropriate armadillo types throughout for floats
Documentation:
- We have updated the documentation throughout
- We updated the favicon on readthedocs
- We have updated the installation guides throughout
Full Changelog: v2.0.0...v3.0.0