A Python/C(++) implementation of Hartigan & Hartigan's dip test for unimodality.
The dip test measures multimodality in a sample by the maximum difference, over all sample points, between the empirical distribution function, and the unimodal distribution function that minimizes that maximum difference. Other than unimodality, it makes no further assumptions about the form of the null distribution.
This library provides two functions:
dipstat
diptest
The first only computes Hartigan's dip statistic. diptest
computes both the
statistic and the p-value. The p-value can be computed using interpolation of a
critical value table (default) or by bootstrapping the null hypothesis.
Note that for larger samples (N > 1e5) this is quite compute and memory intensive.
import numpy as np
import diptest
# generate some bimodal random draws
N = 1000
hN = N // 2
x = np.empty(N, dtype=np.float64)
x[:hN] = np.random.normal(0.4, 1.0, hN)
x[hN:] = np.random.normal(-0.4, 1.0, hN)
# only the dip statistic
dip = diptest.dipstat(x)
# both the dip statistic and p-value
dip, pval = diptest.diptest(x)
numpy
- [Optional]
OpenMP
Parallelisation of the p-value computation using bootstrapping is offered using OpenMP.
OpenMP is disabled by default but can be enabled, see installation section
below.
Multi-threading can be turned off by setting the number of threads equal to 1. See the docstring of diptest
for details.
diptest can be installed from PyPi using:
pip install diptest
Wheels containing the pre-compiled extension are available for:
- Windows x84-64 - CPython 3.8 - 3.12
- Linux x84-64 - CPython 3.8 - 3.12
- MacOS x84-64 - CPython 3.8 - 3.12
- MacOS ARM-64 - CPython 3.8 - 3.12
Note that the wheels vendor/ships OpenMP with the extension to provide parallelisation out-of-the-box. If you run into issue with multiple versions of OpenMP being loaded you have two options: build from source or install a non-bundled wheel.
We provide the same wheels without OpenMP bundled here: https://github.com/RUrlus/diptest/releases You than install the wheel that corresponds to your Python and OS. For example, for CPython 3.11 and MacOS ARM:
pip install diptest-0.8.0-cp311-cp311-macosx_11_0_arm64.whl
If you have a C/C++ compiler available it is advised to install without the wheel as this enables architecture specific optimisations.
pip install diptest --no-binary diptest
Compatible compilers through Pybind11:
- Clang/LLVM 3.3 or newer (for Apple Xcode's clang, this is 5.0.0 or newer)
- GCC 4.8 or newer
- Microsoft Visual Studio 2015 Update 3 or newer
- Intel classic C++ compiler 18 or newer (ICC 20.2 tested in CI)
- Cygwin/GCC (previously tested on 2.5.1)
- NVCC (CUDA 11.0 tested in CI)
- NVIDIA PGI (20.9 tested in CI)
To disable OpenMP use:
SKBUILD_CMAKE_ARGS="-DDIPTEST_ENABLE_OPENMP=OFF" pip install diptest --no-binary diptest
To enable a debug build use:
SKBUILD_CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Debug" pip install diptest --no-binary diptest
To enable the debug print statements use:
SKBUILD_CMAKE_ARGS="-DDIPTEST_ENABLE_DEBUG=ON" pip install diptest --no-binary diptest
then call the function with debug argument set to a value greater than zero:
diptest(x, debug=1)
Hartigan, J. A., & Hartigan, P. M. (1985). The Dip Test of Unimodality. The Annals of Statistics.
Hartigan, P. M. (1985). Computation of the Dip Statistic to Test for Unimodality. Journal of the Royal Statistical Society. Series C (Applied Statistics), 34(3), 320-325.
diptest
is just a Python port of Martin Maechler's R module of the same
name.
The package wrapping the C implementation was originally written by Alistair Muldal.
The fork is an update with a number of changes:
- Fixes a buffer overrun issue in
_dip.c
by reverting to the original C implementation - Python bindings using Pybind11 (C++) instead of Cython
- P-value computation using bootstrapping has been moved down to C++ with optional parallelisation support through OpenMP
- Removed overhead caused by debug branching statements by placing them under a compile-time definition
- Added tests and wheel support
- C implementation of diptest was rewritten in C++ by Prodromos Kolyvakis
diptest
is free software: you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.