philipp-horstenkamp/KDEpy

About

This Python package implements various kernel density estimators (KDE). The long-term goal is to support state-of-the-art KDE algorithms, and eventually to have the most complete implementation in the scientific Python universe. As of now, three algorithms are implemented through the same API: NaiveKDE, TreeKDE and FFTKDE.

Plot

The code generating the above graph is found in KDEpy/examples.py.

Installation

KDEpy is available through PyPI, and may be installed using pip:

pip install KDEpy

Example code and documentation

The example below draws data with scipy.stats.norm and plots a density estimate with matplotlib (imported as plt). From the code, it should be clear how to set the kernel, the bandwidth (the scale of the kernel) and the weights. See the documentation for more examples.

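import matplotlib.pyplot as plt
from scipy.stats import norm
from KDEpy import NaiveKDE

# Draw 2**3 = 8 samples from a standard normal distribution
data = norm(loc=0, scale=1).rvs(2**3)
estimator = NaiveKDE(kernel='gaussian', bw='silverman')
x, y = estimator.fit(data, weights=None).evaluate()
plt.plot(x, y, label='KDE estimate')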

Plot

The package consists of three algorithms. Here's a brief explanation:

  • NaiveKDE - A naive computation. Supports N-dimensional data, variable bandwidth, weighted data and many kernel functions. Very slow on large data sets.
  • TreeKDE - A tree-based computation. Supports the same features as the naive algorithm, but is faster at the expense of a small inaccuracy when using a kernel without finite support.
  • FFTKDE - A fast, FFT-based computation for 1D and 2D data. Supports weighted data and many kernels, but not variable bandwidth. Must be evaluated on an equidistant grid; the finer the grid, the higher the accuracy.
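To make the trade-off between the naive and the FFT-based approach concrete, here is a minimal NumPy-only sketch of both ideas for 1D data and a Gaussian kernel. It is not KDEpy's implementation: the function names naive_kde and binned_fft_kde are hypothetical, and the binning step assigns each point to its nearest grid point, which is a simplifying assumption. The naive version evaluates the kernel once per (grid point, data point) pair, while the FFT version first bins the data on the equidistant grid and then convolves the bin masses with the kernel.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, evaluated elementwise."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def naive_kde(data, grid, bw, weights=None):
    """Weighted KDE with one kernel evaluation per (grid point, data point) pair."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    w = np.ones_like(data) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the estimate integrates to 1
    # Scaled pairwise distances, shape (len(grid), len(data)).
    u = (grid[:, None] - data[None, :]) / bw
    return (gaussian_kernel(u) * w[None, :]).sum(axis=1) / bw


def binned_fft_kde(data, grid, bw):
    """Approximate KDE: bin the data on an equidistant grid, then convolve via FFT."""
    grid = np.asarray(grid, dtype=float)
    n = len(grid)
    dx = grid[1] - grid[0]  # assumes the grid is equidistant
    # Bin edges centred on the grid points (nearest-grid-point binning).
    edges = np.concatenate([grid - dx / 2.0, [grid[-1] + dx / 2.0]])
    counts, _ = np.histogram(data, bins=edges)
    mass = counts / counts.sum()
    # Kernel sampled at every possible offset between two grid points.
    offsets = np.arange(-(n - 1), n) * dx
    kernel = gaussian_kernel(offsets / bw) / bw
    # Zero-padded FFTs give the full linear convolution without wrap-around.
    size = 3 * n - 2
    full = np.fft.irfft(np.fft.rfft(mass, size) * np.fft.rfft(kernel, size), size)
    # Keep the part of the convolution that lines up with the grid points.
    return full[n - 1 : 2 * n - 1]


rng = np.random.default_rng(0)
data = rng.normal(size=200)
grid = np.linspace(-6.0, 6.0, 512)

exact = naive_kde(data, grid, bw=0.5)
approx = binned_fft_kde(data, grid, bw=0.5)

area = exact.sum() * (grid[1] - grid[0])  # Riemann sum of the density
max_diff = np.abs(exact - approx).max()   # error introduced by binning
```

In this sketch the binning error shrinks as the grid gets finer, which matches the remark above that a finer grid gives higher accuracy. The naive version needs no grid structure at all, but its cost grows with the product of the number of grid points and the number of data points.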

Issues and contributing

Issues

If you are having trouble using the package, please let me know by creating an Issue on GitHub and I'll get back to you.

Contributing

Whatever your mathematical and Python background is, you are very welcome to contribute to KDEpy. To contribute, clone the project, create a branch and submit a Pull Request. Please follow these guidelines:

  • Import as few external dependencies as possible.
  • Use test-driven development: have tests and docs for every method.
  • Cite literature and implement recent methods.
  • Unless it's a bottleneck computation, readability trumps speed.
  • Employ object orientation, but resist the temptation to implement many methods -- stick to the basics.
  • Follow PEP8.
