Fast Gauss transforms.
The Gauss transform is a common operation that computes the per-point similarity between two data sets:
This a C++ library for computing the Gauss transform using the direct method as well as a few shortcuts. This code lives on Github, has some Doxygen documentation, and is tested with Travis and AppVeyor.
There is one C++ header file, fgt.hpp
, which has everything you need.
Include that file and you're off to the races:
#include <fgt.hpp>
void my_great_function(const Eigen::MatrixXd& x, const Eigen::MatrixXd& y) {
double bandwidth = 0.3;
Eigen::VectorXd gauss_transform = fgt::direct(x, y, bandwidth);
}
The library provides a few different ways to calculate the Gauss transform:
fgt::direct
calculates the exact Gauss transform, and is the most accurate and the slowest option.fgt::direct_tree
tries to do less work by only considering "close" points, where "close" is defined by the bandwidth. The direct tree method works best for very small bandwidths.fgt::ifgt
uses the Improved Fast Gauss Transform (pdf) to speed up the calculation. IFGT is fast for large bandwidths but can break down for smaller bandwidths.
There's also a class-based interface:
#include <fgt.hpp>
void my_great_function(const Eigen::MatrixXd& x, const Eigen::MatrixXd& y) {
double bandwidth = 0.3;
fgt::Direct direct(x, bandwidth);
Eigen::VectorXd result = direct.compute(y);
}
This lets you break up your transform into a pre-compute and a compute step, which can save you some cycles if you're re-using the same source dataset in a more advanced transform (e.g. direct_tree or ifgt).
There is some benchmarking code available in the bench directory, which you can use to try to get a sense of the performance of the various modes. We found a crossover point at bandwidths of a bit more than 0.2 during local testing on a Mac laptop; YMMV.
fgt has no runtime dependencies, and only depends on CMake and Eigen for building.
If you're on a Mac and you use Homebrew, use my tap to install:
brew tap gadomski/gadomski
brew install fgt
To build fgt from source on a *nix system, clone the repository and execute the traditional CMake build incantation:
git clone https://github.com/gadomski/fgt
mkdir -p fgt/build
cd fgt/build
cmake ..
make
fgt doesn't make any assumptions about whether you do or do not want shared libraries, so if you have a preference be sure to set BUILD_SHARED_LIBS
.
If building with MSVC, we recommend:
- Setting
BUILD_SHARED_LIBS=OFF
, at least if you want to run the test suite out-of-the-box. - Setting
gtest_force_shared_crt=ON
. This allows an out-of-the-box build without any Visual Studio config mucking.
Eigen, by default, stores matrices in column-major order, but fgt works with row-major matrices.
If you want to avoid extra copies, pass in row-major matrices to fgt functions.
You can use the fgt::Matrix
typedef to help:
fgt::Matrix my_matrix(1000, 3); // creates an uninitialized 1000x3 row-major matrix of doubles
fgt comes with built-in OpenMP parallelization, which can lead to some significant speedups for large data sets.
To enable OpenMP, make sure you're using an OpenMP-aware compiler (on OSX, you can get OpenMP clang via Homebrew: brew install clang-omp
) and set the CMake variable WITH_OPENMP
to ON, e.g.:
CC=clang-omp CXX=clang-omp++ cmake .. -DWITH_OPENMP=ON
make
This will build an OpenMP-enabled fgt library.
fgt comes with a unit-test suite.
To run, simply execute make test
.
You probably want CMAKE_BUILD_TYPE=Release
, otherwise the tests will take a while.
When you install fgt on your system, you will also install a few CMake configuration files that make it easy to integrate this project into your downstream work.
If fgt is installed to a traditional location (e.g. /usr/local
), finding fgt might be as simple as including the following in your CMakeLists.txt
:
find_package(Fgt REQUIRED)
target_link_libraries(my-sweet-target
PUBLIC
Fgt::Library-C++
)
The provided target, Fgt::Library-C++
, should have its INTERFACE_INCLUDE_DIRECTORIES
and other useful properties configured, so you shouldn't have to muck with anything else.
If you've installed fgt to a non-standard location, you may need to use Fgt_DIR
to find its CMake configuration files, or use CMAKE_PREFIX_PATH
to point CMake at your non-standard install tree.
We follow semantic versioning.
Versions have annotated tags following a vMAJOR.MINOR.PATCH
naming convention.
While we'll do our best to increment the MINOR
version with all breaking changes, we can't guarantee anything until MAJOR
hits 1 (per usual).
As always, we welcome bug reports, features requests, and particularly pull requests via Github.
This library was developed by Pete Gadomski, and it was inspired by the IFGT source code and figtree.
LGPL v2.1. See LICENSE.txt for the complete text.