This library provides a Pythonic API wrapper for the reference Arrow C++ implementation, along with tools for interoperability with pandas, NumPy, and other traditional Python scientific computing packages.
This project is layered in two pieces:
- pyarrow, a C++ library for easier interoperability between Arrow C++, NumPy, and pandas
- Cython extensions and pure Python code under arrow/ which expose Arrow C++ and pyarrow to pure Python users
PyArrow depends on the following projects:
- g++ and gcc version >= 4.8
- cmake > 2.8.6
- boost
- Arrow-cpp and its dependencies*
  The Arrow C++ library must be built with all options enabled and installed with the ARROW_HOME environment variable set to the installation location. See https://github.com/apache/arrow/blob/master/cpp/README.md for instructions.
- Python dependencies: numpy, pandas, cython, pytest
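
Before building, it can help to confirm that the environment matches the list above. The following is a minimal sketch, not part of PyArrow: the helper name `check_build_environment` is hypothetical, and it only checks that ARROW_HOME points at an existing directory and that the listed Python packages can be located. It does not verify compiler, cmake, or boost versions, nor that Arrow C++ was built with all options enabled.

```python
import importlib.util
import os

# Python packages from the dependency list. Cython is looked up under its
# import name "Cython"; the other import names match the list as written.
REQUIRED_PACKAGES = ("numpy", "pandas", "Cython", "pytest")


def check_build_environment(environ=None, packages=REQUIRED_PACKAGES):
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper for illustration only; it is not a PyArrow API.
    """
    environ = os.environ if environ is None else environ
    problems = []

    # ARROW_HOME must point at the Arrow C++ installation location.
    arrow_home = environ.get("ARROW_HOME")
    if not arrow_home:
        problems.append("ARROW_HOME is not set")
    elif not os.path.isdir(arrow_home):
        problems.append(f"ARROW_HOME is not a directory: {arrow_home}")

    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python package not found: {name}")

    return problems


if __name__ == "__main__":
    for problem in check_build_environment():
        print("problem:", problem)
```

Passing `environ` and `packages` as parameters keeps the check easy to exercise without modifying the real process environment.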

To build the Cython extensions in place and run the unit tests:

    python setup.py build_ext --inplace
    py.test pyarrow

To install the documentation requirements and build the Sphinx documentation:

    pip install -r doc/requirements.txt
    python setup.py build_sphinx
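
The four commands above can also be scripted so that a failing step stops the sequence. This is a rough sketch rather than a tool shipped with the project: the names `BUILD_AND_TEST`, `BUILD_DOCS`, and `run_steps` are made up for illustration, and the script assumes it is run from the directory containing `setup.py`.

```python
import subprocess

# The commands from this README, split into argument lists.
BUILD_AND_TEST = [
    ["python", "setup.py", "build_ext", "--inplace"],
    ["py.test", "pyarrow"],
]
BUILD_DOCS = [
    ["pip", "install", "-r", "doc/requirements.txt"],
    ["python", "setup.py", "build_sphinx"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order.

    With the default runner, check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit status, so later steps are skipped.
    """
    for argv in steps:
        print("running:", " ".join(argv))
        runner(argv, check=True)
```

Calling `run_steps(BUILD_AND_TEST)` followed by `run_steps(BUILD_DOCS)` reproduces the manual sequence. The `runner` parameter exists only so the function can be tried with a stand-in that records commands instead of executing them.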