This crate builds a native Python binding for Databend.
pip install databend
from databend import SessionContext
ctx = SessionContext()
df = ctx.sql("select number, number + 1, number::String as number_p_1 from numbers(8)")
# convert to pyarrow
df.to_py_arrow()
# convert to pandas
df.to_pandas()
Supported functions:
- register_parquet
- register_ndjson
- register_csv
- register_tsv
ctx.register_parquet("pa", "/home/sundy/dataset/hits_p/", pattern = ".*.parquet")
ctx.sql("select * from pa limit 10").collect()
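The `pattern` argument above looks like a regular expression, in which case the unescaped `.` before `parquet` matches any character, not only a literal dot. The sketch below uses Python's standard `re` module to preview which file names such a pattern would accept. It is only an illustration: how the binding actually applies the pattern (full match versus search, and which regex dialect) is an assumption here, and `matching_files`, `STRICT_PATTERN`, and the sample file names are hypothetical.

```python
import re

# Pattern copied from the register_parquet call above.
PATTERN = r".*.parquet"
# Hypothetical stricter variant: escapes the dot and anchors the end.
STRICT_PATTERN = r".*\.parquet$"


def matching_files(names, pattern):
    """Return the names that the pattern matches in full."""
    # Assumption: the pattern has to match the whole file name.
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Made-up file names, used only to compare the two patterns.
names = ["hits_0.parquet", "hits_1.parquet", "notes.txt", "hits_parquet"]
loose = matching_files(names, PATTERN)
strict = matching_files(names, STRICT_PATTERN)

print("loose: ", loose)
print("strict:", strict)
```

Under these assumptions the loose pattern also accepts `hits_parquet`, while the escaped variant keeps only names ending in `.parquet`.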
# create a session for a specific tenant
ctx = SessionContext(tenant = "a")
Set up a virtualenv:
python -m venv .venv
Activate venv:
source .venv/bin/activate
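Since `maturin develop` installs the built package into the active environment, it can help to confirm that activation worked before building. A minimal sketch using only the standard library follows; the helper name `in_virtualenv` is hypothetical, and the check covers environments created with `python -m venv` (a conda environment would not be detected this way).

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("virtualenv active" if in_virtualenv() else "no virtualenv detected")
```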
Install maturin:
pip install "maturin[patchelf]"
Build bindings:
maturin develop
Run tests:
maturin develop -E test
Build API docs:
maturin develop -E docs
pdoc databend
- Meta Storage directory (catalogs, databases, tables, partitions, etc.):
./.databend/_meta
- Data Storage directory:
./.databend/_data
- Cache Storage directory:
./.databend/_cache
- Logs directory:
./.databend/logs
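The directories listed above are relative paths, so they would end up under whatever working directory the Python process was started from. As a rough sketch, the snippet below reports which of them exist under a given base directory. The helper `storage_status` and the role names are hypothetical; only the four paths come from the list above.

```python
from pathlib import Path

# Directory names taken from the list above, relative to the working directory.
STORAGE_DIRS = {
    "meta": Path(".databend/_meta"),
    "data": Path(".databend/_data"),
    "cache": Path(".databend/_cache"),
    "logs": Path(".databend/logs"),
}


def storage_status(base="."):
    """Map each storage role to (joined path, whether it is a directory)."""
    root = Path(base)
    return {
        role: (root / rel, (root / rel).is_dir())
        for role, rel in STORAGE_DIRS.items()
    }


for role, (path, exists) in storage_status().items():
    print(f"{role:5} {path} {'present' if exists else 'missing'}")
```

Running it from a different directory than the one where the session was created is a quick way to see why data may appear to be missing.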
The Databend Python API is inspired by arrow-datafusion-python. Thanks to its authors for their great work.