MuopDB is a vector database for machine learning. Currently, it supports:
- Index types: HNSW, IVF, and SPANN, all on-disk with mmap.
- Quantization: product quantization
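As a rough illustration of what product quantization does, here is a minimal sketch, not MuopDB's actual implementation. It assumes the codebooks have already been trained (for example with k-means) and that each subspace has at most 256 centroids so a code fits in a `u8`. The `Codebook` type, its fields, and `squared_l2` are hypothetical names chosen for this example.

```rust
/// Squared Euclidean (L2) distance between two equal-length slices.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Hypothetical PQ codebook (an assumption for this sketch):
/// `centroids[s][c]` is centroid `c` of subspace `s`, each of length `sub_dim`.
pub struct Codebook {
    pub sub_dim: usize,
    pub centroids: Vec<Vec<Vec<f32>>>,
}

impl Codebook {
    /// Encodes a vector as one centroid index per subspace.
    /// Each chunk of `sub_dim` values is replaced by the index of its nearest centroid.
    pub fn encode(&self, v: &[f32]) -> Vec<u8> {
        assert_eq!(v.len(), self.sub_dim * self.centroids.len());
        v.chunks(self.sub_dim)
            .zip(&self.centroids)
            .map(|(chunk, cents)| {
                let mut best = 0usize;
                for (c, cent) in cents.iter().enumerate() {
                    if squared_l2(chunk, cent) < squared_l2(chunk, &cents[best]) {
                        best = c;
                    }
                }
                best as u8
            })
            .collect()
    }

    /// Reconstructs an approximate vector by concatenating the selected centroids.
    pub fn decode(&self, code: &[u8]) -> Vec<f32> {
        code.iter()
            .zip(&self.centroids)
            .flat_map(|(&c, cents)| cents[c as usize].iter().copied())
            .collect()
    }
}
```

With two subspaces of dimension 2, a 4-dimensional `f32` vector (16 bytes) is stored as 2 bytes. The trade-off is that `decode` returns only the nearest centroids, not the original values.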
Here is the roadmap for MuopDB:
- Query path
  - Vector similarity search
  - Hierarchical Navigable Small World (HNSW)
  - Product Quantization (PQ)
- Indexing path
  - Support periodic offline indexing
- Database Management
  - Doc-sharding & query fan-out with aggregator-leaf architecture
  - In-memory & disk-based storage with mmap
- Query & Indexing
  - Inverted File (IVF)
  - Improve locality for HNSW
  - SPANN
- Query
  - Multiple index segments
  - L2 distance
- Index
  - Optimizing index build time
  - Elias-Fano encoding for IVF
  - RabitQ quantization
- Misc
  - Configs and documentation
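To make the "Elias-Fano encoding for IVF" item more concrete, the sketch below shows the general idea applied to a sorted list of document IDs, such as an IVF posting list. It is only an illustration under stated assumptions, not the encoding MuopDB ships: the `EliasFano` type and its fields are hypothetical, and the bits are kept in a `Vec<bool>` and a `Vec<u64>` for readability, where a real implementation would pack them into machine words.

```rust
/// Elias-Fano representation of a non-decreasing sequence of u64 values.
/// Hypothetical type for illustration; bits are left unpacked for clarity.
pub struct EliasFano {
    low_bits: u32,    // number of low bits stored verbatim per element
    lows: Vec<u64>,   // the low `low_bits` bits of each element
    highs: Vec<bool>, // unary-coded high parts: element i sets bit (high_i + i)
}

impl EliasFano {
    /// Encodes `values`, which must be sorted in non-decreasing order
    /// and all strictly less than `universe`.
    pub fn encode(values: &[u64], universe: u64) -> Self {
        let n = values.len() as u64;
        // l = floor(log2(universe / n)), or 0 when the list is empty or dense.
        let low_bits = if n == 0 || universe / n == 0 {
            0
        } else {
            (universe / n).ilog2()
        };
        let mask = (1u64 << low_bits) - 1;
        let mut lows = Vec::with_capacity(values.len());
        let mut highs = vec![false; (n + (universe >> low_bits) + 1) as usize];
        for (i, &v) in values.iter().enumerate() {
            debug_assert!(v < universe);
            lows.push(v & mask);
            // The high part is written in unary: the i-th element sets bit high + i.
            highs[(v >> low_bits) as usize + i] = true;
        }
        EliasFano { low_bits, lows, highs }
    }

    /// Decodes the full sequence by scanning the high-bit vector.
    pub fn decode(&self) -> Vec<u64> {
        let mut out = Vec::with_capacity(self.lows.len());
        for (pos, &bit) in self.highs.iter().enumerate() {
            if bit {
                let i = out.len(); // index of the element this set bit belongs to
                let high = (pos - i) as u64;
                out.push((high << self.low_bits) | self.lows[i]);
            }
        }
        out
    }
}
```

When packed, this layout needs about `2 + log2(universe / n)` bits per element, which is why it suits long sorted ID lists.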
This is an educational project for me to learn Rust and vector databases.
Install prerequisites:
- Rust: https://www.rust-lang.org/tools/install
- Others
# macOS
brew install hdf5 protobuf
export HDF5_DIR="$(brew --prefix hdf5)"
Build:
# from top-level workspace
cargo build --release
Test:
cargo test --release
This project is done with TechCare Coaching. I mentor the mentees who contribute to this project.