hassancs91/SimplerVectors

SimplerVectors

SimplerVectors is a simple, beginner-friendly vector database for storing and querying large collections of high-dimensional vectors. It is especially suited to retrieval-augmented generation (RAG) projects built around language models. The project is in its early stages; its goal is high performance and scalability on very large datasets.

Features

  • Efficient Storage: Leverages NumPy's memory-mapped files to manage large datasets efficiently.
  • Simple Design: The system is designed to be easy to understand and use, even for developers new to vector databases.
  • Vector Normalization and Search: Supports automatic vector normalization and cosine similarity search to find the most similar vectors.
  • Scalability (in development): Ongoing work focuses on handling larger data volumes and higher query loads.

Usage

Initializing the Database
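As a rough illustration of what initialization involves, here is a minimal sketch of a file-backed vector array built on NumPy's `np.memmap`. The helper `open_store`, the file name `vectors.dat`, and the fixed capacity and dimension are assumptions made for this example; they are not SimplerVectors' actual API.

```python
import os
import tempfile

import numpy as np


def open_store(path, capacity, dim, dtype=np.float32):
    """Open (or create) a file-backed array of shape (capacity, dim).

    Hypothetical helper for illustration; not part of SimplerVectors.
    """
    # "w+" creates a zero-filled file; "r+" reopens an existing one read/write.
    mode = "r+" if os.path.exists(path) else "w+"
    return np.memmap(path, dtype=dtype, mode=mode, shape=(capacity, dim))


# Example: room for 1,000 vectors of dimension 8, stored in a temp directory.
store_dir = tempfile.mkdtemp()
store_path = os.path.join(store_dir, "vectors.dat")
store = open_store(store_path, capacity=1000, dim=8)
store[0] = np.arange(8, dtype=np.float32)
store.flush()  # write pending changes to disk

# Reopening maps the same file without reading all of it into memory.
reopened = open_store(store_path, capacity=1000, dim=8)
```

Because the array lives in a file, only the pages that are actually touched are loaded, which is what makes this approach practical for datasets larger than RAM.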

Adding Vectors

Add vectors to the database with optional metadata:
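A minimal sketch of this step, assuming an in-memory store for brevity. The class `TinyVectorStore` and its `add` method are hypothetical names chosen for illustration and should not be read as the SimplerVectors API; the point is only to show normalization on insert and metadata kept alongside each vector.

```python
import numpy as np


class TinyVectorStore:
    """Illustrative in-memory store; hypothetical, not the SimplerVectors API."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = np.empty((0, dim), dtype=np.float32)
        self.metadata = []  # metadata[i] belongs to vectors[i]

    def add(self, vector, metadata=None, normalize=True):
        """Append one vector and return its row index."""
        vec = np.asarray(vector, dtype=np.float32)
        if vec.shape != (self.dim,):
            raise ValueError(f"expected shape ({self.dim},), got {vec.shape}")
        if normalize:
            norm = np.linalg.norm(vec)
            if norm > 0:  # leave all-zero vectors unchanged
                vec = vec / norm
        self.vectors = np.vstack([self.vectors, vec])
        self.metadata.append(metadata)
        return len(self.metadata) - 1


store = TinyVectorStore(dim=3)
first = store.add([3.0, 0.0, 4.0], metadata={"text": "hello"})
second = store.add([0.0, 2.0, 0.0])  # metadata is optional
```

Storing unit-length vectors means a later cosine similarity search reduces to a plain dot product.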

Querying the Database

Perform a cosine similarity search to find the top 5 vectors similar to a given query vector:
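A minimal sketch of a top-5 cosine similarity search using plain NumPy. The function name `top_k_cosine` and the random test data are assumptions for this example, not the library's own interface.

```python
import numpy as np


def top_k_cosine(vectors, query, k=5):
    """Return (indices, scores) of the k rows most similar to query."""
    vectors = np.asarray(vectors, dtype=np.float64)
    query = np.asarray(query, dtype=np.float64)
    row_norms = np.linalg.norm(vectors, axis=1)
    query_norm = np.linalg.norm(query)
    # Guard against division by zero for all-zero vectors.
    denom = np.maximum(row_norms * query_norm, 1e-12)
    scores = (vectors @ query) / denom
    k = min(k, len(scores))
    top = np.argsort(-scores)[:k]  # indices sorted by descending score
    return top, scores[top]


rng = np.random.default_rng(0)
data = rng.normal(size=(100, 16))
query = data[42] * 3.0  # same direction as row 42, different length
indices, scores = top_k_cosine(data, query, k=5)
```

Cosine similarity ignores vector length, so the scaled copy of row 42 still ranks that row first. When stored vectors are already normalized, the division by `row_norms` can be skipped.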

Contributing

Contributions are welcome! This project is still in its early stages, and your input can help shape its future. Please fork the repository, create a feature branch, and submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Stay tuned... something big is coming soon!
