bhargav191098/intelligent-duck

Alex Extension for DuckDB 🦆

This repository is based on https://github.com/duckdb/extension-template. Check it out if you want to build and ship your own DuckDB extension 🚀

The alex extension lets you create an ALEX index structure on any integer-based data column of a table.

Build steps

Remember to clone the repository together with all of its submodules:

git clone --recurse-submodules https://github.com/bhargav191098/intelligent-duck.git

Now to build the extension, run:

make

The main binaries that will be built are:

./build/release/duckdb
./build/release/test/unittest
./build/release/extension/alex/alex.duckdb_extension
  • duckdb is the binary for the duckdb shell with the extension code automatically loaded.
  • unittest is the test runner of duckdb. Again, the extension is already linked into the binary.
  • alex.duckdb_extension is the loadable binary as it would be distributed.

Running the extension

To run the extension code, simply start the shell with ./build/release/duckdb.

To use the benchmark command, download the respective benchmarking datasets and place them in the same directory as alex_extension.cpp; otherwise the benchmarking functionality will not work.

  • create_alex_index : This pragma call creates an ALEX index on a specified column of a DuckDB table. It validates that the table and column exist, identifies the column's data type, and bulk loads the ALEX index according to that type.

  • alex_find : This pragma function performs a key-based search in an ALEX index. If a payload is associated with the provided key, it is displayed; otherwise, a message indicating that the payload was not found is returned.

  • alex_size : This pragma function retrieves and displays the total size of the ALEX index structure, including model and data sizes, converted to megabytes.

  • load_benchmark : Creates a SQL table and loads data from one of four benchmark sources into it, so that the table can then be indexed with the create_alex_index pragma function.

  • create_art_index : Similar to the create_alex_index pragma function, but creates an ART index instead, so that ART and ALEX indexes can be benchmarked and compared in DuckDB.

  • insert_into_table : Unlike bulk loading, this pragma function inserts key-value pairs into the table one at a time and adds each key to the ALEX index in the same step, mimicking the behavior of the standard SQL insert command.

  • auxillary_storage_size : This pragma function calculates the total size of the auxiliary storage used by the ALEX index structure.
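
For readers new to learned indexes, the bulk load, key lookup, and single insert operations described above can be illustrated with a toy model. The sketch below is not the extension's code and not ALEX itself: ALEX uses an adaptive tree of models over gapped arrays, while this hypothetical `LinearLearnedIndex` class fits a single linear model over one sorted array. It only shows the core idea: predict a key's position from the key, then correct the guess with a short local search.

```python
import bisect


class LinearLearnedIndex:
    """Toy single-model learned index over integer keys (illustration only)."""

    def __init__(self):
        self.keys = []      # sorted keys
        self.payloads = []  # payloads aligned with self.keys
        self.slope = 0.0
        self.intercept = 0.0

    def bulk_load(self, pairs):
        """Sort all (key, payload) pairs once, then fit the position model."""
        pairs = sorted(pairs, key=lambda kv: kv[0])
        self.keys = [k for k, _ in pairs]
        self.payloads = [p for _, p in pairs]
        self._fit()

    def _fit(self):
        """Fit position ~ slope * key + intercept through the two end points."""
        n = len(self.keys)
        if n < 2 or self.keys[-1] == self.keys[0]:
            self.slope, self.intercept = 0.0, 0.0
            return
        self.slope = (n - 1) / (self.keys[-1] - self.keys[0])
        self.intercept = -self.slope * self.keys[0]

    def _predict(self, key):
        """Return the model's guess, clamped to a valid array position."""
        guess = int(self.slope * key + self.intercept)
        return max(0, min(len(self.keys) - 1, guess))

    def find(self, key):
        """Return the payload stored for key, or None if the key is absent."""
        n = len(self.keys)
        if n == 0:
            return None
        lo = hi = self._predict(key)
        # Widen the window exponentially until it brackets the key.
        step = 1
        while lo > 0 and self.keys[lo] > key:
            lo = max(0, lo - step)
            step *= 2
        step = 1
        while hi < n - 1 and self.keys[hi] < key:
            hi = min(n - 1, hi + step)
            step *= 2
        # Binary search inside the bracketed window only.
        i = bisect.bisect_left(self.keys, key, lo, hi + 1)
        if i < n and self.keys[i] == key:
            return self.payloads[i]
        return None

    def insert(self, key, payload):
        """Insert one pair at its sorted position and refit the model."""
        i = bisect.bisect_left(self.keys, key)
        self.keys.insert(i, key)
        self.payloads.insert(i, payload)
        self._fit()
```

In this sketch, `bulk_load` plays the role of index creation, `find` the role of a key lookup, and `insert` the role of a single-row insert. The better the model fits the key distribution, the smaller the window the local search has to cover.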

Benchmark datasets

Evaluation graphs

The learned index provides high throughput with low memory overhead. Running it on x86-based CPUs gives a further boost.
