Indexing system for heterogeneous storages.
- GFLAGS
- GCC >= 4.9.0 (You need to replace the atomic functions with other implementation if your compiler doesn't support C11.)
- CPU: Intel Xeon Gold 6230N
- DRAM: 192GB DDR4 2666
- OS: CentOS 7 Linux
- Compiler: GCC 10.2.1 (Red Hat Developer Toolset 10.1)
- CPU: AMD Ryzen 5 3600
- DRAM: 16G DDR4 3200
- OS: WSL2-Ubuntu 18.04.5 LTS
- Compiler: gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
- Make: GNU Make 4.1 Built for x86_64-pc-linux-gnu
(Suppose H
is the hash table, T
the ART and x
the key, INB
Internal Node Buffer. )
- ART internal node buffer(INB):
- Caches a few KV pair on the path, reducing DRAM accesses and hence overhead. The design goal is that shallow layers of the ART should hold hot data, while deep layers should hold colder / io-type data.
- Node split will create more buffer.
- Internal node split
- When node type changes, the buffer will also be copied.
- When a child is removed from a node4, buffered entries will become new children of that node4. So when a path compression takes place (i.e., when a node4 has only 1 child), the buffer is guaranteed to be empty.
Update(H, x)
: SINK: If the LRU key is less frequently used than updating key, then kick LRU key.Delete(H, x)
: A trigger is set for batch COMPACT (Say,load_factor < MIN_LOAD_FACTOR and there are quite some in ART-INB
)Expand(H)
: Auto COMPACT.Update(T, x)
: FLOAT: Iffreq(x) > freq(x')
for anx'
in parent's buffer, replacex'
withx
and kickx'
down.Query(T, x)
: FLOAT: Just likeUpdate(T, x)
- APIs (CRUD)
- Implement backbone data structures: Hash table, ART.
- Basic DRAM version
- Fix value storage for ART.
- Count-Min speedup:
- One hash computation for multiple lines - chop 128 bit hash value into several.
- Sharing hash computation for hash table and Count-Min.
- Implement index compact: hash refill + buffer lifting.
- Smart data placing schedule (SINK, FLOAT).
- Count-Min Auto cooldown.
- Optimize CMS usage to make query faster (than insert, presumably).
- Concurrency on DRAM:
- Hash
- ART: will use ROWEX
-
local modifications
:-
atomic accesses
: node4 & node16'skey
bytes, node48'schild indexes
and allchildren pointers
. - For node4 and node16, key sorting need to be deleted. When insert, just add new keys at the end. When delete, just set children ptr to NULL
-
-
node replacement
: [1] lock node and parent. [2] copy node info to new node. [3] change parent's child pointer to new node. [4] unlock node and mark as OBSOLETE. -
internal split
: [1] install new node [2] truncate the prefix (change both prefix and its length in single atomic store) -
compaction
-
- Test methods and perhaps a benchmark
- PMem version
- PMem Crash consistency
- Hash: (Mechanism: CAS to mark key as occupied (invalid) -> store value (invalid) -> store real key (valid))
- ART:
-
add_child
-
split
-
remove_child
-
compaction
-
- IO device version: storage and optimization
- Intelligent storage
- Murmur3 from https://github.com/PeterScott/murmur3
- ART from https://github.com/armon/libart
- EBR GC from https://github.com/rmind/libqsbr