
Initial design to reduce metadata overhead #169
Merged · 7 commits · Jul 19, 2023

Conversation

@dryajov (Contributor) commented Jul 5, 2023

This document proposes a change to reduce metadata overhead in manifest files, as well as consolidating slot block handling with network block handling.

@benbierens

I like this plan. I've read the doc briefly and will read it again more carefully shortly, but I have a question for you:
Is it required that Merkle trees are binary, i.e. at most 2 elements per node?
I understand how the leaf index in binary gives you the path from root to leaf, and how the hashes along the way are useful for inclusion proofs. But I was thinking: what if we allow 4 elements per node? The leaf index would still represent the path from root to leaf, just read in chunks of 2 bits instead of 1. The same holds for 3 bits and 8 elements per node, etc. An approach like this could greatly reduce the number of levels in our trees for large datasets, and I don't see a downside. Your thoughts please! :D
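
A small sketch of the addressing scheme being described here (illustrative only, not Codex code): in a k-ary tree the leaf index written in base k gives the root-to-leaf path, one digit per level, so higher arity means fewer levels.

```python
# Hypothetical illustration (not Codex code): deriving the root-to-leaf path
# from a leaf index in a k-ary Merkle tree. For k = 2 each digit is one bit;
# for k = 4 each digit covers two bits, for k = 8 three bits, and so on.
def path_to_leaf(leaf_index: int, num_leaves: int, k: int = 2) -> list[int]:
    levels = 0
    while k ** levels < num_leaves:   # levels needed to address all leaves
        levels += 1
    digits = []
    for _ in range(levels):           # extract base-k digits, lowest level first
        digits.append(leaf_index % k)
        leaf_index //= k
    return list(reversed(digits))     # top-down: which child to take at each level

print(path_to_leaf(11, 16, k=2))  # [1, 0, 1, 1] -> 4 levels
print(path_to_leaf(11, 16, k=4))  # [2, 3]       -> 2 levels
```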

@tbekas commented Jul 5, 2023

@benbierens I think that 2 elements per node is optimal when it comes to the size of the proofs. Think about the extreme case when you have N leaves and N elements per node. The size of the proof for each of the N leaves will be O(N).

@dryajov I like the idea. I'm not convinced yet that we need to store the entire tree. I think we could (should) store only a Merkle proof next to each block, and that should suffice for all the use cases in my opinion.

It would take only about twice as much disk space, but I think it would reduce the complexity of the flows and possibly improve the performance of disk operations (sequential reads only).
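
A rough sketch of the layout tbekas is suggesting (hypothetical, not the actual Codex repo format): each stored block record carries its own inclusion proof, keyed by tree root and leaf index, so serving a verifiable block is a single sequential read rather than a walk over a separately stored tree.

```python
# Hypothetical layout (not the Codex repo format): each block record carries
# its own Merkle inclusion proof, keyed by (tree root, leaf index).
from dataclasses import dataclass

@dataclass
class BlockRecord:
    block: bytes         # the block data itself
    proof: list[bytes]   # sibling hashes from leaf to root, bottom-up

repo: dict[tuple[str, int], BlockRecord] = {}

def put_block(root: str, index: int, block: bytes, proof: list[bytes]) -> None:
    repo[(root, index)] = BlockRecord(block, proof)

def get_block_with_proof(root: str, index: int) -> BlockRecord:
    # one lookup returns both the data and everything needed to verify it
    return repo[(root, index)]
```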

1. Once new peers have been discovered and connected, go to step 1.1.1
2. Once blocks are received from the remote nodes
   1. The hashes are verified against the requested Merkle root and if they pass
      1. The block is persisted to the network

@tbekas commented Jul 5, 2023


Probably you meant persisted to the repo/local store

@markspanbroek (Member) left a comment


I like it! It makes a lot of sense to use a Merkle root instead of the block hashes in the manifest.


#### Announcing over the DHT

Also, datasets are now announced by their Merkle root instead of each individual block, as was the case in the previous implementation. Manifests are still announced exactly the same as before, by their cid. Announcing individual blocks is also supported (but not required) and can be useful in the case of bandwidth incentives.
Member


Would a Codex node that stores only the data for a single slot still announce that data under the Merkle root of the entire dataset?

@dryajov (Contributor, Author)


Good question - yes, so far I don't see a better way of doing it. There isn't a good way of announcing block ranges over a DHT, and even if there were, the blocks in a slot aren't contiguous. Definitely not ideal, and something to investigate further.
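
A toy sketch of the announcement scheme being discussed (hypothetical interface, not the Codex or libp2p DHT API): a node holding only one slot's blocks still advertises itself under the dataset's Merkle-root CID, since per-block or per-range announcements don't map well onto a DHT.

```python
# Hypothetical in-memory stand-in for a DHT provider table (not a real API):
# peers announce under the dataset's Merkle-root CID even when they only hold
# a single slot's worth of blocks.
class ProviderTable:
    def __init__(self):
        self.providers: dict[str, set[str]] = {}  # CID -> peer ids

    def provide(self, cid: str, peer_id: str) -> None:
        self.providers.setdefault(cid, set()).add(peer_id)

    def find_providers(self, cid: str) -> set[str]:
        return self.providers.get(cid, set())

dht = ProviderTable()
dataset_root_cid = "zDv...datasetRoot"  # made-up CID of the whole dataset

# A node hosting only slot 3 still announces under the dataset root.
dht.provide(dataset_root_cid, peer_id="peer-holding-slot-3")

# A downloader finds candidate peers via the dataset root, then asks each
# peer which leaves it can actually serve.
print(dht.find_providers(dataset_root_cid))
```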

@dryajov (Contributor, Author) commented Jul 5, 2023

> @benbierens I think that 2 elements per node is optimal when it comes to the size of the proofs. Think about the extreme case when you have N leaves and N elements per node. The size of the proof for each of the N leaves will be O(N).

Exactly: the size of the proofs increases with the arity of the tree.
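
To make that trade-off concrete, here is a back-of-the-envelope sketch (illustrative only): an inclusion proof in a k-ary tree needs k - 1 sibling hashes per level over roughly log_k(N) levels, so a wider tree is shallower but each proof step is fatter, and k = 2 minimizes the total.

```python
# Illustrative only: count the sibling hashes in an inclusion proof for a
# Merkle tree with num_leaves leaves and k children per node.
def proof_hashes(num_leaves: int, k: int) -> int:
    levels = 0
    while k ** levels < num_leaves:   # tree depth needed to cover all leaves
        levels += 1
    return levels * (k - 1)           # k - 1 sibling hashes per level

for k in (2, 4, 8):
    print(k, proof_hashes(2**20, k))
# k=2 -> 20 hashes, k=4 -> 30 hashes, k=8 -> 49 hashes
```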

> @dryajov I like the idea. I'm not convinced yet that we need to store the entire tree. I think we could (should) store only a Merkle proof next to each block, and that should suffice for all the use cases in my opinion.
>
> It would take only about twice as much disk space, but I think it would reduce the complexity of the flows and possibly improve the performance of disk operations (sequential reads only).

The problem with this approach is that, in the new flow, you only know the root of the tree and the index of the leaf; you don't know the actual leaf (block hash) beforehand. The Merkle tree serves as a kind of index: you read the tree to fetch the leaf (the block hash), and only then can you read the block from the repo.
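
A minimal sketch of that flow (hypothetical store layout, not Codex code): the stored tree maps (dataset root, leaf index) to a block hash, and only that hash lets you look the block up in the repo.

```python
# Hypothetical "tree as index" flow: only the dataset root and a leaf index
# are known up front; the stored tree yields the block hash, which is then
# used to fetch the block from the local repo.
trees: dict[str, list[str]] = {}   # tree root -> leaf hashes (block hashes)
repo: dict[str, bytes] = {}        # block hash -> block data

def fetch_block(root: str, index: int) -> bytes:
    block_hash = trees[root][index]  # step 1: read the leaf from the stored tree
    return repo[block_hash]          # step 2: read the block from the repo by hash
```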
