Skip to content

Commit

Permalink
fix!: store/cachekv: reduce growth factor for iterator ranging using …
Browse files Browse the repository at this point in the history
…binary searches (cosmos#10024)

This change takes the observation that previous dbm.IsKeyInDomain
which searches for [start, end) was performing too many byteslice
comparisons. Instead we start off by sorting all the values in the
store.unsortedCache, and then apply a modified binary search to
look for values that fall within the domain [start, end)
The procedure involves:
* iterating over all items to build a list of all keys -- O(n)
* invoking sort.Strings immediately, of which
we anyways eventually invoke sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
* invoking modified binary search which is O(log(n)) * 2 ~ O(log(n))
to search for the [start, end) range indices

for a total approximate complexity of:
Best case:  O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n))       ~= O(n^2)

instead of previously:
* iterating over all the unsorted items and invoking dbm.IsKeyInDomain:
bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end)
for overall complexity of O(n*s*e)
* invoking sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case

for a total approximate complexity of:
Best case:  O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2)
Worst case: O(n) + O(n*s*e) + O(n^2)     ~= O(n*s*e) ~ O(n^2)

Ordinarily we'd combine the n*s*e to be n*m, but really the comparisons
between (start & key, end & key) are profound that it makes sense to
keep them as factors. The overall benchmark results vindicate our choice
of isolating the factors (n*s*e)

The benchmarks show that as the number of keys to iterate grows, the
new code grows gracefully in a somewhat linear growth, notice for
CAcheKVStoreIterator*, when we go from:
* 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new
* 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new

```shell
time/op
GetValidator-8	              5.8ms ± 2%    4.7ms ± 1%	-17.69%	(p=0.000 n=10+10)
OneBankSendTxPerBlock-8	      3.2ms ± 2%    2.8ms ± 1%	-10.80%	(p=0.000 n=7+10)
OneBankMultiSendTxPerBlock-8  3.1ms ± 3%    2.9ms ± 2%	-8.36%	(p=0.000 n=10+10)
AccountMapperSetAccount-8     8.6µs ± 1%    7.8µs ± 1%	-9.74%	(p=0.000 n=10+10)
CacheKVStoreIterator500-8     64µs ± 6%	    51µs ± 6%	-19.22%	(p=0.000 n=10+9)
CacheKVStoreIterator1000-8    0.12ms ± 4%   95µs ± 4%	-19.55%	(p=0.000 n=10+10)
CacheKVStoreIterator10000-8   1.6ms ± 4%    0.90ms ± 1%	-42.11%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8   19ms ± 5%	    5.5ms ± 1%	-71.35%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10s ± 23%   17ms ± 7%	-83.44%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   1.3µs ± 6%    0.90µs ± 3%	-31.19%	(p=0.000 n=9+9)
CacheKVStoreGetKeyFound-8     0.66µs ± 6%   0.56µs ± 2%	-14.81%	(p=0.000 n=10+9)

alloc/op
B/op
BlockProvision-8	     0.11kB ± 0%    0.10kB ± 0%	-7.14%	(p=0.000 n=10+10)
CacheKVStoreIterator50000-8  0.89MB ± 6%    0.53MB ± 1%	-40.85%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 6.3MB ± 23%    1.6MB ± 6%	-74.17%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  0.26kB ± 0%    0.23kB ± 1%	-11.53%	(p=0.000 n=10+8)

allocs/op (count)
AccountMapperSetAccount-8    42 ± 0%	    38 ± 0%	-9.52%	(p=0.000 n=10+10)
BlockProvision-8	     6.0 ± 0%	    5.0 ± 0%	-16.67%	(p=0.000 n=10+10)
CacheKVStoreIterator1000-8   14 ± 0%	    13 ± 0%	-7.14%	(p=0.002 n=8+10)
CacheKVStoreIterator10000-8  0.15k ± 2%	    76 ± 1%	-49.00%	(p=0.000 n=7+10)
CacheKVStoreIterator50000-8  8.9k ± 11%	    2.0k ± 2%	-77.60%	(p=0.000 n=10+10)
CacheKVStoreIterator100000-8 0.10M ± 26%    13k ± 12%	-86.89%	(p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8  5.0 ± 0%	    4.0 ± 0%	-20.00%	(p=0.000 n=10+10)
```

Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue cosmos#10023

Updates cosmos#9876

/cc @cuonglm @kirbyquerby

<!--
The default pull request template is for types feat, fix, or refactor.
For other templates, add one of the following parameters to the url:
- template=docs.md
- template=other.md
-->

## Description

Closes: #XXXX

<!-- Add a description of the changes that this PR introduces and the files that
are the most critical to review. -->

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
  • Loading branch information
odeke-em authored Oct 14, 2021
1 parent 3d8cac3 commit 3c85944
Show file tree
Hide file tree
Showing 3 changed files with 268 additions and 12 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ Ref: https://keepachangelog.com/en/1.0.0/
* [\#9837](https://github.com/cosmos/cosmos-sdk/issues/9837) `--generate-only` flag will accept the keyname now.
* [\#10045](https://github.com/cosmos/cosmos-sdk/pull/10045) Revert [#8549](https://github.com/cosmos/cosmos-sdk/pull/8549). Do not route grpc queries through Tendermint.
* [\#10326](https://github.com/cosmos/cosmos-sdk/pull/10326) `x/authz` add query all grants by granter query.
* [\#10024](https://github.com/cosmos/cosmos-sdk/pull/10024) `store/cachekv` performance improvement by reduced growth factor for iterator ranging by using binary searches to find dirty items when unsorted key count >= 1024

### API Breaking Changes

Expand Down
141 changes: 141 additions & 0 deletions store/cachekv/search_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,141 @@
package cachekv

import "testing"

func TestFindStartIndex(t *testing.T) {
tests := []struct {
name string
sortedL []string
query string
want int
}{
{
name: "non-existent value",
sortedL: []string{"a", "b", "c", "d", "e", "l", "m", "n", "u", "v", "w", "x", "y", "z"},
query: "o",
want: 8,
},
{
name: "dupes start at index 0",
sortedL: []string{"a", "a", "a", "b", "c", "d", "e", "l", "m", "n", "u", "v", "w", "x", "y", "z"},
query: "a",
want: 0,
},
{
name: "dupes start at non-index 0",
sortedL: []string{"a", "c", "c", "c", "c", "d", "e", "l", "m", "n", "u", "v", "w", "x", "y", "z"},
query: "c",
want: 1,
},
{
name: "at end",
sortedL: []string{"a", "e", "u", "v", "w", "x", "y", "z"},
query: "z",
want: 7,
},
{
name: "dupes at end",
sortedL: []string{"a", "e", "u", "v", "w", "x", "y", "z", "z", "z", "z"},
query: "z",
want: 7,
},
{
name: "entirely dupes",
sortedL: []string{"z", "z", "z", "z", "z"},
query: "z",
want: 0,
},
{
name: "non-existent but within >=start",
sortedL: []string{"z", "z", "z", "z", "z"},
query: "p",
want: 0,
},
{
name: "non-existent and out of range",
sortedL: []string{"d", "e", "f", "g", "h"},
query: "z",
want: -1,
},
}

for _, tt := range tests {
tt := tt
t.Run(tt.name, func(t *testing.T) {
body := tt.sortedL
got := findStartIndex(body, tt.query)
if got != tt.want {
t.Fatalf("Got: %d, want: %d", got, tt.want)
}
})
}
}

func TestFindEndIndex(t *testing.T) {
tests := []struct {
name string
sortedL []string
query string
want int
}{
{
name: "non-existent value",
sortedL: []string{"a", "b", "c", "d", "e", "l", "m", "n", "u", "v", "w", "x", "y", "z"},
query: "o",
want: 7,
},
{
name: "dupes start at index 0",
sortedL: []string{"a", "a", "a", "b", "c", "d", "e", "l", "m", "n", "u", "v", "w", "x", "y", "z"},
query: "a",
want: 0,
},
{
name: "dupes start at non-index 0",
sortedL: []string{"a", "c", "c", "c", "c", "d", "e", "l", "m", "n", "u", "v", "w", "x", "y", "z"},
query: "c",
want: 1,
},
{
name: "at end",
sortedL: []string{"a", "e", "u", "v", "w", "x", "y", "z"},
query: "z",
want: 7,
},
{
name: "dupes at end",
sortedL: []string{"a", "e", "u", "v", "w", "x", "y", "z", "z", "z", "z"},
query: "z",
want: 7,
},
{
name: "entirely dupes",
sortedL: []string{"z", "z", "z", "z", "z"},
query: "z",
want: 0,
},
{
name: "non-existent and out of range",
sortedL: []string{"z", "z", "z", "z", "z"},
query: "p",
want: -1,
},
{
name: "non-existent and out of range",
sortedL: []string{"d", "e", "f", "g", "h"},
query: "z",
want: 4,
},
}

for _, tt := range tests {
tt := tt
t.Run(tt.name, func(t *testing.T) {
body := tt.sortedL
got := findEndIndex(body, tt.query)
if got != tt.want {
t.Fatalf("Got: %d, want: %d", got, tt.want)
}
})
}
}
138 changes: 126 additions & 12 deletions store/cachekv/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -179,8 +179,94 @@ func (store *Store) iterator(start, end []byte, ascending bool) types.Iterator {
return newCacheMergeIterator(parent, cache, ascending)
}

func findStartIndex(strL []string, startQ string) int {
// Modified binary search to find the very first element in >=startQ.
if len(strL) == 0 {
return -1
}

var left, right, mid int
right = len(strL) - 1
for left <= right {
mid = (left + right) >> 1
midStr := strL[mid]
if midStr == startQ {
// Handle condition where there might be multiple values equal to startQ.
// We are looking for the very first value < midStL, that i+1 will be the first
// element >= midStr.
for i := mid - 1; i >= 0; i-- {
if strL[i] != midStr {
return i + 1
}
}
return 0
}
if midStr < startQ {
left = mid + 1
} else { // midStrL > startQ
right = mid - 1
}
}
if left >= 0 && left < len(strL) && strL[left] >= startQ {
return left
}
return -1
}

func findEndIndex(strL []string, endQ string) int {
if len(strL) == 0 {
return -1
}

// Modified binary search to find the very first element <endQ.
var left, right, mid int
right = len(strL) - 1
for left <= right {
mid = (left + right) >> 1
midStr := strL[mid]
if midStr == endQ {
// Handle condition where there might be multiple values equal to startQ.
// We are looking for the very first value < midStL, that i+1 will be the first
// element >= midStr.
for i := mid - 1; i >= 0; i-- {
if strL[i] < midStr {
return i + 1
}
}
return 0
}
if midStr < endQ {
left = mid + 1
} else { // midStrL > startQ
right = mid - 1
}
}

// Binary search failed, now let's find a value less than endQ.
for i := right; i >= 0; i-- {
if strL[i] < endQ {
return i
}
}

return -1
}

type sortState int

const (
stateUnsorted sortState = iota
stateAlreadySorted
)

// Constructs a slice of dirty items, to use w/ memIterator.
func (store *Store) dirtyItems(start, end []byte) {
startStr, endStr := conv.UnsafeBytesToStr(start), conv.UnsafeBytesToStr(end)
if startStr > endStr {
// Nothing to do here.
return
}

n := len(store.unsortedCache)
unsorted := make([]*kv.Pair, 0)
// If the unsortedCache is too big, its costs too much to determine
Expand All @@ -189,24 +275,49 @@ func (store *Store) dirtyItems(start, end []byte) {
// O(N^2) overhead.
// Even without that, too many range checks eventually becomes more expensive
// than just not having the cache.
if n >= 1024 {
for key := range store.unsortedCache {
cacheValue := store.cache[key]
unsorted = append(unsorted, &kv.Pair{Key: []byte(key), Value: cacheValue.value})
}
} else {
// else do a linear scan to determine if the unsorted pairs are in the pool.
if n < 1024 {
for key := range store.unsortedCache {
if dbm.IsKeyInDomain(conv.UnsafeStrToBytes(key), start, end) {
cacheValue := store.cache[key]
unsorted = append(unsorted, &kv.Pair{Key: []byte(key), Value: cacheValue.value})
}
}
store.clearUnsortedCacheSubset(unsorted, stateUnsorted)
return
}

// Otherwise it is large so perform a modified binary search to find
// the target ranges for the keys that we should be looking for.
strL := make([]string, 0, n)
for key := range store.unsortedCache {
strL = append(strL, key)
}
store.clearUnsortedCacheSubset(unsorted)
sort.Strings(strL)

// Now find the values within the domain
// [start, end)
startIndex := findStartIndex(strL, startStr)
endIndex := findEndIndex(strL, endStr)

if endIndex < 0 {
endIndex = len(strL) - 1
}
if startIndex < 0 {
startIndex = 0
}

kvL := make([]*kv.Pair, 0)
for i := startIndex; i <= endIndex; i++ {
key := strL[i]
cacheValue := store.cache[key]
kvL = append(kvL, &kv.Pair{Key: []byte(key), Value: cacheValue.value})
}

// kvL was already sorted so pass it in as is.
store.clearUnsortedCacheSubset(kvL, stateAlreadySorted)
}

func (store *Store) clearUnsortedCacheSubset(unsorted []*kv.Pair) {
func (store *Store) clearUnsortedCacheSubset(unsorted []*kv.Pair, sortState sortState) {
n := len(store.unsortedCache)
if len(unsorted) == n { // This pattern allows the Go compiler to emit the map clearing idiom for the entire map.
for key := range store.unsortedCache {
Expand All @@ -217,9 +328,12 @@ func (store *Store) clearUnsortedCacheSubset(unsorted []*kv.Pair) {
delete(store.unsortedCache, conv.UnsafeBytesToStr(kv.Key))
}
}
sort.Slice(unsorted, func(i, j int) bool {
return bytes.Compare(unsorted[i].Key, unsorted[j].Key) < 0
})

if sortState == stateUnsorted {
sort.Slice(unsorted, func(i, j int) bool {
return bytes.Compare(unsorted[i].Key, unsorted[j].Key) < 0
})
}

for _, item := range unsorted {
if item.Value == nil {
Expand Down

0 comments on commit 3c85944

Please sign in to comment.