Time complexity of operations #162

Lars-Meijer · 2017-03-17T10:35:34Z

I think you have a good grasp of the (avarage/estimated) time complexity of operations. I think it would be a good idea to document these in the commands section. This would be especially useful for: SET, GET and the search commands. It might also be useful for FSET, JSET, KEYS ect.

tidwall · 2017-03-17T15:11:48Z

Thanks for the suggestion and I totally agree.
The complexity varies slightly depending on which options are used for each command, but for the most part it should be straight forward to document.
Each command at tile38.com/commands should have their Big O listed.

Lars-Meijer · 2017-03-31T08:55:08Z

You've mentioned earlier that SET has a time complexity of log(mn), I am assuming one is the number of points in the set, what is the other?

If I have 1 set with 100 points I can easily update them 20k/s, but when I have 10k sets with 100 points each (1 million points in total), latency goes up a lot, is the total number of points in the DB also a factor?

I have a lot of points that I keep track of, with lots of updates but almost no get/search queries, so any boost in SET performance would greatly help :)

Edit:
This example is a little extreme, when I use 500,000 points and split it into 10 collections of 50,000 points performance seems to be only marginally better than 1 collection of 500,000 points

tidwall · 2017-03-31T13:23:31Z

The log(mN) is a log(m)+log(n) representing two operations.

Setting a point requires two operations:

Adding the point to the btree which is ordered lexographically by the point ID.
Adding the point to the rtree.

The btree insert complexity is O(log n).
The rtree insert complexity is often on average O(log n). Though it can get more complex depending on factors such as rectangle overlap, and the insertion algorithm. Just to note, the current implementation that Tile38 is using is the standard r-tree with quadratic splits. We may later change this to an r*tree split.

The more points you add the slower it gets but eventually it should level out.

This example is a little extreme, when I use 500,000 points and split it into 10 collections of 50,000 points performance seems to be only marginally better than 1 collection of 500,000 points

You will probably not see much in the way of a performance benefit from splitting up the data set over multiple collections because the collection must be located prior to setting the point. Collections are mainly good for organizing your data like namespaces.

Lars-Meijer · 2017-03-31T13:27:43Z

You will probably not see much in the way of a performance benefit from splitting up the data set over multiple collections because the collection must be located prior to setting the point. Collections are mainly good for organizing your data like namespaces.

I see, I thought each collection would keep its own btree and rtree, thus having faster performance for smaller sets. From my testing it does seem to level out when more points are added.

tidwall · 2017-03-31T13:34:51Z

Each collection does have it's own btree and rtree, but the collections themselves are stored in a parent btree. So each SET must traverse the parent btree first. As far as overall complexity goes, the collection btree is just an extension of the parent btree. Thus not adding any real benefit.

If the collections were stored in a different structure such as a hashmap, then we would see a bonus... at the expense of not having ordered keys.

tidwall added the enhancement label Mar 17, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Time complexity of operations #162

Time complexity of operations #162

Lars-Meijer commented Mar 17, 2017

tidwall commented Mar 17, 2017

Lars-Meijer commented Mar 31, 2017 •

edited

Loading

tidwall commented Mar 31, 2017

Lars-Meijer commented Mar 31, 2017

tidwall commented Mar 31, 2017

Time complexity of operations #162

Time complexity of operations #162

Comments

Lars-Meijer commented Mar 17, 2017

tidwall commented Mar 17, 2017

Lars-Meijer commented Mar 31, 2017 • edited Loading

tidwall commented Mar 31, 2017

Lars-Meijer commented Mar 31, 2017

tidwall commented Mar 31, 2017

Lars-Meijer commented Mar 31, 2017 •

edited

Loading