Make instance_list perform per-cell batching
This makes the instance_list module support batching across cells
with a couple of different strategies, and with room to add more
in the future.

Before this change, an instance list with limit 1000 to a
deployment with 10 cells would generate a query to each cell
database with the same limit. Thus, that API request could end
up processing up to 10,000 instance records despite only
returning 1000 to the user (because of the limit).

This uses the batch functionality in the base code added in
Iaa4759822e70b39bd735104d03d4deec988d35a1
by providing a couple of strategies for determining the per-cell
batch size. These should provide significant gains in the short
term, and more strategies can be added later as we identify ones
with additional benefits.

Closes-Bug: #1787977
Change-Id: Ie3a5f5dc49f8d9a4b96f1e97f8a6ea0b5738b768
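
The batch-size logic this commit adds can be sketched as a standalone function (a minimal mirror of the diffed helper, with plain arguments standing in for the ``CONF.api`` options and ``max_limit=1000`` as the illustrative default):

```python
def batch_size_for_cells(limit, num_cells, strategy='distributed',
                         fixed_size=100, max_limit=1000):
    """Sketch of get_instance_list_cells_batch_size without oslo.config.

    Returns the per-cell query limit for a list request.
    """
    limit = limit or max_limit

    # One (or zero) cells: no batching, pull the full limit in one shot.
    if num_cells <= 1:
        return limit

    if strategy == 'fixed':
        # Static batch size regardless of cell count.
        batch_size = fixed_size
    else:
        # 'distributed': even partitioning plus 10% headroom.
        batch_size = int((limit / num_cells) * 1.10)

    # Clamp: never larger than the request limit, never below the
    # 100-record floor.
    return max(min(batch_size, limit), 100)

# A 1000-record request against 10 cells now queries 110 records per
# cell instead of 1000 per cell (10,000 rows total).
print(batch_size_for_cells(1000, 10))             # 110
print(batch_size_for_cells(1000, 4, 'fixed', 200))  # 200
```

This reproduces the arithmetic exercised by the unit tests in the diff: two cells at limit 1000 batch at 550, and a limit of 110 across three cells falls back to the 100-record floor.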
kk7ds committed Aug 27, 2018
1 parent 7bc6de3 commit c3a77f8
Showing 4 changed files with 229 additions and 5 deletions.
36 changes: 35 additions & 1 deletion nova/compute/instance_list.py
@@ -98,6 +98,37 @@ def get_instances_sorted(ctx, filters, limit, marker, columns_to_join,
ctx, filters, limit, marker, columns_to_join=columns_to_join)


def get_instance_list_cells_batch_size(limit, cells):
"""Calculate the proper batch size for a list request.
This will consider config, request limit, and cells being queried and
return an appropriate batch size to use for querying said cells.
:param limit: The overall limit specified in the request
:param cells: The list of CellMapping objects being queried
:returns: An integer batch size
"""
strategy = CONF.api.instance_list_cells_batch_strategy
limit = limit or CONF.api.max_limit

if len(cells) <= 1:
# If we're limited to one (or no) cell for whatever reason, do
# not do any batching and just pull the desired limit from the
# single cell in one shot.
return limit

if strategy == 'fixed':
# Fixed strategy, always a static batch size
batch_size = CONF.api.instance_list_cells_batch_fixed_size
elif strategy == 'distributed':
# Distributed strategy, 10% more than even partitioning
batch_size = int((limit / len(cells)) * 1.10)

# We never query a larger batch than the total requested, and never
# smaller than the lower limit of 100.
return max(min(batch_size, limit), 100)


def get_instance_objects_sorted(ctx, filters, limit, marker, expected_attrs,
sort_keys, sort_dirs):
"""Same as above, but return an InstanceList."""
@@ -115,11 +146,14 @@ def get_instance_objects_sorted(ctx, filters, limit, marker, expected_attrs,
context.load_cells()
cell_mappings = context.CELLS

batch_size = get_instance_list_cells_batch_size(limit, cell_mappings)

columns_to_join = instance_obj._expected_cols(expected_attrs)
instance_generator = get_instances_sorted(ctx, filters, limit, marker,
columns_to_join, sort_keys,
sort_dirs,
cell_mappings=cell_mappings)
cell_mappings=cell_mappings,
batch_size=batch_size)

if 'fault' in expected_attrs:
# We join fault above, so we need to make sure we don't ask
55 changes: 55 additions & 0 deletions nova/conf/api.py
@@ -265,6 +265,61 @@
are likely to have instances in all cells, then this should be
False. If you have many cells, especially if you confine tenants to a
small subset of those cells, this should be True.
"""),
cfg.StrOpt("instance_list_cells_batch_strategy",
choices=("fixed", "distributed"),
default="distributed",
help="""
This controls the method by which the API queries cell databases in
smaller batches during large instance list operations. If batching is
performed, a large instance list operation will request some fraction
of the overall API limit from each cell database initially, and will
re-request that same batch size as records are consumed (returned)
from each cell as necessary. Larger batches mean less chattiness
between the API and the database, but potentially more wasted effort
processing the results from the database which will not be returned to
the user. Any strategy will yield a batch size of at least 100 records,
to avoid a user causing many tiny database queries in their request.
``distributed`` (the default) will attempt to divide the limit
requested by the user by the number of cells in the system. This
requires counting the cells in the system initially, which will not be
refreshed until service restart or SIGHUP. The actual batch size will
be increased by 10% over the result of ($limit / $num_cells).
``fixed`` will simply request fixed-size batches from each cell, as
defined by ``instance_list_cells_batch_fixed_size``. If the limit is
smaller than the batch size, the limit will be used instead. If you do
not wish batching to be used at all, setting the fixed size equal to
the ``max_limit`` value will cause only one request per cell database
to be issued.
Possible values:
* distributed (default)
* fixed
Related options:
* instance_list_cells_batch_fixed_size
* max_limit
"""),
cfg.IntOpt("instance_list_cells_batch_fixed_size",
min=100,
default=100,
help="""
This controls the batch size of instances requested from each cell
database if ``instance_list_cells_batch_strategy`` is set to ``fixed``.
This integral value will define the limit issued to each cell every time
a batch of instances is requested, regardless of the number of cells in
the system or any other factors. Per the general logic called out in
the documentation for ``instance_list_cells_batch_strategy``, the
minimum value for this is 100 records per batch.
Related options:
* instance_list_cells_batch_strategy
* max_limit
"""),
]
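
As the help text above suggests, an operator who wants to avoid batching altogether can pin the fixed strategy to the API limit. A hypothetical ``nova.conf`` fragment (values are illustrative, not defaults):

```ini
[api]
# Disable per-cell batching: issue one full-size query per cell.
instance_list_cells_batch_strategy = fixed
instance_list_cells_batch_fixed_size = 1000
max_limit = 1000
```

Conversely, leaving the default ``distributed`` strategy lets the batch size shrink automatically as cells are added, at the cost of more round trips per request.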

132 changes: 128 additions & 4 deletions nova/tests/unit/compute/test_instance_list.py
@@ -46,6 +46,8 @@ def setUp(self):
self.context = nova_context.RequestContext()
self.useFixture(fixtures.SpawnIsSynchronousFixture())

self.flags(instance_list_cells_batch_strategy='fixed', group='api')

def test_compare_simple_instance_quirks(self):
# Ensure uuid,asc is added
ctx = instance_list.InstanceSortContext(['key0'], ['asc'])
@@ -94,7 +96,8 @@ def test_user_gets_subset_of_cells(self, mock_cm, mock_gi, mock_br):
user_context, {}, None, None, [], None, None)
mock_gi.assert_called_once_with(user_context, {}, None, None, [],
None, None,
cell_mappings=mock_cm.return_value)
cell_mappings=mock_cm.return_value,
batch_size=1000)

@mock.patch('nova.context.CELLS', new=FAKE_CELLS)
@mock.patch('nova.context.load_cells')
@@ -110,7 +113,8 @@ def test_admin_gets_all_cells(self, mock_cm, mock_gi, mock_br, mock_lc):
admin_context, {}, None, None, [], None, None)
mock_gi.assert_called_once_with(admin_context, {}, None, None, [],
None, None,
cell_mappings=FAKE_CELLS)
cell_mappings=FAKE_CELLS,
batch_size=100)
mock_cm.assert_not_called()
mock_lc.assert_called_once_with()

@@ -128,7 +132,8 @@ def test_user_gets_all_cells(self, mock_cm, mock_gi, mock_br, mock_lc):
user_context, {}, None, None, [], None, None)
mock_gi.assert_called_once_with(user_context, {}, None, None, [],
None, None,
cell_mappings=FAKE_CELLS)
cell_mappings=FAKE_CELLS,
batch_size=100)
mock_lc.assert_called_once_with()

@mock.patch('nova.context.CELLS', new=FAKE_CELLS)
@@ -147,7 +152,8 @@ def test_admin_gets_all_cells_anyway(self, mock_cm, mock_gi, mock_br,
admin_context, {}, None, None, [], None, None)
mock_gi.assert_called_once_with(admin_context, {}, None, None, [],
None, None,
cell_mappings=FAKE_CELLS)
cell_mappings=FAKE_CELLS,
batch_size=100)
mock_cm.assert_not_called()
mock_lc.assert_called_once_with()

@@ -177,3 +183,121 @@ def wrap(thing):

# return the results from the up cell, ignoring the down cell.
self.assertEqual(uuid_initial, uuid_final)

def test_batch_size_fixed(self):
fixed_size = 200
self.flags(instance_list_cells_batch_strategy='fixed', group='api')
self.flags(instance_list_cells_batch_fixed_size=fixed_size,
group='api')

# We call the batch size calculator with various arguments, including
# lists of cells which are just counted, so the cardinality is all that
# matters.

# One cell, so batch at $limit
ret = instance_list.get_instance_list_cells_batch_size(
1000, [mock.sentinel.cell1])
self.assertEqual(1000, ret)

# Two cells, so batch at $fixed_size
ret = instance_list.get_instance_list_cells_batch_size(
1000, [mock.sentinel.cell1, mock.sentinel.cell2])

self.assertEqual(fixed_size, ret)

# Four cells, so batch at $fixed_size
ret = instance_list.get_instance_list_cells_batch_size(
1000, [mock.sentinel.cell1, mock.sentinel.cell2,
mock.sentinel.cell3, mock.sentinel.cell4])
self.assertEqual(fixed_size, ret)

# Three cells, tiny limit, so batch at lower threshold
ret = instance_list.get_instance_list_cells_batch_size(
10, [mock.sentinel.cell1,
mock.sentinel.cell2,
mock.sentinel.cell3])
self.assertEqual(100, ret)

# Three cells, limit above floor, so batch at limit
ret = instance_list.get_instance_list_cells_batch_size(
110, [mock.sentinel.cell1,
mock.sentinel.cell2,
mock.sentinel.cell3])
self.assertEqual(110, ret)

def test_batch_size_distributed(self):
self.flags(instance_list_cells_batch_strategy='distributed',
group='api')

# One cell, so batch at $limit
ret = instance_list.get_instance_list_cells_batch_size(1000, [1])
self.assertEqual(1000, ret)

# Two cells so batch at ($limit/2)+10%
ret = instance_list.get_instance_list_cells_batch_size(1000, [1, 2])
self.assertEqual(550, ret)

# Four cells so batch at ($limit/4)+10%
ret = instance_list.get_instance_list_cells_batch_size(1000, [1, 2,
3, 4])
self.assertEqual(275, ret)

# Three cells, tiny limit, so batch at lower threshold
ret = instance_list.get_instance_list_cells_batch_size(10, [1, 2, 3])
self.assertEqual(100, ret)

# Three cells, small limit, so batch at lower threshold
ret = instance_list.get_instance_list_cells_batch_size(110, [1, 2, 3])
self.assertEqual(100, ret)

# No cells, so batch at $limit
ret = instance_list.get_instance_list_cells_batch_size(1000, [])
self.assertEqual(1000, ret)


class TestInstanceListBig(test.NoDBTestCase):
def setUp(self):
super(TestInstanceListBig, self).setUp()

cells = [objects.CellMapping(uuid=getattr(uuids, 'cell%i' % i),
name='cell%i' % i,
transport_url='fake:///',
database_connection='fake://')
for i in range(0, 3)]

insts = list([
dict(
uuid=getattr(uuids, 'inst%i' % i),
hostname='inst%i' % i)
for i in range(0, 100)])

self.cells = cells
self.insts = insts
self.context = nova_context.RequestContext()
self.useFixture(fixtures.SpawnIsSynchronousFixture())

@mock.patch('nova.db.api.instance_get_all_by_filters_sort')
@mock.patch('nova.objects.CellMappingList.get_all')
def test_get_instances_batched(self, mock_cells, mock_inst):
mock_cells.return_value = self.cells

def fake_get_insts(ctx, filters, limit, *a, **k):
for i in range(0, limit):
yield self.insts.pop()

mock_inst.side_effect = fake_get_insts
insts = instance_list.get_instances_sorted(self.context, {},
50, None,
[], ['hostname'], ['desc'],
batch_size=10)

# Make sure we returned exactly how many were requested
insts = list(insts)
self.assertEqual(50, len(insts))

# Since the instances are all uniform, we should have a
# predictable number of queries to the database. 5 queries
# would get us 50 results, plus one more gets triggered by the
# sort to fill the buffer for the first cell feeder that runs
# dry.
self.assertEqual(6, mock_inst.call_count)
11 changes: 11 additions & 0 deletions releasenotes/notes/instance-list-batching-45f90a8b13eef512.yaml
@@ -0,0 +1,11 @@
---
features:
- |
Instance list operations across cells are now made more efficient by batching queries
as a fraction of the total limit for a request. Before this, an instance list with a
limit of 1000 records (the default) would generate queries to each cell with that
limit, and potentially process/sort/merge $num_cells*$limit records, despite only
returning $limit to the user. The strategy can now be controlled via
``[api]/instance_list_cells_batch_strategy`` and related options to either use
fixed batch sizes, or a fractional value that scales with the number of configured
cells.
