Skip to content

Commit

Permalink
perf/ring_buffer: Use high order allocations for AUX buffers optimist…
Browse files Browse the repository at this point in the history
…ically

Currently, the AUX buffer allocator will use high-order allocations
for PMUs that don't support hardware scatter-gather chaining to ensure
large contiguous blocks of pages, and always use an array of single
pages otherwise.

There is, however, a tangible performance benefit in using larger chunks
of contiguous memory even in the latter case, that comes from not having
to fetch the next page's address at every page boundary. In particular,
a task running under Intel PT on an Atom CPU shows 1.5%-2% less runtime
penalty with a single multi-page output region in snapshot mode (no PMI)
than with multiple single-page output regions, from ~6% down to ~4%. For
the snapshot mode it does make a difference as it is intended to run over
long periods of time.

For this reason, change the allocation policy to always optimistically
start with the highest possible order when allocating pages for the AUX
buffer, desceding until the allocation succeeds or order zero allocation
fails.

Signed-off-by: Alexander Shishkin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
  • Loading branch information
virtuoso authored and Ingo Molnar committed Mar 9, 2019
1 parent 6ea98b4 commit 5768402
Showing 1 changed file with 15 additions and 17 deletions.
32 changes: 15 additions & 17 deletions kernel/events/ring_buffer.c
Original file line number Diff line number Diff line change
Expand Up @@ -598,29 +598,27 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
{
bool overwrite = !(flags & RING_BUFFER_WRITABLE);
int node = (event->cpu == -1) ? -1 : cpu_to_node(event->cpu);
int ret = -ENOMEM, max_order = 0;
int ret = -ENOMEM, max_order;

if (!has_aux(event))
return -EOPNOTSUPP;

if (event->pmu->capabilities & PERF_PMU_CAP_AUX_NO_SG) {
/*
* We need to start with the max_order that fits in nr_pages,
* not the other way around, hence ilog2() and not get_order.
*/
max_order = ilog2(nr_pages);
/*
* We need to start with the max_order that fits in nr_pages,
* not the other way around, hence ilog2() and not get_order.
*/
max_order = ilog2(nr_pages);

/*
* PMU requests more than one contiguous chunks of memory
* for SW double buffering
*/
if ((event->pmu->capabilities & PERF_PMU_CAP_AUX_SW_DOUBLEBUF) &&
!overwrite) {
if (!max_order)
return -EINVAL;
/*
* PMU requests more than one contiguous chunks of memory
* for SW double buffering
*/
if ((event->pmu->capabilities & PERF_PMU_CAP_AUX_SW_DOUBLEBUF) &&
!overwrite) {
if (!max_order)
return -EINVAL;

max_order--;
}
max_order--;
}

rb->aux_pages = kcalloc_node(nr_pages, sizeof(void *), GFP_KERNEL,
Expand Down

0 comments on commit 5768402

Please sign in to comment.