Summary of the Operations

Basics

  • transform applies a function to each element of the sequence, equivalent to the functional operation map
  • select takes the first N elements of the sequence that satisfy a condition (given as a selection mask or a predicate function)
  • unique returns unique elements within a sequence
  • histogram generates a summary of the statistical distribution of the sequence
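
The semantics of these four operations can be sketched sequentially in plain Python. This is only an illustration of what each operation computes, not the library's API: the function names and signatures are hypothetical, unique is assumed to collapse runs of consecutive equal elements (the std::unique convention), and histogram is assumed to use equal-width bins over a half-open range.

```python
def transform(seq, fn):
    # Apply fn to every element (functional "map").
    return [fn(x) for x in seq]

def select(seq, predicate):
    # Keep the elements for which the predicate holds, in their original order.
    return [x for x in seq if predicate(x)]

def select_by_mask(seq, mask):
    # Same as select, but driven by a precomputed selection mask.
    return [x for x, keep in zip(seq, mask) if keep]

def unique(seq):
    # Assumption: only consecutive duplicates are collapsed.
    out = []
    for x in seq:
        if not out or out[-1] != x:
            out.append(x)
    return out

def histogram(seq, bins, lower, upper):
    # Assumption: equal-width bins over [lower, upper); samples outside are ignored.
    counts = [0] * bins
    width = (upper - lower) / bins
    for x in seq:
        if lower <= x < upper:
            index = min(int((x - lower) / width), bins - 1)
            counts[index] += 1
    return counts
```

For example, under these assumptions `unique([1, 1, 2, 2, 2, 3, 1])` keeps the trailing `1`, because it is not adjacent to the earlier run of ones.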

Aggregation

  • reduce traverses the sequence while accumulating some data, equivalent to the functional operation fold_left
  • scan is the cumulative version of reduce which returns the sequence of the intermediate values taken by the accumulator
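
The relationship between reduce and scan may be easier to see in a small sequential sketch. The helper names `fold` and `inclusive_scan` are hypothetical and use only the Python standard library; they illustrate the semantics, not a parallel implementation.

```python
from functools import reduce
from itertools import accumulate
import operator

def fold(seq, op, initial):
    # Left fold: combine the accumulator with each element in turn.
    return reduce(op, seq, initial)

def inclusive_scan(seq, op):
    # Every intermediate accumulator value, including the final one.
    return list(accumulate(seq, op))

data = [3, 1, 4, 1, 5]
total = fold(data, operator.add, 0)
running = inclusive_scan(data, operator.add)
```

With addition as the operator and 0 as the initial value, the last element of `running` equals `total`, which is the sense in which scan is the cumulative version of reduce.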

Differentiation

  • adjacent_difference computes the difference between the current element and the previous or next one in the sequence
  • discontinuity detects value change between the current element and the previous or next one in the sequence
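
A minimal sequential sketch of both operations, looking at the previous element only. The names are hypothetical, and the treatment of the first element (copied unchanged for the difference, flagged as a change for the discontinuity) is an assumed convention rather than something stated above.

```python
def adjacent_difference(seq):
    # Difference between each element and its predecessor.
    # Assumption: the first element has no predecessor and is copied as-is.
    return [x if i == 0 else x - seq[i - 1] for i, x in enumerate(seq)]

def flag_heads(seq):
    # True wherever the value differs from the previous element.
    # Assumption: the first element always starts a new run.
    return [i == 0 or seq[i] != seq[i - 1] for i in range(len(seq))]
```

Comparing against the next element instead would mirror both functions and move the special case to the last element.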

Rearrangement

  • sort rearranges the sequence into sorted order, either according to a comparison operator or by key value using a radix approach
  • partial_sort rearranges the sequence so that the elements up to and including a given index are sorted, according to a comparison operator
  • nth_element places the nth element in its sorted position, with the elements that compare less before it and the elements that compare greater after it, according to a comparison operator
  • exchange rearranges the elements according to a different stride configuration which is equivalent to a tensor axis transposition
  • shuffle rotates the elements
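
The postconditions of these operations can be illustrated with a sequential Python sketch. All names are hypothetical and none of this reflects how a GPU implementation works internally: the radix sort assumes non-negative integer keys below 2**32, partial_sort and nth_element are written for clarity rather than efficiency, and exchange is shown as a blocked-to-striped conversion, which is one assumed example of a stride change.

```python
def radix_sort(keys, bits=8, key_bits=32):
    # Least-significant-digit radix sort: no comparisons between keys.
    # Assumption: keys are non-negative integers below 2**key_bits.
    out = list(keys)
    mask = (1 << bits) - 1
    for shift in range(0, key_bits, bits):
        buckets = [[] for _ in range(1 << bits)]
        for k in out:
            buckets[(k >> shift) & mask].append(k)
        out = [k for bucket in buckets for k in bucket]
    return out

def partial_sort(seq, middle):
    # Elements 0..middle (inclusive) end up sorted; the rest are in no particular order.
    head = sorted(seq)[: middle + 1]
    rest = list(seq)
    for x in head:
        rest.remove(x)
    return head + rest

def nth_element(seq, n):
    # Element n lands where a full sort would put it; neither side is sorted.
    pivot = sorted(seq)[n]
    less = [x for x in seq if x < pivot]
    equal = [x for x in seq if x == pivot]
    greater = [x for x in seq if x > pivot]
    return less + equal + greater

def blocked_to_striped(per_thread):
    # per_thread[t][i] is item i held by thread t (blocked: consecutive items per thread).
    # The result gives thread t the items t, t + T, t + 2T, ... (striped),
    # which is a transposition of the flat data viewed as an I x T matrix.
    threads = len(per_thread)
    items = len(per_thread[0])
    flat = [x for row in per_thread for x in row]
    return [[flat[i * threads + t] for i in range(items)] for t in range(threads)]

def rotate(seq, distance):
    # Element at position p receives the value from position (p + distance) % len(seq).
    if not seq:
        return []
    distance %= len(seq)
    return seq[distance:] + seq[:distance]
```

In this sketch `nth_element([5, 2, 9, 1, 7, 3], 3)` puts `5` at index 3, with smaller values before it and larger values after it, while leaving both sides unsorted.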

Partition/Merge

  • partition divides the sequence into two or more sequences according to a predicate while preserving some ordering properties
  • merge merges two ordered sequences into one while preserving the order
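
A short sequential sketch of a two-way partition and a merge, using hypothetical function names. The partition shown here keeps the original relative order within each output, which is one assumed reading of "preserving some ordering properties"; the merge relies on `heapq.merge` from the Python standard library.

```python
import heapq

def partition(seq, predicate):
    # Split into (elements satisfying the predicate, the remaining elements).
    # Assumption: relative order is preserved inside both outputs.
    selected = [x for x in seq if predicate(x)]
    rejected = [x for x in seq if not predicate(x)]
    return selected, rejected

def merge(left, right):
    # Both inputs must already be sorted; the output is sorted as well.
    return list(heapq.merge(left, right))
```

Partitioning into more than two sequences can be thought of as applying several predicates in turn, each one splitting what the previous one rejected.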

Data Movement

  • store writes the sequence to a contiguous memory region. Variants exist that use an optimized path or specify how the sequence is stored, to better fit the access patterns of the compute units (CUs).
  • load is the complementary operation of store: it reads the sequence from memory, with matching variants.
  • memcpy copies bytes between device sources and destinations

Other operations

  • run_length_encode generates a compact representation of a sequence
  • binary_search finds for each element the index of an element with the same value in another sequence (which has to be sorted)
  • config selects a kernel's grid/block dimensions to tune the operation to a GPU
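
The first two operations in this list can be sketched sequentially as follows. The function names are hypothetical, the encoded form is assumed to be a pair of lists (values and run lengths), and returning `None` for a value that is absent from the sorted sequence is an assumption made for this sketch only.

```python
from bisect import bisect_left
from itertools import groupby

def run_length_encode(seq):
    # Replace each run of equal consecutive elements by (value, run length).
    values, counts = [], []
    for value, run in groupby(seq):
        values.append(value)
        counts.append(sum(1 for _ in run))
    return values, counts

def binary_search(needles, haystack):
    # haystack must be sorted. For each needle, return the index of the first
    # equal element, or None when the value is absent (assumed convention).
    result = []
    for x in needles:
        i = bisect_left(haystack, x)
        found = i < len(haystack) and haystack[i] == x
        result.append(i if found else None)
    return result
```

Each lookup is independent of the others, which is what makes the per-element search straightforward to run in parallel.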