ARROW-11179: [Format] Make FB comments friendly to rust
### Problem

Currently, comments in FlatBuffers (FB) files are copied verbatim into the generated Rust and C++ source code. That is convenient, but the generated Rust code suffers from it. For example:

- an array element such as `abc[1]` or a link label such as `[smith2017knl]` triggers a `broken intra doc links` warning
- example code and figure blocks are flattened into one line, see [arrow 2.0.0 doc](https://docs.rs/arrow/2.0.0/arrow/ipc/gen/SparseTensor/struct.SparseTensorIndexCSF.html#method.indptrType)

These problems can cause failures or warnings when running `cargo test --doc` or `cargo doc`,
so the generated `.rs` files have to be fixed by hand.
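
The effect described above can be sketched with a small hand-written Rust type. This is only an illustration: the `Union` struct, its `new` constructor, the `type_id` method, and the sample values are hypothetical stand-ins, not the code that the FlatBuffers compiler generates. The unfriendly comment is kept as a plain `//` comment so that the sketch itself stays free of warnings.

```rust
/// Hypothetical stand-in for a generated accessor type (not Arrow's real API).
pub struct Union {
    type_ids: Vec<i32>,
}

impl Union {
    /// Builds the stand-in from a list of type ids.
    pub fn new(type_ids: Vec<i32>) -> Self {
        Union { type_ids }
    }

    // Unfriendly form, as copied from the schema before this change:
    //   /// for each child typeIds[offset] is the id used in the type vector
    // rustdoc reads `[offset]` as a link to an item named `offset`,
    // which does not exist, so it reports a broken intra-doc link.

    /// For each child `typeIds[offset]` is the id used in the type vector.
    ///
    /// The `text` tag keeps the line breaks and stops rustdoc from
    /// compiling the block as a Rust doctest (sample values are made up):
    ///
    /// ```text
    /// typeIds    = [5, 7]
    /// typeIds[1] = 7
    /// ```
    pub fn type_id(&self, offset: usize) -> Option<i32> {
        self.type_ids.get(offset).copied()
    }
}
```

Wrapping the bracketed expression in backticks and tagging the fence as `text` are the two techniques this PR applies to the schema comments.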

### This PR

This PR reformats some comments in three FB files to make them friendly to Rust.

Closes apache#9299 from mqy/fb-comments

Authored-by: mqy <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>
mqy authored and emkornfield committed Jan 30, 2021
1 parent f58f29d commit dfaa215
Showing 3 changed files with 31 additions and 29 deletions.
2 changes: 1 addition & 1 deletion format/Message.fbs
@@ -28,7 +28,7 @@ namespace org.apache.arrow.flatbuf;
/// Metadata about a field at some level of a nested type tree (but not
/// its children).
///
/// For example, a List<Int16> with values [[1, 2, 3], null, [4], [5, 6], null]
/// For example, a List<Int16> with values `[[1, 2, 3], null, [4], [5, 6], null]`
/// would have {length: 5, null_count: 2} for its List node, and {length: 6,
/// null_count: 0} for its Int16 node, as separate FieldNode structs
struct FieldNode {
5 changes: 3 additions & 2 deletions format/Schema.fbs
@@ -110,10 +110,11 @@ table FixedSizeList {
/// not enforced.
///
/// Map
/// ```text
/// - child[0] entries: Struct
/// - child[0] key: K
/// - child[1] value: V
///
/// ```
/// Neither the "entries" field nor the "key" field may be nullable.
///
/// The metadata is structured so that Arrow systems without special handling
@@ -129,7 +130,7 @@ enum UnionMode:short { Sparse, Dense }
/// A union is a complex type with children in Field
/// By default ids in the type vector refer to the offsets in the children
/// optionally typeIds provides an indirection between the child offset and the type id
/// for each child typeIds[offset] is the id used in the type vector
/// for each child `typeIds[offset]` is the id used in the type vector
table Union {
mode: UnionMode;
typeIds: [ int ]; // optional, describes typeid of each child.
53 changes: 27 additions & 26 deletions format/SparseTensor.fbs
@@ -37,21 +37,21 @@ namespace org.apache.arrow.flatbuf;
///
/// For example, let X be a 2x3x4x5 tensor, and it has the following
/// 6 non-zero values:
///
/// ```text
/// X[0, 1, 2, 0] := 1
/// X[1, 1, 2, 3] := 2
/// X[0, 2, 1, 0] := 3
/// X[0, 1, 3, 0] := 4
/// X[0, 1, 2, 1] := 5
/// X[1, 2, 0, 4] := 6
///
/// ```
/// In COO format, the index matrix of X is the following 4x6 matrix:
///
/// ```text
/// [[0, 0, 0, 0, 1, 1],
/// [1, 1, 1, 2, 1, 2],
/// [2, 2, 3, 1, 2, 0],
/// [0, 1, 0, 0, 3, 4]]
///
/// ```
/// When isCanonical is true, the indices is sorted in lexicographical order
/// (row-major order), and it does not have duplicated entries. Otherwise,
/// the indices may not be sorted, or may have duplicated entries.
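
As an illustration of the COO layout described in the comment above, the following Rust sketch sorts the six example entries lexicographically and transposes their coordinates into the 4x6 index matrix. The function name `coo_index_matrix` and the `(coordinate, value)` tuple layout are assumptions made for this example and are not part of Arrow's API. The sketch also assumes the input has no duplicated coordinates.

```rust
/// Builds a canonical COO index matrix from (coordinate, value) entries.
/// Returns the N x nnz index matrix plus the values in the same order.
/// Hypothetical helper for illustration only; assumes no duplicate coordinates.
pub fn coo_index_matrix<const N: usize>(
    entries: &[([i64; N], i64)],
) -> (Vec<Vec<i64>>, Vec<i64>) {
    // Arrays compare lexicographically, which is row-major order here.
    let mut sorted = entries.to_vec();
    sorted.sort_by(|a, b| a.0.cmp(&b.0));

    // Row `dim` of the index matrix holds the dim-th coordinate of every entry.
    let index: Vec<Vec<i64>> = (0..N)
        .map(|dim| {
            sorted
                .iter()
                .map(|(coord, _)| coord[dim])
                .collect::<Vec<i64>>()
        })
        .collect();

    // Values are permuted together with their coordinates.
    let values: Vec<i64> = sorted.iter().map(|&(_, v)| v).collect();
    (index, values)
}
```

Called with the six entries of X listed above, this yields the index matrix shown in the comment, with the values reordered to `[1, 5, 4, 3, 2, 6]`.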
@@ -86,26 +86,27 @@ table SparseMatrixIndexCSX {

/// indptrBuffer stores the location and size of indptr array that
/// represents the range of the rows.
/// The i-th row spans from indptr[i] to indptr[i+1] in the data.
/// The i-th row spans from `indptr[i]` to `indptr[i+1]` in the data.
/// The length of this array is 1 + (the number of rows), and the type
/// of index value is long.
///
/// For example, let X be the following 6x4 matrix:
///
/// ```text
/// X := [[0, 1, 2, 0],
/// [0, 0, 3, 0],
/// [0, 4, 0, 5],
/// [0, 0, 0, 0],
/// [6, 0, 7, 8],
/// [0, 9, 0, 0]].
///
/// ```
/// The array of non-zero values in X is:
///
/// ```text
/// values(X) = [1, 2, 3, 4, 5, 6, 7, 8, 9].
///
/// ```
/// And the indptr of X is:
///
/// ```text
/// indptr(X) = [0, 2, 3, 5, 5, 8, 10].
/// ```
indptrBuffer: Buffer (required);

/// The type of values in indicesBuffer
@@ -116,17 +117,17 @@ table SparseMatrixIndexCSX {
/// The type of index value is long.
///
/// For example, the indices of the above X is:
///
/// ```text
/// indices(X) = [1, 2, 2, 1, 3, 0, 2, 3, 1].
///
/// ```
/// Note that the indices are sorted in lexicographical order for each row.
indicesBuffer: Buffer (required);
}
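
The relationship between `values`, `indptr`, and `indices` in the comments above can be sketched by compressing the example 6x4 matrix row by row. The function `dense_to_csr` is a hypothetical helper written for this illustration, not an Arrow function, and it assumes the row-compressed (CSR) variant. Computed this way, the last `indptr` entry equals the number of stored values, which is 9 for this matrix, whereas the comment in the diff lists 10.

```rust
/// Compresses a dense row-major matrix into CSR form.
/// Returns (values, indptr, indices); index values use i64 ("long").
/// Hypothetical helper for illustration only.
pub fn dense_to_csr(rows: &[Vec<i64>]) -> (Vec<i64>, Vec<i64>, Vec<i64>) {
    let mut values = Vec::new();
    let mut indices = Vec::new();
    // indptr has one entry per row plus a leading zero.
    let mut indptr = vec![0i64];

    for row in rows {
        for (col, &v) in row.iter().enumerate() {
            if v != 0 {
                values.push(v);
                indices.push(col as i64);
            }
        }
        // Row i spans values[indptr[i]..indptr[i + 1]].
        indptr.push(values.len() as i64);
    }
    (values, indptr, indices)
}
```

An empty row, such as the fourth row of X, shows up as two equal consecutive `indptr` entries.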

/// Compressed Sparse Fiber (CSF) sparse tensor index.
table SparseTensorIndexCSF {
/// CSF is a generalization of compressed sparse row (CSR) index.
/// See [smith2017knl]: http://shaden.io/pub-files/smith2017knl.pdf
/// See [smith2017knl](http://shaden.io/pub-files/smith2017knl.pdf)
///
/// CSF index recursively compresses each dimension of a tensor into a set
/// of prefix trees. Each path from a root to leaf forms one tensor
@@ -135,7 +136,7 @@ table SparseTensorIndexCSF {
///
/// For example, let X be a 2x3x4x5 tensor and let it have the following
/// 8 non-zero values:
///
/// ```text
/// X[0, 0, 0, 1] := 1
/// X[0, 0, 0, 2] := 2
/// X[0, 1, 0, 0] := 3
@@ -144,34 +145,34 @@ table SparseTensorIndexCSF {
/// X[1, 1, 1, 0] := 6
/// X[1, 1, 1, 1] := 7
/// X[1, 1, 1, 2] := 8
///
/// ```
/// As a prefix tree this would be represented as:
///
/// ```text
/// 0 1
/// / \ |
/// 0 1 1
/// / / \ |
/// 0 0 1 1
/// /| /| | /| |
/// 1 2 0 2 0 0 1 2

/// ```
/// The type of values in indptrBuffers
indptrType: Int (required);

/// indptrBuffers stores the sparsity structure.
/// Each two consecutive dimensions in a tensor correspond to a buffer in
/// indptrBuffers. A pair of consecutive values at indptrBuffers[dim][i]
/// and indptrBuffers[dim][i + 1] signify a range of nodes in
/// indicesBuffers[dim + 1] who are children of indicesBuffers[dim][i] node.
/// indptrBuffers. A pair of consecutive values at `indptrBuffers[dim][i]`
/// and `indptrBuffers[dim][i + 1]` signify a range of nodes in
/// `indicesBuffers[dim + 1]` who are children of `indicesBuffers[dim][i]` node.
///
/// For example, the indptrBuffers for the above X is:
///
/// ```text
/// indptrBuffer(X) = [
/// [0, 2, 3],
/// [0, 1, 3, 4],
/// [0, 2, 4, 5, 8]
/// ].
///
/// ```
indptrBuffers: [Buffer] (required);

/// The type of values in indicesBuffers
@@ -180,22 +181,22 @@ table SparseTensorIndexCSF {
/// indicesBuffers stores values of nodes.
/// Each tensor dimension corresponds to a buffer in indicesBuffers.
/// For example, the indicesBuffers for the above X is:
///
/// ```text
/// indicesBuffer(X) = [
/// [0, 1],
/// [0, 1, 1],
/// [0, 0, 1, 1],
/// [1, 2, 0, 2, 0, 0, 1, 2]
/// ].
///
/// ```
indicesBuffers: [Buffer] (required);

/// axisOrder stores the sequence in which dimensions were traversed to
/// produce the prefix tree.
/// For example, the axisOrder for the above X is:
///
/// ```text
/// axisOrder(X) = [0, 1, 2, 3].
///
/// ```
axisOrder: [int] (required);
}
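
To make the CSF description above more concrete, the following Rust sketch walks the prefix tree encoded by the `indptrBuffers` and `indicesBuffers` examples and recovers one coordinate per root-to-leaf path. The names `csf_coordinates` and `walk` are hypothetical, and the sketch assumes an identity `axisOrder`, as in the example, so no dimension reordering is applied.

```rust
/// Expands CSF buffers into the list of non-zero coordinates.
/// `indptr[dim][i]..indptr[dim][i + 1]` is the range of children, in
/// `indices[dim + 1]`, of node `i` at level `dim`.
/// Hypothetical helper for illustration only; assumes identity axis order.
pub fn csf_coordinates(indptr: &[Vec<usize>], indices: &[Vec<i64>]) -> Vec<Vec<i64>> {
    let mut out = Vec::new();
    let mut prefix = Vec::new();
    if let Some(roots) = indices.first() {
        for node in 0..roots.len() {
            walk(indptr, indices, 0, node, &mut prefix, &mut out);
        }
    }
    out
}

/// Depth-first walk: each root-to-leaf path is one tensor coordinate.
fn walk(
    indptr: &[Vec<usize>],
    indices: &[Vec<i64>],
    dim: usize,
    node: usize,
    prefix: &mut Vec<i64>,
    out: &mut Vec<Vec<i64>>,
) {
    prefix.push(indices[dim][node]);
    if dim + 1 == indices.len() {
        // Leaf level: the prefix is a complete coordinate.
        out.push(prefix.clone());
    } else {
        for child in indptr[dim][node]..indptr[dim][node + 1] {
            walk(indptr, indices, dim + 1, child, prefix, out);
        }
    }
    prefix.pop();
}
```

For the buffers shown in the comments, the walk produces eight coordinates, starting with `[0, 0, 0, 1]` and ending with `[1, 1, 1, 2]`, which matches the leaves of the prefix tree.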
