apacheGH-38398: [MATLAB] Improve array display (apache#38400)
### Rationale for this change

Currently, the display for `arrow.array.Array`s is not very MATLAB-like: 

```matlab

>> a = arrow.array([1 2 3 4])

a = 

[
  1,
  2,
  3,
  4
]
```

At the very least, the display should include the class header and indent each line by 4 spaces. Here's one display option:

```matlab

>> a = arrow.array([1 2 3 4])

a = 

  Float64Array with 4 elements and 0 null values:

    1 | 2 | 3 | 4

```

### What changes are included in this PR?

1. Array display now includes a class header, which states the array type and the number of elements/nulls in the array.  
2. Reduced the [`window`](https://github.com/apache/arrow/blob/37935604bf168a3b2d52f3cc5b0edf83b5783309/cpp/src/arrow/pretty_print.h#L79C1-L80C1) size from 10 to 3.
3. Primitive and string arrays are displayed horizontally, i.e. [`skip_new_lines`](https://github.com/apache/arrow/blob/37935604bf168a3b2d52f3cc5b0edf83b5783309/cpp/src/arrow/pretty_print.h#L90) is set to `true`. Elements are separated by ` | `, with no opening/closing brackets.
4. All other array types (`struct`, `list`, etc.) are displayed vertically with an `indent` of 4.
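
To make items 2 and 3 concrete, here is a minimal C++ sketch of the windowed, horizontal layout using only the standard library. `RenderHorizontal` is a hypothetical helper written for illustration, not Arrow's `PrettyPrint` implementation. It assumes that the middle of the array collapses to a single `...` once the array has more than `2 * window` elements, which is consistent with the `1 | 2 | 3 | ... | 6 | 7 | 8` example in the code comment of this change.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Arrow): joins already-formatted elements
// with " | " and keeps only the first and last `window` entries when the
// input has more than 2 * window elements.
std::string RenderHorizontal(const std::vector<std::string>& elements,
                             std::size_t window = 3) {
  const std::string delimiter = " | ";
  std::string out = "    ";  // 4-space indent, as in the example displays
  const std::size_t n = elements.size();
  for (std::size_t i = 0; i < n; ++i) {
    if (i >= window && i + window < n) {
      // Collapse the whole middle section into a single ellipsis.
      out += "...";
      if (window > 0) out += delimiter;  // tail elements follow the ellipsis
      i = n - window - 1;  // the loop increment moves to the first tail element
      continue;
    }
    out += elements[i];
    if (i + 1 < n) out += delimiter;
  }
  return out;
}
```

With the default `window` of 3, a four-element input such as `{"1", "2", "3", "4"}` is rendered in full as `    1 | 2 | 3 | 4`, while an eight-element input is shortened to `    1 | 2 | 3 | ... | 6 | 7 | 8`.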

**Example String Array Display:**

```matlab
>> a = arrow.array(["Hello", missing, "Bye"])

a = 

  StringArray with 3 elements and 1 null value:

    "Hello" | null | "Bye"
```
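
The header above reads "1 null value" rather than "1 null values" because the count words are pluralized only when the count differs from 1. The following C++ sketch mirrors the logic of the MATLAB helpers `getHeader` and `pluralizeStringIfNeeded` added in this commit. The names `PluralizeIfNeeded` and `MakeHeader` are hypothetical, the desktop-only `<strong>` markup is left out, and the two-space indent is an assumption taken from the example displays.

```cpp
#include <cstdint>
#include <string>

// Hypothetical port of pluralizeStringIfNeeded.m:
// append "s" unless the count is exactly 1.
std::string PluralizeIfNeeded(std::int64_t count, std::string word) {
  if (count != 1) word += "s";
  return word;
}

// Hypothetical port of getHeader.m without the <strong> markup.
std::string MakeHeader(const std::string& class_name,
                       std::int64_t num_elements,
                       std::int64_t num_nulls) {
  std::string header = "  " + class_name + " with " +
                       std::to_string(num_elements) + " " +
                       PluralizeIfNeeded(num_elements, "element") + " and " +
                       std::to_string(num_nulls) + " " +
                       PluralizeIfNeeded(num_nulls, "null value");
  // The colon is added only when elements will follow the header.
  if (num_elements > 0) header += ":";
  header += "\n";
  return header;
}
```

For instance, `MakeHeader("StringArray", 3, 1)` produces `  StringArray with 3 elements and 1 null value:` followed by a newline, and an empty array gets a header without the trailing colon.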

**Example Struct Array Display:**

```matlab

>> a1 = arrow.array(["Hello", missing, "Bye"]);
>> a2 = arrow.array([1 2 3]);
>> structArray = arrow.array.StructArray.fromArrays(a1, a2)

structArray = 

  StructArray with 3 elements and 0 null values:

        -- is_valid: all not null
    -- child 0 type: string
        [
            "Hello",
            null,
            "Bye"
        ]
    -- child 1 type: double
        [
            1,
            2,
            3
        ]
```

### Are these changes tested?

Yes. Added a new test class called `tArrayDisplay.m` with unit tests for array display.

### Are there any user-facing changes?

Yes. Users will see a different display for arrays now.

### Future Directions

1. apache#38166
* Closes: apache#38398

Lead-authored-by: Sarah Gilmore <[email protected]>
Co-authored-by: sgilmore10 <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Kevin Gurney <[email protected]>
sgilmore10 and kou authored Oct 24, 2023
1 parent 006f387 commit e14f60b
Showing 6 changed files with 439 additions and 3 deletions.
30 changes: 29 additions & 1 deletion matlab/src/cpp/arrow/matlab/array/proxy/array.cc
@@ -21,10 +21,13 @@
#include "arrow/matlab/bit/unpack.h"
#include "arrow/matlab/error/error.h"
#include "arrow/matlab/type/proxy/wrap.h"
#include "arrow/pretty_print.h"
#include "arrow/type_traits.h"

#include "libmexclass/proxy/ProxyManager.h"

#include <sstream>

namespace arrow::matlab::array::proxy {

    Array::Array(std::shared_ptr<arrow::Array> array) : array{std::move(array)} {
@@ -44,7 +47,32 @@ namespace arrow::matlab::array::proxy {

    void Array::toString(libmexclass::proxy::method::Context& context) {
        ::matlab::data::ArrayFactory factory;
-       const auto str_utf8 = array->ToString();

        auto opts = arrow::PrettyPrintOptions::Defaults();
        opts.window = 3;
        opts.indent = 4;
        opts.indent_size = 4;

        const auto type_id = array->type()->id();
        if (arrow::is_primitive(type_id) || arrow::is_string(type_id)) {
            /*
             * Display primitive and string types horizontally without
             * opening and closing delimiters. Use " | " as the delimiter
             * between elements. Below is an example Int32Array display:
             *
             *    1 | 2 | 3 | ... | 6 | 7 | 8
             */
            opts.skip_new_lines = true;
            opts.array_delimiters.open = "";
            opts.array_delimiters.close = "";
            opts.array_delimiters.element = " | ";
        }

        std::stringstream ss;
        MATLAB_ERROR_IF_NOT_OK_WITH_CONTEXT(arrow::PrettyPrint(*array, opts, &ss), context, error::ARRAY_PRETTY_PRINT_FAILED);

        const auto str_utf8 = opts.skip_new_lines ? "    " + ss.str() : ss.str();

        MATLAB_ASSIGN_OR_ERROR_WITH_CONTEXT(const auto str_utf16, arrow::util::UTF8StringToUTF16(str_utf8), context, error::UNICODE_CONVERSION_ERROR_ID);
        auto str_mda = factory.createScalar(str_utf16);
        context.outputs[0] = str_mda;
1 change: 1 addition & 0 deletions matlab/src/cpp/arrow/matlab/error/error.h
@@ -201,4 +201,5 @@ namespace arrow::matlab::error {
    static const char* INDEX_EMPTY_CONTAINER = "arrow:index:EmptyContainer";
    static const char* INDEX_OUT_OF_RANGE = "arrow:index:OutOfRange";
    static const char* BUFFER_VIEW_OR_COPY_FAILED = "arrow:buffer:ViewOrCopyFailed";
    static const char* ARRAY_PRETTY_PRINT_FAILED = "arrow:array:PrettyPrintFailed";
}
38 changes: 38 additions & 0 deletions matlab/src/matlab/+arrow/+array/+internal/+display/getHeader.m
@@ -0,0 +1,38 @@
%GETHEADER Generates the display header for arrow.array.Array classes

% Licensed to the Apache Software Foundation (ASF) under one or more
% contributor license agreements. See the NOTICE file distributed with
% this work for additional information regarding copyright ownership.
% The ASF licenses this file to you under the Apache License, Version
% 2.0 (the "License"); you may not use this file except in compliance
% with the License. You may obtain a copy of the License at
%
% http://www.apache.org/licenses/LICENSE-2.0
%
% Unless required by applicable law or agreed to in writing, software
% distributed under the License is distributed on an "AS IS" BASIS,
% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
% implied. See the License for the specific language governing
% permissions and limitations under the License.

function header = getHeader(className, numElements, numNulls)
    import arrow.array.internal.display.pluralizeStringIfNeeded
    elementString = pluralizeStringIfNeeded(numElements, "element");

    nullString = pluralizeStringIfNeeded(numNulls, "null value");

    numString = "%d";
    if usejava("desktop")
        % Bold the number of elements and nulls if the desktop is enabled
        numString = compose("<strong>%s</strong>", numString);
    end

    formatSpec = "  %s with " + numString + " %s and " + numString + " %s";
    if numElements > 0
        formatSpec = formatSpec + ":";
    end
    formatSpec = formatSpec + newline;

    header = compose(formatSpec, className, numElements, elementString, numNulls, nullString);
    header = char(header);
end
@@ -0,0 +1,23 @@
%PLURALIZESTRINGIFNEEDED Pluralizes str if num is not equal to 1.

% Licensed to the Apache Software Foundation (ASF) under one or more
% contributor license agreements. See the NOTICE file distributed with
% this work for additional information regarding copyright ownership.
% The ASF licenses this file to you under the Apache License, Version
% 2.0 (the "License"); you may not use this file except in compliance
% with the License. You may obtain a copy of the License at
%
% http://www.apache.org/licenses/LICENSE-2.0
%
% Unless required by applicable law or agreed to in writing, software
% distributed under the License is distributed on an "AS IS" BASIS,
% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
% implied. See the License for the specific language governing
% permissions and limitations under the License.

function str = pluralizeStringIfNeeded(num, str)
    if num ~= 1
        str = str + "s";
    end
end

16 changes: 14 additions & 2 deletions matlab/src/matlab/+arrow/+array/Array.m
@@ -62,8 +62,21 @@
    end

    methods (Access=protected)
        function header = getHeader(obj)
            name = matlab.mixin.CustomDisplay.getClassNameForHeader(obj);
            numElements = obj.NumElements;
            % TODO: Add NumValid and NumNull as properties to Array to
            % avoid materializing the Valid property. This will improve
            % performance for large arrays.
            numNulls = nnz(~obj.Valid);
            header = arrow.array.internal.display.getHeader(name, numElements, numNulls);
        end

        function displayScalarObject(obj)
-           disp(obj.toString());
            disp(getHeader(obj));
            if obj.NumElements > 0
                disp(toString(obj) + newline);
            end
        end
    end

@@ -86,4 +99,3 @@ function displayScalarObject(obj)
        end
    end
end
