Merge pull request numpy#5482 from ahaldane/recordarray_doc
DOC: improve record/structured array nomenclature & guide
charris committed Jan 22, 2015
2 parents 723a1ee + 1bd0b4e commit 5b714c7
Showing 16 changed files with 179 additions and 92 deletions.
2 changes: 1 addition & 1 deletion doc/source/reference/arrays.classes.rst
Expand Up @@ -337,7 +337,7 @@ Record arrays (:mod:`numpy.rec`)
:ref:`arrays.dtypes`.

Numpy provides the :class:`recarray` class which allows accessing the
fields of a record/structured array as attributes, and a corresponding
fields of a structured array as attributes, and a corresponding
scalar data type object :class:`record`.

.. currentmodule:: numpy
Expand Down
30 changes: 15 additions & 15 deletions doc/source/reference/arrays.dtypes.rst
Expand Up @@ -14,12 +14,12 @@ following aspects of the data:
1. Type of the data (integer, float, Python object, etc.)
2. Size of the data (how many bytes is in *e.g.* the integer)
3. Byte order of the data (:term:`little-endian` or :term:`big-endian`)
4. If the data type is a :term:`record`, an aggregate of other
4. If the data type is :term:`structured`, an aggregate of other
data types, (*e.g.*, describing an array item consisting of
an integer and a float),

1. what are the names of the ":term:`fields <field>`" of the record,
by which they can be :ref:`accessed <arrays.indexing.rec>`,
1. what are the names of the ":term:`fields <field>`" of the structure,
by which they can be :ref:`accessed <arrays.indexing.fields>`,
2. what is the data-type of each :term:`field`, and
3. which part of the memory block each field takes.

Expand All @@ -40,15 +40,14 @@ needed in Numpy.

.. index::
pair: dtype; field
pair: dtype; record

Struct data types are formed by creating a data type whose
Structured data types are formed by creating a data type whose
:term:`fields` contain other data types. Each field has a name by
which it can be :ref:`accessed <arrays.indexing.rec>`. The parent data
which it can be :ref:`accessed <arrays.indexing.fields>`. The parent data
type should be of sufficient size to contain all its fields; the
parent is nearly always based on the :class:`void` type which allows
an arbitrary item size. Struct data types may also contain nested struct
sub-array data types in their fields.
an arbitrary item size. Structured data types may also contain nested
structured sub-array data types in their fields.

.. index::
pair: dtype; sub-array
Expand All @@ -60,7 +59,7 @@ fixed size.
If an array is created using a data-type describing a sub-array,
the dimensions of the sub-array are appended to the shape
of the array when the array is created. Sub-arrays in a field of a
record behave differently, see :ref:`arrays.indexing.rec`.
structured type behave differently, see :ref:`arrays.indexing.fields`.

Sub-arrays always have a C-contiguous memory layout.
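The difference between a bare sub-array data type and a sub-array inside a field can be sketched as follows. This is an illustration only and is not part of the commit; the variable names are hypothetical and NumPy is assumed to be importable as ``np``.

```python
import numpy as np

# A bare sub-array dtype: its dimensions are appended to the array shape,
# and the resulting array has the plain base dtype.
sub = np.dtype((np.int32, (2, 2)))
a = np.zeros(3, dtype=sub)

# A sub-array inside a field of a structured type: the array keeps its
# own shape, and the extra dimension shows up only on field access.
rec = np.dtype([('name', 'U16'), ('grades', np.float64, (2,))])
b = np.zeros(3, dtype=rec)
grades = b['grades']

print(a.shape, a.dtype)
print(b.shape, grades.shape)
```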

Expand All @@ -83,7 +82,7 @@ Sub-arrays always have a C-contiguous memory layout.

.. admonition:: Example

A record data type containing a 16-character string (in field 'name')
A structured data type containing a 16-character string (in field 'name')
and a sub-array of two 64-bit floating-point number (in field 'grades'):

>>> dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])
Expand Down Expand Up @@ -246,8 +245,8 @@ Array-protocol type strings (see :ref:`arrays.interface`)

String with comma-separated fields

Numarray introduced a short-hand notation for specifying the format
of a record as a comma-separated string of basic formats.
A short-hand notation for specifying the format of a structured data type is
a comma-separated string of basic formats.

A basic format in this context is an optional shape specifier
followed by an array-protocol type string. Parenthesis are required
Expand Down Expand Up @@ -315,7 +314,7 @@ Type strings

>>> dt = np.dtype((np.int32, (2,2))) # 2 x 2 integer sub-array
>>> dt = np.dtype(('S10', 1)) # 10-character string
>>> dt = np.dtype(('i4, (2,3)f8, f4', (2,3))) # 2 x 3 record sub-array
>>> dt = np.dtype(('i4, (2,3)f8, f4', (2,3))) # 2 x 3 structured sub-array
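A short sketch of what the last constructor above produces, assuming NumPy's default field naming (``f0``, ``f1``, ...) for comma-separated format strings. It is illustrative only and not part of the commit.

```python
import numpy as np

# A 2 x 3 sub-array whose items are themselves structured.
dt = np.dtype(('i4, (2,3)f8, f4', (2, 3)))

base = dt.base              # the structured item type
field_names = base.names    # automatically generated field names
inner_shape = base['f1'].shape

print(dt.shape, field_names, inner_shape)
print(base.itemsize, dt.itemsize)
```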

.. index::
triple: dtype; construction; from list
Expand Down Expand Up @@ -432,7 +431,8 @@ Type strings
Both arguments must be convertible to data-type objects in this
case. The *base_dtype* is the data-type object that the new
data-type builds on. This is how you could assign named fields to
any built-in data-type object.
any built-in data-type object, as done in
:ref:`record arrays <arrays.classes.rec>`.

.. admonition:: Example

Expand Down Expand Up @@ -486,7 +486,7 @@ Endianness of this data:

dtype.byteorder

Information about sub-data-types in a :term:`record`:
Information about sub-data-types in a :term:`structured` data type:

.. autosummary::
:toctree: generated/
Expand Down
18 changes: 9 additions & 9 deletions doc/source/reference/arrays.indexing.rst
Expand Up @@ -11,7 +11,7 @@ Indexing

:class:`ndarrays <ndarray>` can be indexed using the standard Python
``x[obj]`` syntax, where *x* is the array and *obj* the selection.
There are three kinds of indexing available: record access, basic
There are three kinds of indexing available: field access, basic
slicing, advanced indexing. Which one occurs depends on *obj*.

.. note::
Expand Down Expand Up @@ -489,25 +489,25 @@ indexing (in no particular order):
view on the data. This *must* be done if the subclasses ``__getitem__`` does
not return views.

.. _arrays.indexing.rec:
.. _arrays.indexing.fields:


Record Access
Field Access
-------------

.. seealso:: :ref:`arrays.dtypes`, :ref:`arrays.scalars`

If the :class:`ndarray` object is a record array, *i.e.* its data type
is a :term:`record` data type, the :term:`fields <field>` of the array
can be accessed by indexing the array with strings, dictionary-like.
If the :class:`ndarray` object is a structured array the :term:`fields <field>`
of the array can be accessed by indexing the array with strings,
dictionary-like.

Indexing ``x['field-name']`` returns a new :term:`view` to the array,
which is of the same shape as *x* (except when the field is a
sub-array) but of data type ``x.dtype['field-name']`` and contains
only the part of the data in the specified field. Also record array
scalars can be "indexed" this way.
only the part of the data in the specified field. Also
:ref:`record array <arrays.classes.rec>` scalars can be "indexed" this way.

Indexing into a record array can also be done with a list of field names,
Indexing into a structured array can also be done with a list of field names,
*e.g.* ``x[['field-name1','field-name2']]``. Currently this returns a new
array containing a copy of the values in the fields specified in the list.
As of NumPy 1.7, returning a copy is being deprecated in favor of returning
Expand Down
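A minimal sketch of the field-access behaviour described in this hunk, with hypothetical field names ``a`` and ``b``. Whether indexing with a list of names returns a copy or a view depends on the NumPy version, so the sketch takes an explicit copy before the result would be modified.

```python
import numpy as np

x = np.zeros(3, dtype=[('a', np.int32), ('b', np.float64)])

# Indexing with a single field name returns a view of the same shape.
col = x['a']
col[0] = 7                  # writes through to x

# Indexing with a list of names; copy explicitly if the result is
# going to be written to, since copy/view semantics vary by version.
pair = x[['a', 'b']]
pair_copy = pair.copy()

print(x[0]['a'], col.dtype, pair.dtype.names)
```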
11 changes: 6 additions & 5 deletions doc/source/reference/arrays.interface.rst
Expand Up @@ -103,19 +103,19 @@ This approach to the interface consists of the object having an
not a requirement. The only requirement is that the number of
bytes represented in the *typestr* key is the same as the total
number of bytes represented here. The idea is to support
descriptions of C-like structs (records) that make up array
descriptions of C-like structs that make up array
elements. The elements of each tuple in the list are

1. A string providing a name associated with this portion of
the record. This could also be a tuple of ``('full name',
the datatype. This could also be a tuple of ``('full name',
'basic_name')`` where basic name would be a valid Python
variable name representing the full name of the field.

2. Either a basic-type description string as in *typestr* or
another list (for nested records)
another list (for nested structured types)

3. An optional shape tuple providing how many times this part
of the record should be repeated. No repeats are assumed
of the structure should be repeated. No repeats are assumed
if this is not given. Very complicated structures can be
described using this generic interface. Notice, however,
that each element of the array is still of the same
Expand Down Expand Up @@ -301,7 +301,8 @@ more information which may be important for various applications::
typestr == '|V16'
descr == [('ival','>i4'),('','|V4'),('dval','>f8')]

It should be clear that any record type could be described using this interface.
It should be clear that any structured type could be described using this
interface.
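The ``typestr``/``descr`` pair shown above can be reproduced from Python as a rough check. The sketch below is not part of the commit; it assumes a structured type built with explicit offsets so that four unused bytes sit between the two fields, and the exact form of the padding entry in ``descr`` may differ between NumPy versions.

```python
import numpy as np

# Big-endian int32 at offset 0 and big-endian float64 at offset 8,
# leaving a 4-byte gap between the two fields.
dt = np.dtype({'names': ['ival', 'dval'],
               'formats': ['>i4', '>f8'],
               'offsets': [0, 8]})
arr = np.zeros(2, dtype=dt)

iface = arr.__array_interface__
typestr = iface['typestr']   # opaque 16-byte item
descr = iface['descr']       # per-field description, gap included

print(typestr)
print(descr)
```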

Differences with Array interface (Version 2)
============================================
Expand Down
2 changes: 1 addition & 1 deletion doc/source/reference/arrays.ndarray.rst
Expand Up @@ -82,7 +82,7 @@ Indexing arrays

Arrays can be indexed using an extended Python slicing syntax,
``array[selection]``. Similar syntax is also used for accessing
fields in a :ref:`record array <arrays.dtypes>`.
fields in a :ref:`structured array <arrays.dtypes.field>`.

.. seealso:: :ref:`Array Indexing <arrays.indexing>`.

Expand Down
16 changes: 8 additions & 8 deletions doc/source/reference/arrays.scalars.rst
Expand Up @@ -250,7 +250,7 @@ array scalar,

- ``x[()]`` returns a 0-dimensional :class:`ndarray`
- ``x['field-name']`` returns the array scalar in the field *field-name*.
(*x* can have fields, for example, when it corresponds to a record data type.)
(*x* can have fields, for example, when it corresponds to a structured data type.)

Methods
=======
Expand Down Expand Up @@ -282,10 +282,10 @@ Defining new types
==================

There are two ways to effectively define a new array scalar type
(apart from composing record :ref:`dtypes <arrays.dtypes>` from the built-in
scalar types): One way is to simply subclass the :class:`ndarray` and
overwrite the methods of interest. This will work to a degree, but
internally certain behaviors are fixed by the data type of the array.
To fully customize the data type of an array you need to define a new
data-type, and register it with NumPy. Such new types can only be
defined in C, using the :ref:`Numpy C-API <c-api>`.
(apart from composing structured types :ref:`dtypes <arrays.dtypes>` from
the built-in scalar types): One way is to simply subclass the
:class:`ndarray` and overwrite the methods of interest. This will work to
a degree, but internally certain behaviors are fixed by the data type of
the array. To fully customize the data type of an array you need to
define a new data-type, and register it with NumPy. Such new types can only
be defined in C, using the :ref:`Numpy C-API <c-api>`.
14 changes: 7 additions & 7 deletions doc/source/reference/c-api.array.rst
Expand Up @@ -1240,9 +1240,9 @@ Special functions for NPY_OBJECT

A function to INCREF all the objects at the location *ptr*
according to the data-type *dtype*. If *ptr* is the start of a
record with an object at any offset, then this will (recursively)
structured type with an object at any offset, then this will (recursively)
increment the reference count of all object-like items in the
record.
structured type.

.. cfunction:: int PyArray_XDECREF(PyArrayObject* op)

Expand All @@ -1253,7 +1253,7 @@ Special functions for NPY_OBJECT

.. cfunction:: void PyArray_Item_XDECREF(char* ptr, PyArray_Descr* dtype)

A function to XDECREF all the object-like items at the loacation
A function to XDECREF all the object-like items at the location
*ptr* as recorded in the data-type, *dtype*. This works
recursively so that if ``dtype`` itself has fields with data-types
that contain object-like items, all the object-like fields will be
Expand Down Expand Up @@ -1540,7 +1540,7 @@ Conversion
itemsize of the new array type must be less than *self*
->descr->elsize or an error is raised. The same shape and strides
as the original array are used. Therefore, this function has the
effect of returning a field from a record array. But, it can also
effect of returning a field from a structured array. But, it can also
be used to select specific bytes or groups of bytes from any array
type.

Expand Down Expand Up @@ -1786,7 +1786,7 @@ Item selection and manipulation
->descr is a data-type with fields defined, then
self->descr->names is used to determine the sort order. A
comparison where the first field is equal will use the second
field and so on. To alter the sort order of a record array, create
field and so on. To alter the sort order of a structured array, create
a new data-type with a different order of names and construct a
view of the array with that new data-type.

Expand All @@ -1805,7 +1805,7 @@ Item selection and manipulation
to understand the order the *sort_keys* must be in (reversed from
the order you would use when comparing two elements).

If these arrays are all collected in a record array, then
If these arrays are all collected in a structured array, then
:cfunc:`PyArray_Sort` (...) can also be used to sort the array
directly.

Expand Down Expand Up @@ -1838,7 +1838,7 @@ Item selection and manipulation
If *self*->descr is a data-type with fields defined, then
self->descr->names is used to determine the sort order. A comparison where
the first field is equal will use the second field and so on. To alter the
sort order of a record array, create a new data-type with a different
sort order of a structured array, create a new data-type with a different
order of names and construct a view of the array with that new data-type.
Returns zero on success and -1 on failure.
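For comparison, the field-by-field sort order can be seen from Python. This sketch uses the ``order`` argument of ``np.sort`` rather than the C function documented here, and the field names ``x`` and ``y`` are hypothetical.

```python
import numpy as np

a = np.array([(2, 1.0), (1, 3.0), (1, 2.0)],
             dtype=[('x', int), ('y', float)])

# Default: compare on the first field, break ties with the second.
by_default = np.sort(a)

# Give 'y' priority over 'x' without building a new data type.
by_y = np.sort(a, order=['y', 'x'])

print(by_default)
print(by_y)
```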

Expand Down
6 changes: 3 additions & 3 deletions doc/source/reference/internals.code-explanations.rst
Expand Up @@ -74,9 +74,9 @@ optimizations that by-pass this mechanism, but the point of the data-
type abstraction is to allow new data-types to be added.

One of the built-in data-types, the void data-type allows for
arbitrary records containing 1 or more fields as elements of the
arbitrary structured types containing 1 or more fields as elements of the
array. A field is simply another data-type object along with an offset
into the current record. In order to support arbitrarily nested
into the current structured type. In order to support arbitrarily nested
fields, several recursive implementations of data-type access are
implemented for the void type. A common idiom is to cycle through the
elements of the dictionary and perform a specific operation based on
Expand Down Expand Up @@ -184,7 +184,7 @@ The array scalars also offer the same methods and attributes as arrays
with the intent that the same code can be used to support arbitrary
dimensions (including 0-dimensions). The array scalars are read-only
(immutable) with the exception of the void scalar which can also be
written to so that record-array field setting works more naturally
written to so that structured array field setting works more naturally
(a[0]['f1'] = ``value`` ).
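A small sketch of the writable void scalar described above, assuming a structured array with hypothetical fields ``f1`` and ``f2``. It is illustrative only and not part of the commit.

```python
import numpy as np

a = np.zeros(2, dtype=[('f1', np.int64), ('f2', np.float64)])

# a[0] is a void scalar that still refers to the array's buffer,
# so assigning to one of its fields updates the array.
a[0]['f1'] = 5

row = a[0]
row['f2'] = 1.5

print(a)
```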


Expand Down
6 changes: 3 additions & 3 deletions doc/source/user/basics.rec.rst
@@ -1,7 +1,7 @@
.. _structured_arrays:

***************************************
Structured arrays (aka "Record arrays")
***************************************
*****************
Structured arrays
*****************

.. automodule:: numpy.doc.structured_arrays
9 changes: 5 additions & 4 deletions numpy/add_newdocs.py
Expand Up @@ -4629,7 +4629,7 @@ def luf(lamdaexpr, *args, **kwargs):
>>> print x
[(1, 20) (3, 4)]
Using a view to convert an array to a record array:
Using a view to convert an array to a recarray:
>>> z = x.view(np.recarray)
>>> z.a
Expand Down Expand Up @@ -5875,17 +5875,18 @@ def luf(lamdaexpr, *args, **kwargs):
>>> np.dtype(np.int16)
dtype('int16')
Record, one field name 'f1', containing int16:
Structured type, one field name 'f1', containing int16:
>>> np.dtype([('f1', np.int16)])
dtype([('f1', '<i2')])
Record, one field named 'f1', in itself containing a record with one field:
Structured type, one field named 'f1', in itself containing a structured
type with one field:
>>> np.dtype([('f1', [('f1', np.int16)])])
dtype([('f1', [('f1', '<i2')])])
Record, two fields: the first field contains an unsigned int, the
Structured type, two fields: the first field contains an unsigned int, the
second an int32:
>>> np.dtype([('f1', np.uint), ('f2', np.int32)])
Expand Down
8 changes: 4 additions & 4 deletions numpy/core/records.py
Expand Up @@ -3,9 +3,9 @@
=============
Record arrays expose the fields of structured arrays as properties.
Most commonly, ndarrays contain elements of a single type, e.g. floats, integers,
bools etc. However, it is possible for elements to be combinations of these,
such as::
Most commonly, ndarrays contain elements of a single type, e.g. floats,
integers, bools etc. However, it is possible for elements to be combinations
of these using structured types, such as::
>>> a = np.array([(1, 2.0), (1, 2.0)], dtype=[('x', int), ('y', float)])
>>> a
Expand All @@ -25,7 +25,7 @@
Record arrays allow us to access fields as properties::
>>> ar = a.view(np.recarray)
>>> ar = np.rec.array(a)
>>> ar.x
array([1, 1])
Expand Down
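The distinction this docstring draws between a structured array and a record array can be sketched as follows, reusing the array from the docstring. This is an illustration and not part of the commit.

```python
import numpy as np

a = np.array([(1, 2.0), (1, 2.0)], dtype=[('x', int), ('y', float)])

# A structured array is indexed by field name.
xs_by_key = a['x']

# A record array additionally exposes the fields as attributes.
ar = np.rec.array(a)
xs = ar.x
first = ar[0]

print(xs, ar.y, first.x)
```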
2 changes: 1 addition & 1 deletion numpy/core/src/multiarray/arrayobject.c
Expand Up @@ -732,7 +732,7 @@ array_might_be_written(PyArrayObject *obj)
{
const char *msg =
"Numpy has detected that you (may be) writing to an array returned\n"
"by numpy.diagonal or by selecting multiple fields in a record\n"
"by numpy.diagonal or by selecting multiple fields in a structured\n"
"array. This code will likely break in a future numpy release --\n"
"see numpy.diagonal or arrays.indexing reference docs for details.\n"
"The quick fix is to make an explicit copy (e.g., do\n"
Expand Down
2 changes: 1 addition & 1 deletion numpy/doc/creation.py
Expand Up @@ -17,7 +17,7 @@
This section will not cover means of replicating, joining, or otherwise
expanding or mutating existing arrays. Nor will it cover creating object
arrays or record arrays. Both of those are covered in their own sections.
arrays or structured arrays. Both of those are covered in their own sections.
Converting Python array_like Objects to Numpy Arrays
====================================================
Expand Down