Commit c13e2eb ("more edits")
njsmith committed Feb 25, 2014 (parent: dd8fa82)
doc/neps/return-of-revenge-of-matmul-pep.rst: 84 additions, 47 deletions

dimensionalities described here; in particular, many will implement
only the 2d or 1d+2d subsets. But ideally whatever functionality is
available will be consistent with this.

This section uses the numpy terminology for describing arbitrary
multidimensional arrays of data. In this model, the shape of any
array is represented by a tuple of integers. Matrices have len(shape)
== 2, 1d vectors have len(shape) == 1, and scalars have shape == (),
i.e., they are "0 dimensional". Any array contains prod(shape) total
entries. Notice that prod(()) == 1 (for the same reason that sum(())
== 0); scalars are just an ordinary kind of array, not anything
special. Notice also that we distinguish between a single scalar
value (shape == (), analogous to `1`), a vector containing only a
single entry (shape == (1,), analogous to `[1]`), a matrix containing
only a single entry (shape == (1, 1), analogous to `[[1]]`), etc., so
the dimensionality of any array is always well-defined.
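
For concreteness, here is how these conventions look in practice (an
illustrative numpy session; nothing in this section depends on numpy
itself, only on its terminology)::

    >>> import numpy as np
    >>> np.array(1.5).shape      # a scalar, analogous to `1`
    ()
    >>> np.array([1.5]).shape    # a one-entry vector, analogous to `[1]`
    (1,)
    >>> np.array([[1.5]]).shape  # a one-entry matrix, analogous to `[[1]]`
    (1, 1)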

The recommended semantics for ``@`` are:

* 0d (scalar) inputs raise an error. Scalar * matrix multiplication
is a mathematically and algorithmically distinct operation from
matrix @ matrix multiplication; scalar * matrix multiplication
should go through ``*`` instead of ``@``.

* 1d vector inputs are promoted to 2d by prepending or appending a '1'
to the shape on the side away from the other operand; the operation is
performed, and the added dimension is then removed from the output.
The result is that matrix @ vector and vector @ matrix are both legal
(assuming compatible shapes), and both return 1d vectors; vector @
vector returns a scalar. This is clearer with examples. If
``arr(2, 3)`` represents a 2x3 array, and ``arr(3)`` represents a 1d
vector with 3 elements, then:

* ``arr(2, 3) @ arr(3, 1)`` is a regular matrix product, and returns
an array with shape (2, 1), i.e., a column vector.

* ``arr(2, 3) @ arr(3)`` performs the same computation as the
previous (i.e., treats the 1d vector as a matrix containing a
single **column**), but returns the result with shape (2,), i.e.,
a 1d vector.

* ``arr(1, 3) @ arr(3, 2)`` is a regular matrix product, and returns
an array with shape (1, 2), i.e., a row vector.

* ``arr(3) @ arr(3, 2)`` performs the same computation as the
previous (i.e., treats the 1d vector as a matrix containing a
single **row**), but returns the result with shape (2,), i.e., a
1d vector.

* ``arr(1, 3) @ arr(3, 1)`` is a regular matrix product, and returns
an array with shape (1, 1), i.e., a single value in matrix form.

* For higher dimensional inputs, we treat the last two dimensions as
being the dimensions of the matrices to multiply, and 'broadcast'
across the other dimensions. This provides a convenient way to
quickly compute many matrix products in a single operation. For
example, ``arr(10, 2, 3) @ arr(10, 3, 4)`` performs 10 separate
matrix multiplies, each of which multiplies a 2x3 and a 3x4 matrix
to produce a 2x4 matrix, and then returns the 10 resulting matrices
together in an array with shape (10, 2, 4). Note that in more
complicated cases, broadcasting allows several simple but powerful
tricks for controlling how arrays are aligned; see [#broadcasting]
for details.

If one operand is >2d, and another operand is 1d, then the above
rules apply unchanged, with 1d->2d promotion performed before
broadcasting. E.g., ``arr(10, 2, 3) @ arr(3)`` first promotes to
``arr(10, 2, 3) @ arr(3, 1)``, then broadcasts and multiplies to get
an array with shape (10, 2, 1), and finally removes the added
dimension, returning an array with shape (10, 2). Similarly,
``arr(2) @ arr(10, 2, 3)`` produces an intermediate array with shape
(10, 1, 3), and a final array with shape (10, 3).
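
For an executable check of these rules, numpy's ``matmul`` function
(available in newer numpy releases) implements matching semantics.  A
sketch, where ``arr`` is a small helper standing in for the notation
used above::

    >>> import numpy as np
    >>> def arr(*shape):
    ...     return np.ones(shape)   # an array of the given shape
    ...
    >>> np.matmul(arr(2, 3), arr(3)).shape   # matrix @ vector -> 1d
    (2,)
    >>> np.matmul(arr(3), arr(3, 2)).shape   # vector @ matrix -> 1d
    (2,)
    >>> np.matmul(arr(3), arr(3)).shape      # vector @ vector -> scalar
    ()
    >>> np.matmul(arr(10, 2, 3), arr(10, 3, 4)).shape  # 10 stacked products
    (10, 2, 4)
    >>> np.matmul(arr(10, 2, 3), arr(3)).shape  # promote, broadcast, squeeze
    (10, 2)
    >>> np.matmul(arr(2), arr(10, 2, 3)).shape
    (10, 3)

(0d operands are rejected with an error rather than broadcast, as
specified above.)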

The recommended semantics for ``@@`` are::

definitions:

* scipy.sparse

* pandas

* blaze

* XX (try: Theano, OpenCV, cvxopt, pycuda, sage, sympy, pysparse,
pyoperators, any others? QTransform in PyQt? PyOpenGL doesn't seem
to provide a matrix type. panda3d?)


Motivation
----------

These numerical packages together contain ~780 uses of matrix
multiplication. Within these packages, matrix multiplication is used
more heavily than most comparison operators (``<`` ``!=`` ``<=``
``>=``). When we include the stdlib into our comparisons, matrix
multiplication is still used more often in total than any of the
bitwise operators, and 2x as often as ``//``. This is true even
though the stdlib, which contains a fair amount of integer arithmetic
and no matrix operations, makes up more than 80% of the combined code
base. (In an interesting coincidence, the numeric libraries make up
approximately the same proportion of the 'combined' codebase as
numeric tutorials make up of PyCon 2014's tutorial schedule.)

While it's impossible to know for certain, from this data it seems
plausible that on net across all Python code currently being written,
matrix multiplication is used more often than ``//`` or other integer
operations.
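
The counting scripts behind these numbers are not reproduced here, but
a minimal sketch of the kind of tally involved -- counting operator
tokens in Python source with Python 3's ``tokenize`` module -- might
look like::

    import collections
    import io
    import token
    import tokenize

    def count_operators(source):
        """Return a Counter mapping each operator token to its frequency."""
        counts = collections.Counter()
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type == token.OP:   # '+', '//', '<=', '{', ...
                counts[tok.string] += 1
        return counts

Applying this to every file in a codebase and summing the resulting
Counters gives per-operator totals of the kind quoted above.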


Matrix power and in-place operators
-----------------------------------

The primary motivation for this PEP is ``@``; no-one cares terribly
much about the other proposed operators. The matrix power operator
``@@`` is useful and well-defined, but not really necessary. It is
included here for consistency: if we have an ``@`` that is analogous
to ``*``, then it would be weird and surprising to *not* have an
``@@`` that is analogous to ``**``. Similarly, the in-place operators
``@=`` and ``@@=`` are of marginal utility -- it is not generally
possible to implement in-place matrix multiplication any more
efficiently than by doing ``a = (a @ b)`` -- but are included for
completeness and symmetry.
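
For reference, a sketch of how the proposed ``@=`` would behave, by
analogy with Python's existing augmented-assignment protocol
(``__imatmul__`` being the in-place special method such an operator
would presumably use)::

    a @= b
    # is executed approximately as:
    a = a.__imatmul__(b)   # if type(a) defines __imatmul__
    # and otherwise falls back to:
    a = a @ b              # a fresh result is allocated either way: each
                           # entry of ``a @ b`` depends on an entire row of
                           # the old ``a``, so the multiplication cannot
                           # simply overwrite its input as it goes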


Compatibility considerations
----------------------------

We review some of the rejected alternatives here.

**Use a type that defines ``__mul__`` as matrix multiplication:**
Numpy has had such a type for many years: ``np.matrix`` (as opposed to
the standard array type, ``np.ndarray``). And based on this
experience, a strong consensus has developed that ``np.matrix`` should
essentially never be used. The problem is that the presence of two
different duck-types for numeric data -- one where ``*`` means matrix
multiply, and one where ``*`` means elementwise multiplication --
makes it impossible to write generic functions that can operate on
arbitrary data. In practice, the vast majority of the Python numeric
ecosystem has standardized on using ``*`` for elementwise
multiplication, and deprecated the use of ``np.matrix``. Most
3rd-party libraries that receive a ``matrix`` as input will either
error out, return incorrect results, or simply convert the input into
a standard ``ndarray``, and return ``ndarray`` objects as well. The
only reason ``np.matrix`` survives is because of strong arguments from
some educators who find that its problems are outweighed by the need
to provide a simple and clear mapping between mathematical notation
and code for novices; and this, as described above, causes its own
problems.
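
To make the hazard concrete, here is the classic discrepancy (a numpy
session; the same expression computes two different things depending
on which duck-type it receives)::

    >>> import numpy as np
    >>> a = np.array([[1, 2], [3, 4]])
    >>> a * a                 # ndarray: elementwise multiplication
    array([[ 1,  4],
           [ 9, 16]])
    >>> m = np.matrix(a)
    >>> m * m                 # np.matrix: matrix multiplication
    matrix([[ 7, 10],
            [15, 22]])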

**Add a new ``@`` (or whatever) operator that has some other meaning

extreme overabundance of parentheses. See `Motivation`_ above.

**Add lots of new operators / add a new generic syntax for defining
infix operators:** In addition to this being generally un-Pythonic and
repeatedly rejected by BDFL fiat, this would be using a sledgehammer
to smash a fly. There is a consensus in the scientific python
community that matrix multiplication really is the only missing infix
operator that matters enough to bother about. (In retrospect, we all
think PEP 225 was a bad idea too.)