Author: | Travis Oliphant |
---|---|
Discussions to: | [email protected] |
Created: | October 2005 |
The C API of NumPy is (mostly) backward compatible with Numeric.
There are a few non-standard Numeric usages (that were not really part of the API) that will need to be changed:
- If you used any of the function pointers in the PyArray_Descr structure you will have to modify your usage of those. First, the pointers are all under the member named f. So descr->cast is now descr->f->cast. In addition, the casting functions have eliminated the strides argument (use PyArray_CastTo if you need strided casting). All functions have one or two PyArrayObject * arguments at the end. This allows the flexible arrays and mis-behaved arrays to be handled.
- The descr->zero and descr->one constants have been replaced with function calls, PyArray_Zero and PyArray_One (be sure to read the code and free the resulting memory if you use these calls).
- If you passed array->dimensions and array->strides around to functions, you will need to fix some code. These are now npy_intp* pointers. On 32-bit systems there won't be a problem. However, on 64-bit systems, you will need to make changes to avoid errors and segfaults.
The header files arrayobject.h and ufuncobject.h contain many defines that you may find useful. The files __ufunc_api.h and __multiarray_api.h contain the available C-API function calls with their function signatures.
All of these headers are installed to
<YOUR_PYTHON_LOCATION>/site-packages/numpy/core/include
All new arrays can be created using PyArray_NewFromDescr. A simple interface equivalent to PyArray_FromDims is PyArray_SimpleNew(nd, dims, typenum) and to PyArray_FromDimsAndData is PyArray_SimpleNewFromData(nd, dims, typenum, data).
PyArray_NewFromDescr is a very flexible function.
PyObject * PyArray_NewFromDescr(PyTypeObject *subtype, PyArray_Descr *descr,
int nd, npy_intp *dims,
npy_intp *strides, char *data,
int flags, PyObject *obj);
subtype : PyTypeObject *
- The subtype that should be created (either pass in &PyArray_Type, or obj->ob_type, where obj is an instance of a subtype (or subclass) of PyArray_Type).
descr : PyArray_Descr *
- The type descriptor for the array. This is a Python object (this function steals a reference to it). The easiest way to get one is using PyArray_DescrFromType(<typenum>). If you want to use a flexible size array, then you need to use PyArray_DescrNewFromType(<flexible typenum>) and set its elsize member to the desired size. The typenum in both of these cases is one of the PyArray_XXXX enumerated types.
nd : int
- The number of dimensions (< MAX_DIMS)
*dims : npy_intp *
- A pointer to the size in each dimension. Information will be copied from here.
*strides : npy_intp *
- The strides this array should have. For new arrays created by this routine, this should be NULL. If you pass in memory for this array to use, then you can pass in the strides information as well (otherwise it will be created for you and default to C-contiguous or Fortran contiguous). Any strides will be copied into the array structure. Do not pass in bad strides information! PyArray_CheckStrides(...) can help but you must call it if you are unsure. You cannot pass in strides information when data is NULL and this routine is creating its own memory.
*data : char *
- NULL for creating brand-new memory. If you want this array to wrap another memory area, then pass the pointer here. You are responsible for deleting the memory in that case, but do not do so until the new array object has been deleted. The best way to handle that is to get the memory from another Python object, INCREF that Python object after passing its data pointer to this routine, and set the ->base member of the returned array to the Python object. You are responsible for setting PyArray_BASE(ret) to the base object. Failure to do so will create a memory leak.
  If you pass in a data buffer, the flags argument will be the flags of the new array. If you create a new array, a non-zero flags argument indicates that you want the array to be in Fortran order.
flags : int
- Either the flags showing how to interpret the data buffer passed in, or if a new array is created, nonzero to indicate a Fortran order array. See below for an explanation of the flags.
obj : PyObject *
- If subtype is &PyArray_Type, this argument is ignored. Otherwise, the __array_finalize__ method of the subtype is called (if present) and passed this object. This is usually an array of the type to be created (so the __array_finalize__ method must handle an array argument), but it can be anything.
Note: The returned array object will be uninitialized unless the type is PyArray_OBJECT, in which case the memory will be set to NULL.
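When strides is NULL, the routine works out the strides itself. As a rough illustration of what C-contiguous and Fortran-contiguous strides look like, here is a minimal, self-contained sketch. It does not use the NumPy headers: intp_t is only a stand-in for npy_intp, and fill_c_strides and fill_f_strides are hypothetical helper names, not part of the C-API.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t intp_t;  /* stand-in for npy_intp (assumption) */

/* Hypothetical helper: byte strides for a C-ordered (row-major) layout.
   The last dimension varies fastest, so it gets the smallest stride. */
void fill_c_strides(int nd, const intp_t *dims, intp_t itemsize,
                    intp_t *strides)
{
    assert(nd >= 0 && itemsize > 0);
    intp_t step = itemsize;
    for (int i = nd - 1; i >= 0; --i) {
        strides[i] = step;
        step *= dims[i];
    }
}

/* Hypothetical helper: byte strides for a Fortran-ordered (column-major)
   layout. The first dimension varies fastest. */
void fill_f_strides(int nd, const intp_t *dims, intp_t itemsize,
                    intp_t *strides)
{
    assert(nd >= 0 && itemsize > 0);
    intp_t step = itemsize;
    for (int i = 0; i < nd; ++i) {
        strides[i] = step;
        step *= dims[i];
    }
}
```

For a 2 x 3 x 4 array of 8-byte items, the first helper yields strides of 96, 32 and 8 bytes, and the second yields 8, 16 and 48 bytes.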
PyArray_SimpleNew(nd, dims, typenum) is a drop-in replacement for PyArray_FromDims (except it takes npy_intp* dims instead of int* dims, which matters on 64-bit systems) and it does not initialize the memory to zero.
PyArray_SimpleNew is just a macro for PyArray_New with default arguments.
Use PyArray_FILLWBYTE(arr, 0) to fill with zeros.
The PyArray_FromDims family of functions is still available; these functions are loose wrappers around this function. They still take int * arguments. This should be fine on 32-bit systems, but on 64-bit systems you may run into trouble if you frequently passed PyArray_FromDims the dimensions member of the old PyArrayObject structure, because sizeof(npy_intp) != sizeof(int).
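The size mismatch means an int * array cannot simply be cast to npy_intp * (or the other way round); the elements have to be copied one at a time. A minimal sketch of that copy follows. It is self-contained and does not include the NumPy headers: intp_t stands in for npy_intp, and widen_dims and DEMO_MAX_DIMS are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t intp_t;   /* stand-in for npy_intp (assumption) */

#define DEMO_MAX_DIMS 32   /* illustrative bound, not NumPy's MAX_DIMS */

/* Hypothetical helper: copy old-style int dimensions into a pointer-sized
   array element by element. Casting the int * itself would read the wrong
   bytes wherever sizeof(intp_t) != sizeof(int).
   Returns 0 on success, -1 if nd is out of range. */
int widen_dims(int nd, const int *src, intp_t *dst)
{
    if (nd < 0 || nd > DEMO_MAX_DIMS) {
        return -1;
    }
    assert(src && dst);
    for (int i = 0; i < nd; ++i) {
        dst[i] = (intp_t)src[i];
    }
    return 0;
}
```

The widened array is what would then be handed to a routine that expects npy_intp * dimensions.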
PyArray_FromAny(...)
This function replaces PyArray_ContiguousFromObject and friends (those function calls still remain but they are loose wrappers around the PyArray_FromAny call).
static PyObject *
PyArray_FromAny(PyObject *op, PyArray_Descr *dtype, int min_depth,
int max_depth, int requires, PyObject *context)
op : PyObject *
- The Python object to "convert" to an array object
dtype : PyArray_Descr *
- The desired data-type descriptor. This can be NULL, if the descriptor should be determined by the object. Unless FORCECAST is present in the requires flags, this call will generate an error if the data type cannot be safely obtained from the object.
min_depth : int
- The minimum depth of array needed or 0 if it doesn't matter
max_depth : int
- The maximum depth of array allowed or 0 if it doesn't matter
requires : int
- A flag indicating the "requirements" of the returned array. These are the usual ndarray flags (see NDArray flags below). In addition, there are three flags used only for the FromAny family of functions:
  - ENSURECOPY: always copy the array. Returned arrays always have CONTIGUOUS, ALIGNED, and WRITEABLE set.
  - ENSUREARRAY: ensure the returned array is an ndarray (or a bigndarray if op is one).
  - FORCECAST: cause a cast to occur regardless of whether or not it is safe.
context : PyObject *
- If the Python object op is not a numpy array, but has an __array__ method, context is passed as the second argument to that method (the first is the typecode). Almost always this parameter is NULL.
PyArray_ContiguousFromAny(op, typenum, min_depth, max_depth) is equivalent to PyArray_ContiguousFromObject(...) (which is still available), except it will return the subclass if op is already a subclass of the ndarray. The ContiguousFromObject version will always return an ndarray (or a bigndarray).
All datatypes are handled using the PyArray_Descr * structure. This structure can be obtained from a Python object using PyArray_DescrConverter and PyArray_DescrConverter2. The former returns the default PyArray_LONG descriptor when the input object is None, while the latter returns NULL when the input object is None.
See the arraymethods.c and multiarraymodule.c files for many examples of usage.
You should use the #defines provided to access array structure portions:
- PyArray_DATA(obj): returns a void * to the array data
- PyArray_BYTES(obj): returns a char * to the array data
- PyArray_ITEMSIZE(obj)
- PyArray_NDIM(obj)
- PyArray_DIMS(obj)
- PyArray_DIM(obj, n)
- PyArray_STRIDES(obj)
- PyArray_STRIDE(obj, n)
- PyArray_DESCR(obj)
- PyArray_BASE(obj)
See more in arrayobject.h.
The flags attribute of the PyArrayObject structure contains important information about the memory used by the array (pointed to by the data member). This flags information must be kept accurate or strange results and even segfaults may result.
There are 6 (binary) flags that describe the memory area used by the data buffer. These constants are defined in arrayobject.h and determine the bit-position of the flag. Python exposes a nice attribute-based interface as well as a dictionary-like interface for getting (and, if appropriate, setting) these flags.
Memory areas of all kinds can be pointed to by an ndarray, necessitating these flags. If you get an arbitrary PyArrayObject in C-code, you need to be aware of the flags that are set. If you need to guarantee a certain kind of array (like NPY_CONTIGUOUS and NPY_BEHAVED), then pass these requirements into the PyArray_FromAny function.
NPY_CONTIGUOUS
- True if the array is (C-style) contiguous in memory.
NPY_FORTRAN
- True if the array is (Fortran-style) contiguous in memory.
Notice that contiguous 1-d arrays are always both NPY_FORTRAN contiguous and C contiguous. Both of these flags can be checked. They are convenience flags only, since whether or not an array is NPY_CONTIGUOUS or NPY_FORTRAN can be determined from the strides, dimensions, and itemsize attributes.
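To make that last point concrete, here is a minimal sketch of how contiguity can be derived from the dimensions, strides and item size alone. It is a simplified check written for this note, not NumPy's own implementation: it assumes every dimension is at least 1 and does not special-case length-1 dimensions. The type intp_t stands in for npy_intp, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t intp_t;  /* stand-in for npy_intp (assumption) */

/* Hypothetical helper: 1 if the strides match a C-ordered (row-major)
   layout with no gaps, 0 otherwise. */
int is_c_contiguous(int nd, const intp_t *dims, const intp_t *strides,
                    intp_t itemsize)
{
    assert(itemsize > 0);
    intp_t expected = itemsize;
    for (int i = nd - 1; i >= 0; --i) {
        if (strides[i] != expected) {
            return 0;
        }
        expected *= dims[i];
    }
    return 1;
}

/* Hypothetical helper: the same test for a Fortran-ordered (column-major)
   layout, walking the dimensions from first to last. */
int is_f_contiguous(int nd, const intp_t *dims, const intp_t *strides,
                    intp_t itemsize)
{
    assert(itemsize > 0);
    intp_t expected = itemsize;
    for (int i = 0; i < nd; ++i) {
        if (strides[i] != expected) {
            return 0;
        }
        expected *= dims[i];
    }
    return 1;
}
```

With one dimension both loops perform the identical comparison, which is why a contiguous 1-d array satisfies both tests.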
NPY_OWNDATA
- True if the array owns the memory (it will try to free it using PyDataMem_FREE() on deallocation, so it had better really own it).
The next three flags facilitate using a data pointer that is a memory-mapped array, or part of some larger record array, although they may have other uses.
NPY_ALIGNED
- True if the data buffer is aligned for the type and the strides are multiples of the alignment factor as well. This can be checked.
NPY_WRITEABLE
- True only if the data buffer can be "written" to.
NPY_UPDATEIFCOPY
- This is a special flag that is set if this array represents a copy made because a user required certain flags in PyArray_FromAny and a copy had to be made of some other array (and the user asked for this flag to be set in such a situation). The base attribute then points to the "misbehaved" array (which is set read-only). When the array with this flag set is deallocated, it will copy its contents back to the "misbehaved" array (casting if necessary) and will reset the "misbehaved" array to WRITEABLE. If the "misbehaved" array was not WRITEABLE to begin with then PyArray_FromAny would have returned an error because UPDATEIFCOPY would not have been possible.
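The NPY_ALIGNED description above says the condition can be checked. A minimal sketch of such a check follows, under the assumption that "aligned" means the data pointer and every stride are multiples of the type's alignment. It is self-contained and does not use the NumPy headers: intp_t stands in for npy_intp and is_aligned is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t intp_t;  /* stand-in for npy_intp (assumption) */

/* Hypothetical helper: 1 if the buffer start and all strides are multiples
   of `alignment` (in bytes), 0 otherwise. */
int is_aligned(const char *data, int nd, const intp_t *strides,
               intp_t alignment)
{
    assert(alignment > 0);
    if ((uintptr_t)data % (uintptr_t)alignment != 0) {
        return 0;
    }
    for (int i = 0; i < nd; ++i) {
        /* In C99 and later the remainder of a negative stride is zero or
           negative, so comparing against zero works for both signs. */
        if (strides[i] % alignment != 0) {
            return 0;
        }
    }
    return 1;
}
```

A buffer that starts one byte into a 4-byte-aligned block, or that uses a 6-byte stride for 4-byte items, fails this test.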
PyArray_UpdateFlags(obj, flags) will update the obj->flags for flags, which can be any of NPY_CONTIGUOUS, NPY_FORTRAN, NPY_ALIGNED, or NPY_WRITEABLE.
Some useful combinations of these flags:
NPY_BEHAVED = NPY_ALIGNED | NPY_WRITEABLE
NPY_CARRAY = NPY_DEFAULT = NPY_CONTIGUOUS | NPY_BEHAVED
NPY_CARRAY_RO = NPY_CONTIGUOUS | NPY_ALIGNED
NPY_FARRAY = NPY_FORTRAN | NPY_BEHAVED
NPY_FARRAY_RO = NPY_FORTRAN | NPY_ALIGNED
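These combinations are plain bitwise ORs, and testing for one means checking that every bit in the mask is set. The sketch below shows the idea with stand-in constants. The DEMO_ names and their bit values are illustrative assumptions for this note; the real constants are defined in arrayobject.h and should be used in actual code. demo_checkflags is likewise a hypothetical function, not the PyArray_CHECKFLAGS macro itself.

```c
#include <assert.h>

/* Illustrative bit values only (assumption); see arrayobject.h for the
   real NPY_ constants. */
enum {
    DEMO_CONTIGUOUS = 0x0001,
    DEMO_FORTRAN    = 0x0002,
    DEMO_ALIGNED    = 0x0100,
    DEMO_WRITEABLE  = 0x0400
};

#define DEMO_BEHAVED   (DEMO_ALIGNED | DEMO_WRITEABLE)
#define DEMO_CARRAY    (DEMO_CONTIGUOUS | DEMO_BEHAVED)
#define DEMO_CARRAY_RO (DEMO_CONTIGUOUS | DEMO_ALIGNED)

/* Hypothetical helper: 1 only when every bit in `wanted` is set in `flags`. */
int demo_checkflags(int flags, int wanted)
{
    assert(wanted != 0);  /* an empty mask would match everything */
    return (flags & wanted) == wanted;
}
```

An array that is contiguous and aligned but not writeable satisfies the read-only combination and fails the writeable one.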
The macro PyArray_CHECKFLAGS(obj, flags) can test any combination of flags. There are several default combinations defined as macros already (see arrayobject.h).
In particular, there are ISBEHAVED, ISBEHAVED_RO, ISCARRAY and ISFARRAY macros that also check to make sure the array is in native byte order (as determined by the data-type descriptor).
There are more C-API enhancements, which you can discover in the code or in the book (http://www.trelgol.com).