
SciPy 0.7.0 Release Notes

This is a new stable release. Please note that unlike previous versions of SciPy, this release requires Python 2.4 or greater. This release also requires NumPy 1.2.0 or greater.

Changes

Sparse Matrices

  • added support for integer dtypes such as int8, uint32, etc.
  • new class dia_matrix : the sparse DIAgonal format
  • new class bsr_matrix : the Block CSR format
  • new sparse matrix construction functions
      • sparse.kron : sparse Kronecker product
      • sparse.bmat : sparse version of numpy.bmat
      • sparse.vstack : sparse version of numpy.vstack
      • sparse.hstack : sparse version of numpy.hstack
  • extraction of submatrices and nonzero values
      • sparse.tril : extract lower triangle
      • sparse.triu : extract upper triangle
      • sparse.find : nonzero values and their indices
  • csr_matrix and csc_matrix now support slicing and fancy indexing
      • e.g. A[1:3, 4:7] and A[[3,2,6,8],:]
  • conversions among all sparse formats are now possible
      • using member functions such as .tocsr() and .tolil()
      • using the .asformat() member function, e.g. A.asformat('csr')
      • using constructors A = lil_matrix([[1,2]]); B = csr_matrix(A)
  • all sparse constructors now accept dense matrices and lists of lists
      • e.g. A = csr_matrix( rand(3,3) ) and B = lil_matrix( [[1,2],[3,4]] )
  • efficiency improvements to:
      • format conversions
      • sparse matrix arithmetic
  • numerous bugfixes
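
The construction, extraction, indexing, and conversion features listed above can be sketched together as follows. This is a minimal illustration written against a recent SciPy; the matrix values and variable names are made up for the example, and details of the 0.7.0 behaviour may differ.

```python
import numpy as np
from scipy import sparse

# Build a small matrix from a list of lists, then convert between formats.
A = sparse.lil_matrix([[1, 2, 0], [0, 3, 4], [5, 0, 6]])
B = sparse.csr_matrix(A)      # conversion through a constructor
C = A.tocsr()                 # conversion through a member function
D = A.asformat('csc')         # conversion by format name

# Construction helpers.
I2 = sparse.csr_matrix(np.eye(2))
K = sparse.kron(I2, B)                    # 6x6 Kronecker product
V = sparse.vstack([B, B])                 # 6x3
H = sparse.hstack([B, B])                 # 3x6
M = sparse.bmat([[B, None], [None, B]])   # 6x6, None marks a zero block

# Extraction of triangles and of nonzero entries.
lower = sparse.tril(B)
upper = sparse.triu(B)
rows, cols, vals = sparse.find(B)

# Slicing and fancy indexing on a CSR matrix.
sub = B[0:2, 1:3]
picked = B[[2, 0], :]
```

Here kron(I2, B) and the bmat call build the same block-diagonal matrix, which is a convenient cross-check when experimenting with the two functions.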

Reworking of IO package

The IO code in both NumPy and SciPy is undergoing a major reworking. NumPy will hold the basic code for reading and writing NumPy arrays, while SciPy will house file readers and writers for various data formats (data, audio, video, images, matlab, excel, etc.). This reworking started in NumPy 1.1.0 and will take place over many releases. SciPy 0.7.0 has several changes including:

  • many of the functions in scipy.io have been deprecated
  • the Matlab (TM) file readers/writers have a number of improvements:
      • default version 5
      • v5 writers for structures, cell arrays, and objects
      • v5 readers/writers for function handles and 64-bit integers
      • new struct_as_record keyword argument to loadmat, which loads struct arrays in matlab as record arrays in numpy
      • string arrays have dtype='U...' instead of dtype=object
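
A minimal round-trip sketch of the struct_as_record behaviour, assuming a recent SciPy in which savemat and loadmat accept a file-like object. The in-memory buffer and the variable and field names ('a', 's', 'x', 'y') are illustrative only.

```python
import io

import numpy as np
from scipy.io import loadmat, savemat

# Write a plain array and a struct (given as a dict) to an in-memory MAT-file.
buf = io.BytesIO()
savemat(buf, {
    'a': np.arange(6.0).reshape(2, 3),
    's': {'x': 1.5, 'y': [1.0, 2.0, 3.0]},
})
buf.seek(0)

# With struct_as_record=True the struct comes back as a NumPy record array.
data = loadmat(buf, struct_as_record=True)
a = data['a']
s = data['s']
field_names = s.dtype.names      # names of the struct fields
x = s['x'][0, 0]                 # field contents, stored as a 2-D array
```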

New Hierarchical Clustering module

This module adds new hierarchical clustering functionality to the scipy.cluster package. The function interfaces are similar to those of the functions provided in MATLAB(TM)'s Statistics Toolbox, to ease migration to the NumPy/SciPy framework. Linkage methods implemented include single, complete, average, weighted, centroid, median, and ward.

In addition, several functions are provided for computing inconsistency statistics, cophenetic distance, and maximum distance between descendants. The fcluster and fclusterdata functions transform a hierarchical clustering into a set of flat clusters. Since these flat clusters are generated by cutting the tree into a forest of trees, the leaders function takes a linkage and a flat clustering and finds the root of each tree in the forest. The ClusterNode class represents a hierarchical clustering as a field-navigable tree object. to_tree converts a matrix-encoded hierarchical clustering to a ClusterNode object. Routines for converting between MATLAB and SciPy linkage encodings are provided. Finally, a dendrogram function plots hierarchical clusterings as a dendrogram using matplotlib.
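
The workflow described above (linkage, flat clusters, leaders, tree conversion, and the summary statistics) can be sketched as follows. The sample points and the choice of two clusters are assumptions made for the example, and the calls are written against a recent SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import (
    cophenet, fcluster, inconsistent, leaders, linkage, to_tree,
)
from scipy.spatial.distance import pdist

# Two well-separated groups of 2-D points (made-up data).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.2],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.3]])

Z = linkage(X, method='average')               # (n - 1) x 4 linkage matrix
T = fcluster(Z, t=2, criterion='maxclust')     # flat cluster id for each point
lead_nodes, lead_ids = leaders(Z, T)           # root node of each flat cluster
root = to_tree(Z)                              # ClusterNode tree
coph_corr, coph_dists = cophenet(Z, pdist(X))  # cophenetic correlation, distances
R = inconsistent(Z)                            # inconsistency statistics per merge
```

The dendrogram function is left out of the sketch because plotting requires matplotlib.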

New Spatial package

Collection of spatial algorithms and data structures useful for spatial statistics and clustering applications. Includes fast compiled code for computing exact and approximate nearest neighbors, as well as a pure-Python kd-tree with the same interface that also supports annotation and a variety of other algorithms. The API for both modules may change somewhat as user requirements become clearer.
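
A minimal nearest-neighbor sketch, assuming the class names KDTree and cKDTree from a recent SciPy (their implementations have changed since this release). The random points and the query location are made up; eps requests an approximate search.

```python
import numpy as np
from scipy.spatial import KDTree, cKDTree

rng = np.random.default_rng(0)
points = rng.random((200, 3))          # 200 random points in the unit cube
query = np.array([0.5, 0.5, 0.5])

tree = KDTree(points)
ctree = cKDTree(points)

# Exact nearest neighbor from both trees (same query interface).
dist, idx = tree.query(query, k=1)
c_dist, c_idx = ctree.query(query, k=1)

# Approximate search: the result is at most (1 + eps) times the true distance.
a_dist, a_idx = ctree.query(query, k=1, eps=0.1)

# Brute-force reference for comparison.
brute_idx = int(np.argmin(np.linalg.norm(points - query, axis=1)))
```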

Also includes a distance module containing a collection of distance and dissimilarity functions for computing distances between vectors, which is useful for spatial statistics, clustering, and kd-trees. Distance and dissimilarity functions provided include Bray-Curtis, Canberra, Chebyshev, City Block, Cosine, Dice, Euclidean, Hamming, Jaccard, Kulsinski, Mahalanobis, Matching, Minkowski, Rogers-Tanimoto, Russell-Rao, Squared Euclidean, Standardized Euclidean, Sokal-Michener, Sokal-Sneath, and Yule.

The pdist function computes the pairwise distance between all unordered pairs of vectors in a set of vectors. The cdist function computes the distance between all pairs of vectors in the Cartesian product of two sets of vectors. Pairwise distance matrices are stored in condensed form: only the upper triangle is stored. squareform converts distance matrices between square and condensed forms.
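
A small sketch of pdist, cdist, and squareform on made-up points, chosen so the distances are easy to check by hand:

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform

XA = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
XB = np.array([[0.0, 0.0], [0.0, 1.0]])

# Condensed form: one entry per unordered pair, ordered (0,1), (0,2), (1,2).
condensed = pdist(XA, metric='euclidean')

# squareform converts in both directions.
square = squareform(condensed)     # 3x3 symmetric matrix with zero diagonal
back = squareform(square)          # back to the condensed vector

# Distances between every vector in XA and every vector in XB.
cross = cdist(XA, XB, metric='cityblock')   # shape (3, 2)
```

For n vectors the condensed vector has n(n-1)/2 entries, here 3.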

Reworked fftpack package

FFTW2, FFTW3, MKL and DJBFFT wrappers have been removed. Only (NETLIB) fftpack remains. By focusing on one backend, we hope to add new features -- like float32 support -- more easily.

New Constants package

scipy.constants provides a collection of physical constants and conversion factors. These constants are taken from CODATA Recommended Values of the Fundamental Physical Constants: 2002. They may be found at physics.nist.gov/constants. The values are stored in the dictionary physical_constants as a tuple containing the value, the units, and the relative precision, in that order. All constants are in SI units unless otherwise stated. Several helper functions are provided.

The list is not meant to be comprehensive, but just a convenient list for everyday use.
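
A brief sketch of looking up a constant, assuming a recent SciPy. Recent releases ship newer CODATA values than the 2002 set mentioned above, so the speed of light is used here because its value is exact by definition.

```python
from scipy import constants

key = 'speed of light in vacuum'

# Each dictionary entry is a 3-tuple; the notes above describe the third
# entry as the relative precision.
value, unit, third_entry = constants.physical_constants[key]

# Helper functions that look up one piece of the entry by name.
same_value = constants.value(key)
same_unit = constants.unit(key)
rel_precision = constants.precision(key)

# Commonly used constants and conversion factors are also module attributes.
c = constants.c            # speed of light in m/s
inch = constants.inch      # one inch in metres
```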

New Radial Basis Function module

scipy.interpolate now contains a Radial Basis Function module. Radial basis functions can be used for smoothing/interpolating scattered data in n-dimensions, but should be used with caution for extrapolation outside of the observed data range.
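
A minimal sketch of interpolating scattered 2-D data with the Rbf class from scipy.interpolate. The sample points and the test function are made up, and the default basis function of a recent SciPy is assumed.

```python
import numpy as np
from scipy.interpolate import Rbf

# Scattered sample locations in the unit square and values observed there.
rng = np.random.default_rng(0)
x = rng.random(15)
y = rng.random(15)
z = np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)

# Fit the interpolant; with no smoothing it passes through the data points.
rbf = Rbf(x, y, z)

z_nodes = rbf(x, y)          # evaluated at the data points
z_inside = rbf(0.5, 0.5)     # evaluated inside the observed range
```

Evaluating far outside the sampled square would be extrapolation, which is the case the caution above applies to.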

New complex ODE integrator

scipy.integrate.ode now contains a wrapper for the ZVODE complex-valued ordinary differential equation solver (by Peter N. Brown, Alan C. Hindmarsh, and George D. Byrne).
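
A short sketch of selecting the ZVODE integrator through scipy.integrate.ode. The test equation dy/dt = i*y with y(0) = 1 is an assumption chosen because its exact solution, exp(i*t), is known; the tolerances are illustrative.

```python
import numpy as np
from scipy.integrate import ode


def rhs(t, y):
    """Right-hand side of dy/dt = i*y for a complex-valued state y."""
    return 1j * y


solver = ode(rhs)
solver.set_integrator('zvode', method='adams', rtol=1e-9, atol=1e-12)
solver.set_initial_value(np.array([1.0 + 0.0j]), 0.0)

t_end = 2.0
y_end = solver.integrate(t_end)    # state at t_end
exact = np.exp(1j * t_end)         # analytic solution for comparison
```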

New generalized symmetric and hermitian eigenvalue problem solver

scipy.linalg.eigh now contains wrappers for more LAPACK symmetric and hermitian eigenvalue problem solvers. Users can now solve generalized problems, select just a range of eigenvalues, and choose to use a faster algorithm at the expense of increased memory usage. The signature of scipy.linalg.eigh changed accordingly.
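
A minimal sketch of the generalized problem A v = w B v. The keyword names used here (subset_by_index, driver) are those of recent SciPy releases and are an assumption relative to 0.7.0, whose signature spelled these options differently. The matrices are made up.

```python
import numpy as np
from scipy.linalg import eigh

A = np.array([[6.0, 2.0, 1.0],
              [2.0, 5.0, 2.0],
              [1.0, 2.0, 4.0]])      # symmetric
B = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 1.0]])      # symmetric positive definite

# Full generalized problem: eigenvalues ascending, eigenvectors in columns.
w, v = eigh(A, B)

# Only the two smallest eigenvalues.
w_low = eigh(A, B, eigvals_only=True, subset_by_index=[0, 1])

# Divide-and-conquer LAPACK driver, which trades memory for speed.
w_dc = eigh(A, B, eigvals_only=True, driver='gvd')
```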

Major documentation improvements

SciPy documentation is now more accessible than before: you can view an HTML reference manual online at http://docs.scipy.org/ or download it as a PDF file. An updated tutorial is also available, and it shows how to use several essential parts of SciPy.

Nevertheless, more effort is still needed on the documentation front. Luckily, contributing to Scipy documentation is now easier than before: if you find that a part of it requires improvements, and want to help us out, please register a user name in our web-based documentation editor at http://docs.scipy.org/ and correct the issues.

Bug fixes in the interpolation package

The shape of return values from scipy.interpolate.interp1d used to be incorrect if the interpolated data had more than 2 dimensions and the axis keyword was set to a non-default value. This is fixed in 0.7.0.

Users of scipy.interpolate.interp1d may need to revise their code if it relies on the incorrect behavior.
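
A small sketch of the corrected behaviour as it appears in a recent SciPy: the interpolation axis of the input is replaced by the shape of the new x values, and the other axes are left in place. The array sizes are made up for the example.

```python
import numpy as np
from scipy.interpolate import interp1d

x = np.linspace(0.0, 1.0, 10)
# 3-D data whose middle axis (length 10) corresponds to x.
y = np.arange(4 * 10 * 3, dtype=float).reshape(4, 10, 3)

f = interp1d(x, y, axis=1)       # interpolate along the non-default axis 1

x_new = np.linspace(0.0, 1.0, 5)
y_new = f(x_new)                 # axis 1 now has the length of x_new
```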

Bug fixes in the stats package

Statistical functions for masked arrays have been added and are accessible through scipy.stats.mstats. The functions are similar to their counterparts in scipy.stats but they have not yet been verified for identical interfaces and algorithms.

Several bugs were fixed in the statistical functions. Among the fixed functions, kstest and percentileofscore gained new keyword arguments.

Deprecation warnings were added for mean, median, var, std, cov and corrcoef. These functions should be replaced by their numpy counterparts. Note, however, that some of the default options differ between the scipy.stats and numpy versions of these functions.
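
The points above can be sketched as follows, assuming a recent SciPy. The data are made up, the kind keyword of percentileofscore is shown with its current spelling, and ddof is passed to numpy.var explicitly because defaults are exactly where the two libraries may differ.

```python
import numpy as np
import numpy.ma as ma
from scipy import stats
from scipy.stats import mstats

# Masked-array statistics: the masked entry (-999) is ignored.
data = ma.masked_array([1.0, 2.0, 4.0, 8.0, -999.0],
                       mask=[False, False, False, False, True])
masked_skew = float(mstats.skew(data))
plain_skew = float(stats.skew(data.compressed()))

# percentileofscore with an explicit definition of "percentile".
pct_rank = stats.percentileofscore([1, 2, 3, 4], 3, kind='rank')
pct_strict = stats.percentileofscore([1, 2, 3, 4], 3, kind='strict')

# numpy replacement for the deprecated var: state the normalization explicitly.
x = np.array([1.0, 2.0, 4.0, 8.0])
var_population = np.var(x, ddof=0)   # divide by n
var_sample = np.var(x, ddof=1)       # divide by n - 1
```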

Numerous bug fixes to stats.distributions: all generic methods now work correctly, and several methods in individual distributions were corrected. However, a few issues remain with higher moments (skew, kurtosis) and entropy. The maximum likelihood estimator, fit, does not work out-of-the-box for some distributions. In some cases, starting values have to be chosen carefully; in other cases, the generic implementation of the maximum likelihood method might not be the numerically appropriate estimation method.
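
A minimal sketch of supplying starting values to fit, assuming the calling convention of a recent SciPy: positional arguments after the data are initial guesses for the shape parameters, and loc and scale are initial guesses for location and scale. The gamma distribution, the true parameters, and the guesses are all made up for the example.

```python
import numpy as np
from scipy import stats

# Draw a sample from a gamma distribution with known parameters.
rng = np.random.default_rng(0)
sample = stats.gamma.rvs(3.0, loc=0.0, scale=2.0, size=2000, random_state=rng)

# Starting values: shape 2.5, location 0.0, scale 1.5.
shape, loc, scale = stats.gamma.fit(sample, 2.5, loc=0.0, scale=1.5)

# Sanity check: mean of the fitted distribution versus the sample mean.
fitted_mean = stats.gamma.mean(shape, loc=loc, scale=scale)
sample_mean = sample.mean()
```

Comparing a moment of the fitted distribution with the sample is a cheap way to notice a fit that converged to poor parameters.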

We expect more bugfixes, increases in numerical precision and enhancements in the next release of scipy.

Running Tests

We are moving away from having our own testing framework and are adopting nose.

Building SciPy

Support for NumScons has been added. NumScons is a tentative new build system for NumPy/SciPy, using scons at its core.