Merge pull request pandas-dev#9058 from jorisvandenbossche/doc-fixup-…
…0152

DOC: fix-up docs for 0.15.2 release
jorisvandenbossche committed Dec 11, 2014
2 parents 0a5e6c9 + c66f5ee commit 88feb4e
Showing 3 changed files with 80 additions and 85 deletions.
4 changes: 2 additions & 2 deletions doc/source/io.rst
@@ -3403,7 +3403,7 @@ writes ``data`` to the database in batches of 1000 rows at a time:
data.to_sql('data_chunked', engine, chunksize=1000)
SQL data types
""""""""""""""
++++++++++++++

:func:`~pandas.DataFrame.to_sql` will try to map your data to an appropriate
SQL data type based on the dtype of the data. When you have columns of dtype
@@ -3801,7 +3801,7 @@ is lost when exporting.
Labeled data can similarly be imported from *Stata* data files as ``Categorical``
variables using the keyword argument ``convert_categoricals`` (``True`` by default).
The keyword argument ``order_categoricals`` (``True`` by default) determines
whether imported ``Categorical`` variables are ordered.
whether imported ``Categorical`` variables are ordered.

.. note::

4 changes: 3 additions & 1 deletion doc/source/release.rst
@@ -50,7 +50,9 @@ pandas 0.15.2

**Release date:** (December 12, 2014)

This is a minor release from 0.15.1 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes.
This is a minor release from 0.15.1 and includes a large number of bug fixes
along with several new features, enhancements, and performance improvements.
A small number of API changes were necessary to fix existing bugs.

See the :ref:`v0.15.2 Whatsnew <whatsnew_0152>` overview for an extensive list
of all API changes, enhancements and bugs that have been fixed in 0.15.2.
157 changes: 75 additions & 82 deletions doc/source/whatsnew/v0.15.2.txt
@@ -3,9 +3,10 @@
v0.15.2 (December 12, 2014)
---------------------------

This is a minor release from 0.15.1 and includes a small number of API changes, several new features,
enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
users upgrade to this version.
This is a minor release from 0.15.1 and includes a large number of bug fixes
along with several new features, enhancements, and performance improvements.
A small number of API changes were necessary to fix existing bugs.
We recommend that all users upgrade to this version.

- :ref:`Enhancements <whatsnew_0152.enhancements>`
- :ref:`API Changes <whatsnew_0152.api>`
@@ -16,6 +17,7 @@ users upgrade to this version.

API changes
~~~~~~~~~~~

- Indexing in ``MultiIndex`` beyond lex-sort depth is now supported, though
a lexically sorted index will perform better. (:issue:`2646`)

@@ -38,24 +40,30 @@ API changes
df2.index.lexsort_depth
df2.loc[(1,'z')]

- Bug in concat of Series with ``category`` dtype, which was coercing to ``object``. (:issue:`8641`)

- Bug in unique of Series with ``category`` dtype, which returned all categories regardless
of whether they were "used" or not (see :issue:`8559` for the discussion).
Previous behaviour was to return all categories:

- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters. ``Series.all``, ``Series.any``, ``Index.all``, and ``Index.any`` no longer support the ``out`` and ``keepdims`` parameters, which existed for compatibility with ndarray. Various index types no longer support the ``all`` and ``any`` aggregation functions and will now raise ``TypeError``. (:issue:`8302`):
.. code-block:: python

.. ipython:: python
In [3]: cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b', 'c'])

s = pd.Series([False, True, False], index=[0, 0, 1])
s.any(level=0)
In [4]: cat
Out[4]:
[a, b, a]
Categories (3, object): [a < b < c]

- ``Panel`` now supports the ``all`` and ``any`` aggregation functions. (:issue:`8302`):
In [5]: cat.unique()
Out[5]: array(['a', 'b', 'c'], dtype=object)

Now, only the categories that actually occur in the array are returned:

.. ipython:: python

p = pd.Panel(np.random.rand(2, 5, 4) > 0.1)
p.all()
cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b', 'c'])
cat.unique()

- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters. ``Series.all``, ``Series.any``, ``Index.all``, and ``Index.any`` no longer support the ``out`` and ``keepdims`` parameters, which existed for compatibility with ndarray. Various index types no longer support the ``all`` and ``any`` aggregation functions and will now raise ``TypeError``. (:issue:`8302`).

- Allow equality comparisons of Series with a categorical dtype and object dtype; previously these would raise ``TypeError`` (:issue:`8938`)

@@ -90,25 +98,70 @@ API changes

- ``Timestamp('now')`` is now equivalent to ``Timestamp.now()`` in that it returns the local time rather than UTC. Also, ``Timestamp('today')`` is now equivalent to ``Timestamp.today()`` and both have ``tz`` as a possible argument. (:issue:`9000`)

- Fix negative step support for label-based slices (:issue:`8753`)

Old behavior:

.. code-block:: python

In [1]: s = pd.Series(np.arange(3), ['a', 'b', 'c'])
Out[1]:
a 0
b 1
c 2
dtype: int64

In [2]: s.loc['c':'a':-1]
Out[2]:
c 2
dtype: int64

New behavior:

.. ipython:: python

s = pd.Series(np.arange(3), ['a', 'b', 'c'])
s.loc['c':'a':-1]


.. _whatsnew_0152.enhancements:

Enhancements
~~~~~~~~~~~~

``Categorical`` enhancements:

- Added ability to export Categorical data to Stata (:issue:`8633`). See :ref:`here <io.stata-categorical>` for limitations of categorical variables exported to Stata data files.
- Added flag ``order_categoricals`` to ``StataReader`` and ``read_stata`` to select whether to order imported categorical data (:issue:`8836`). See :ref:`here <io.stata-categorical>` for more information on importing categorical variables from Stata data files.
- Added ability to export and import Categorical data to/from HDF5 (:issue:`7621`). Queries work the same as if it were an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
- Added support for ``searchsorted()`` on ``Categorical`` class (:issue:`8420`).

Other enhancements:

- Added the ability to specify the SQL type of columns when writing a DataFrame
to a database (:issue:`8778`).
For example, specifying to use the sqlalchemy ``String`` type instead of the
default ``Text`` type for string columns:

.. code-block::
.. code-block:: python

from sqlalchemy.types import String
data.to_sql('data_dtype', engine, dtype={'Col_1': String})

- Added ability to export Categorical data to Stata (:issue:`8633`). See :ref:`here <io.stata-categorical>` for limitations of categorical variables exported to Stata data files.
- Added flag ``order_categoricals`` to ``StataReader`` and ``read_stata`` to select whether to order imported categorical data (:issue:`8836`). See :ref:`here <io.stata-categorical>` for more information on importing categorical variables from Stata data files.
- Added ability to export and import Categorical data to/from HDF5 (:issue:`7621`). Queries work the same as if it were an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
- Added support for ``searchsorted()`` on ``Categorical`` class (:issue:`8420`).
- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters (:issue:`8302`):

.. ipython:: python

s = pd.Series([False, True, False], index=[0, 0, 1])
s.any(level=0)

- ``Panel`` now supports the ``all`` and ``any`` aggregation functions. (:issue:`8302`):

.. ipython:: python

p = pd.Panel(np.random.rand(2, 5, 4) > 0.1)
p.all()

- Added support for ``utcfromtimestamp()``, ``fromtimestamp()``, and ``combine()`` on ``Timestamp`` class (:issue:`5351`).
- Added Google Analytics (``pandas.io.ga``) basic documentation (:issue:`8835`). See :ref:`here <remote_data.ga>`.
- ``Timedelta`` arithmetic returns ``NotImplemented`` in unknown cases, allowing extensions by custom classes (:issue:`8813`).
@@ -122,19 +175,22 @@ Enhancements
- Added ability to read table footers to read_html (:issue:`8552`)
- ``to_sql`` now infers datatypes of non-NA values for columns that contain NA values and have dtype ``object`` (:issue:`8778`).
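
As a rough illustration of the ``Timestamp`` constructors listed above, the following sketch assumes a recent pandas. The date, time, and epoch value are arbitrary examples, not taken from the release notes.

```python
import datetime

import pandas as pd

# combine() joins a date and a time into a single Timestamp,
# mirroring datetime.datetime.combine.
release_day = datetime.date(2014, 12, 12)
morning = datetime.time(9, 30)
combined = pd.Timestamp.combine(release_day, morning)

# fromtimestamp() converts POSIX seconds to a Timestamp in local time,
# so the printed wall-clock value depends on the machine's timezone.
from_epoch = pd.Timestamp.fromtimestamp(1418342400)

print(combined)
print(from_epoch)
```

Both calls return ``Timestamp`` objects rather than plain ``datetime.datetime`` instances, which keeps later pandas arithmetic consistent.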


.. _whatsnew_0152.performance:

Performance
~~~~~~~~~~~
- Reduce memory usage when skiprows is an integer in read_csv (:issue:`8681`)

- Reduce memory usage when skiprows is an integer in read_csv (:issue:`8681`)
- Performance boost for ``to_datetime`` conversions with a passed ``format=``, and the ``exact=False`` (:issue:`8904`)
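
For context, the ``to_datetime`` call pattern affected by :issue:`8904` looks roughly like the sketch below. It assumes a recent pandas and an English locale for the ``%b`` month abbreviation; the sample strings are illustrative. It shows the call, not the speed-up itself.

```python
import pandas as pd

# Strings where the date is embedded in surrounding text.
raw = pd.Series(["19MAY11", "foobar19MAY11", "19MAY11:00:00:00"])

# exact=False lets the format match anywhere in each string
# instead of requiring the whole string to match.
parsed = pd.to_datetime(raw, format="%d%b%y", exact=False)

print(parsed)
```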


.. _whatsnew_0152.bug_fixes:

Bug Fixes
~~~~~~~~~

- Bug in concat of Series with ``category`` dtype, which was coercing to ``object``. (:issue:`8641`)
- Bug in Timestamp-Timestamp not returning a Timedelta type and datelike-datelike ops with timezones (:issue:`8865`)
- Made consistent a timezone mismatch exception (either tz operated with None or incompatible timezone), will now return ``TypeError`` rather than ``ValueError`` (a couple of edge cases only), (:issue:`8865`)
- Bug in using a ``pd.Grouper(key=...)`` with no level/axis or level only (:issue:`8795`, :issue:`8866`)
@@ -154,95 +210,32 @@ Bug Fixes
- Bug in ``merge`` where ``how='left'`` and ``sort=False`` would not preserve left frame order (:issue:`7331`)
- Bug in ``MultiIndex.reindex`` where reindexing at level would not reorder labels (:issue:`4088`)
- Bug in certain operations with dateutil timezones, manifesting with dateutil 2.3 (:issue:`8639`)

- Fix negative step support for label-based slices (:issue:`8753`)

Old behavior:

.. code-block:: python

In [1]: s = pd.Series(np.arange(3), ['a', 'b', 'c'])
Out[1]:
a 0
b 1
c 2
dtype: int64

In [2]: s.loc['c':'a':-1]
Out[2]:
c 2
dtype: int64

New behavior:

.. ipython:: python

s = pd.Series(np.arange(3), ['a', 'b', 'c'])
s.loc['c':'a':-1]

- Regression in DatetimeIndex iteration with a Fixed/Local offset timezone (:issue:`8890`)
- Bug in ``to_datetime`` when parsing nanoseconds using the ``%f`` format (:issue:`8989`)
- ``io.data.Options`` now raises ``RemoteDataError`` when no expiry dates are available from Yahoo and when it receives no data from Yahoo (:issue:`8761`), (:issue:`8783`).
- Fix: The font size was only set on x axis if vertical or the y axis if horizontal. (:issue:`8765`)
- Fixed division by 0 when reading big csv files in python 3 (:issue:`8621`)
- Bug in outputting a ``MultiIndex`` with ``to_html(index=False)``, which would add an extra column (:issue:`8452`)







- Imported categorical variables from Stata files retain the ordinal information in the underlying data (:issue:`8836`).



- Defined ``.size`` attribute across ``NDFrame`` objects to provide compat with numpy >= 1.9.1; buggy with ``np.array_split`` (:issue:`8846`)


- Skip testing of histogram plots for matplotlib <= 1.2 (:issue:`8648`).






- Bug where ``get_data_google`` returned object dtypes (:issue:`3995`)

- Bug in ``DataFrame.stack(..., dropna=False)`` when the DataFrame's ``columns`` is a ``MultiIndex``
whose ``labels`` do not reference all its ``levels``. (:issue:`8844`)


- Bug in that Option context applied on ``__enter__`` (:issue:`8514`)


- Bug in resample that causes a ValueError when resampling across multiple days
and the last offset is not calculated from the start of the range (:issue:`8683`)



- Bug where ``DataFrame.plot(kind='scatter')`` fails when checking if an np.array is in the DataFrame (:issue:`8852`)



- Bug in ``pd.infer_freq/DataFrame.inferred_freq`` that prevented proper sub-daily frequency inference when the index contained DST days (:issue:`8772`).
- Bug where index name was still used when plotting a series with ``use_index=False`` (:issue:`8558`).
- Bugs when trying to stack multiple columns, when some (or all) of the level names are numbers (:issue:`8584`).
- Bug in ``MultiIndex`` where ``__contains__`` returns wrong result if index is not lexically sorted or unique (:issue:`7724`)
- BUG CSV: fix problem with trailing whitespace in skipped rows, (:issue:`8679`), (:issue:`8661`), (:issue:`8983`)
- Regression in ``Timestamp`` does not parse 'Z' zone designator for UTC (:issue:`8771`)






- Bug in ``StataWriter`` that writes strings with 244 characters irrespective of actual size (:issue:`8969`)


- Fixed ValueError raised by cummin/cummax when datetime64 Series contains NaT. (:issue:`8965`)
- Bug where Datareader returns object dtype if there are missing values (:issue:`8980`)
- Bug in plotting if sharex was enabled and index was a timeseries, would show labels on multiple axes (:issue:`3964`).

- Bug where passing a unit to the ``TimedeltaIndex`` constructor applied the nanosecond conversion twice (:issue:`9011`).
- Bug in plotting of a period-like array (:issue:`9012`)
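
Two of the fixes above are easy to check interactively. The sketch below assumes a recent pandas; the sample timestamp and index values are made up for illustration.

```python
import datetime

import pandas as pd

# The 'Z' zone designator is parsed as UTC (see :issue:`8771`).
ts = pd.Timestamp('2014-11-02T01:00:00Z')
is_utc = ts.utcoffset() == datetime.timedelta(0)

# Membership tests on a MultiIndex that is neither lexically sorted
# nor unique (see :issue:`7724`).
mi = pd.MultiIndex.from_tuples([(2, 'a'), (1, 'b'), (2, 'a')])
has_present = (1, 'b') in mi
has_missing = (3, 'c') in mi

print(is_utc, has_present, has_missing)
```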
