API: A tuple passed to DataFrame.sort_index will be interpreted as the levels of the index, rather than requiring a list of tuples (GH4370)
jreback committed Mar 14, 2014
1 parent 03743af commit 9a15512
Showing 4 changed files with 37 additions and 6 deletions.
15 changes: 12 additions & 3 deletions doc/source/basics.rst
@@ -1287,9 +1287,18 @@ Some other sorting notes / nuances:
* ``Series.sort`` sorts a Series by value in-place. This is to provide
compatibility with NumPy methods which expect the ``ndarray.sort``
behavior.
* ``DataFrame.sort`` takes a ``column`` argument instead of ``by``. This
method will likely be deprecated in a future release in favor of just using
``sort_index``.

Sorting by a multi-index column
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the columns are a multi-index, you must be explicit about sorting and
pass a tuple that fully specifies all levels to ``by``.

.. ipython:: python
df1.columns = MultiIndex.from_tuples([('a','one'),('a','two'),('b','three')])
df1.sort_index(by=('a','two'))
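
To illustrate why every level has to be given, here is a small pure-Python sketch. It is a toy model, not pandas itself: multi-index columns are represented as a dict keyed by tuples, and the helpers ``select_column`` and ``sort_rows_by`` as well as the sample data are hypothetical. A partial key such as ``'a'`` matches more than one column, so there is no single set of values to sort on.

```python
# Toy model of a frame with multi-index columns: each column is keyed by a
# full tuple of levels. Illustration only, not pandas internals.
columns = {
    ('a', 'one'): [3, 1, 2],
    ('a', 'two'): [9, 7, 8],
    ('b', 'three'): [5, 6, 4],
}


def select_column(columns, key):
    """Return the values of the single column matching `key` (hypothetical)."""
    if key in columns:
        return columns[key]
    # A partial key (e.g. 'a') is matched against the first level only.
    matches = [k for k in columns if k[0] == key]
    if len(matches) > 1:
        raise ValueError('Cannot sort by column %s: it matches %d columns; '
                         'provide all the levels' % (key, len(matches)))
    if not matches:
        raise KeyError(key)
    return columns[matches[0]]


def sort_rows_by(columns, key):
    """Return row positions ordered by the column selected with `key`."""
    values = select_column(columns, key)
    return sorted(range(len(values)), key=lambda i: values[i])


# Fully specified key: exactly one column is selected, so sorting is unambiguous.
order = sort_rows_by(columns, ('a', 'two'))
```

Calling ``sort_rows_by(columns, 'a')`` in this sketch raises ``ValueError``, because both ``('a', 'one')`` and ``('a', 'two')`` match the partial key.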
Copying
-------
3 changes: 3 additions & 0 deletions doc/source/release.rst
@@ -125,6 +125,9 @@ API Changes
``DataFrame.stack`` operations where the name of the column index is used as
the name of the inserted column containing the pivoted data.

- A tuple passed to ``DataFrame.sort_index`` will be interpreted as the levels of
the index, rather than requiring a list of tuples (:issue:`4370`)

Experimental Features
~~~~~~~~~~~~~~~~~~~~~

14 changes: 11 additions & 3 deletions pandas/core/frame.py
@@ -2574,8 +2574,9 @@ def sort_index(self, axis=0, by=None, ascending=True, inplace=False,
axis : {0, 1}
Sort index/rows versus columns
by : object
Column name(s) in frame. Accepts a column name or a list or tuple
for a nested sort.
Column name(s) in frame. Accepts a column name or a list
for a nested sort. A tuple will be interpreted as the
levels of a multi-index.
ascending : boolean or list, default True
Sort ascending vs. descending. Specify list for multiple sort
orders
@@ -2602,7 +2603,7 @@ def sort_index(self, axis=0, by=None, ascending=True, inplace=False,
if axis != 0:
raise ValueError('When sorting by column, axis must be 0 '
'(rows)')
if not isinstance(by, (tuple, list)):
if not isinstance(by, list):
by = [by]
if com._is_sequence(ascending) and len(by) != len(ascending):
raise ValueError('Length of ascending (%d) != length of by'
@@ -2629,6 +2630,13 @@ def trans(v):
by = by[0]
k = self[by].values
if k.ndim == 2:

# try to be helpful
if isinstance(self.columns, MultiIndex):
raise ValueError('Cannot sort by column %s in a multi-index'
' you need to explicitly provide all the levels'
% str(by))

raise ValueError('Cannot sort by duplicate column %s'
% str(by))
if isinstance(ascending, (tuple, list)):
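
The effect of the ``isinstance`` change can be seen in isolation with a small sketch. The two helper names below are hypothetical and only mirror the normalization step shown in the diff: before the change a tuple passed through unchanged and was iterated as several column keys; afterwards it is wrapped in a list and treated as one key.

```python
def normalize_by_old(by):
    # Previous check: tuples and lists were both treated as "several keys".
    if not isinstance(by, (tuple, list)):
        by = [by]
    return list(by)


def normalize_by_new(by):
    # New check: only a list means "several keys"; a tuple is a single key
    # naming the levels of one multi-index column.
    if not isinstance(by, list):
        by = [by]
    return by


old_keys = normalize_by_old(('a', 'two'))    # tuple split into two keys
new_keys = normalize_by_new(('a', 'two'))    # tuple kept as one key
nested_keys = normalize_by_new([('a', 'one'), ('a', 'two')])  # list of tuples
```

Under this reading, a nested sort over several multi-index columns is still written as a list of tuples, while a bare tuple names exactly one column.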
11 changes: 11 additions & 0 deletions pandas/tests/test_frame.py
@@ -9797,6 +9797,17 @@ def test_sort_index_duplicates(self):
# multi-column 'by' is separate codepath
df.sort_index(by=['a', 'b'])

# with multi-index
# GH4370
df = DataFrame(np.random.randn(4,2),columns=MultiIndex.from_tuples([('a',0),('a',1)]))
with assertRaisesRegexp(ValueError, 'levels'):
df.sort_index(by='a')

# convert tuples to a list of tuples
expected = df.sort_index(by=[('a',1)])
result = df.sort_index(by=('a',1))
assert_frame_equal(result, expected)

def test_sort_datetimes(self):

# GH 3461, argsort / lexsort differences for a datetime column