Merge pull request numpy#6453 from shoyer/naive-datetime64
API: Make datetime64 timezone naive
charris committed Jan 17, 2016
2 parents 2f78277 + da98bbc commit 865c3e3
Showing 12 changed files with 423 additions and 475 deletions.
41 changes: 41 additions & 0 deletions doc/release/1.11.0-notes.rst
@@ -7,6 +7,8 @@ This release supports Python 2.6 - 2.7 and 3.2 - 3.5.
Highlights
==========

* The datetime64 type is now timezone naive. See "datetime64 changes" below
for more details.

Build System Changes
====================
@@ -31,6 +33,41 @@ Future Changes
Compatibility notes
===================
datetime64 changes
~~~~~~~~~~~~~~~~~~

In prior versions of NumPy the experimental datetime64 type always stored
times in UTC. By default, creating a datetime64 object from a string or
printing it would convert from or to local time::

    # old behavior
    >>> np.datetime64('2000-01-01T00:00:00')
    numpy.datetime64('2000-01-01T00:00:00-0800')  # note the timezone offset -08:00

A consensus of datetime64 users agreed that this behavior is undesirable
and at odds with how datetime64 is usually used (e.g., by pandas_). For
most use cases, a timezone naive datetime type is preferred, similar to the
``datetime.datetime`` type in the Python standard library. Accordingly,
datetime64 no longer assumes that input is in local time, nor does it print
local times::

    >>> np.datetime64('2000-01-01T00:00:00')
    numpy.datetime64('2000-01-01T00:00:00')
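
The new behavior can be checked with a short script. This is a minimal sketch, assuming a NumPy release that includes this change (1.11 or later); the variable names are illustrative only. A string without an offset is stored as written, regardless of the machine's local timezone:

```python
import numpy as np

stamp = np.datetime64('2000-01-01T00:00:00')

# The printed form carries no timezone offset.
text = str(stamp)
print(text)

# A naive datetime64 converts to a naive datetime.datetime (tzinfo is None).
as_py = stamp.astype('datetime64[s]').item()
print(as_py, as_py.tzinfo)
```

The conversion through ``item()`` is only there to show the parallel with ``datetime.datetime`` that the text draws.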

For backwards compatibility, datetime64 still parses timezone offsets, which
it handles by converting to UTC. However, the resulting datetime is timezone
naive::

    >>> np.datetime64('2000-01-01T00:00:00-08')
    DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
    numpy.datetime64('2000-01-01T08:00:00')
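
Since the offset-parsing path is deprecated, code that receives timezone aware values may prefer to do the UTC conversion explicitly before handing the value to NumPy. The following is a minimal sketch of one possible approach using the standard library ``datetime`` module; it is not taken from the release notes, and the variable names are illustrative:

```python
from datetime import datetime, timedelta, timezone

import numpy as np

# An aware datetime at UTC-08:00, matching the example above.
aware = datetime(2000, 1, 1, 0, 0, 0, tzinfo=timezone(timedelta(hours=-8)))

# Convert to UTC, then drop the tzinfo so the value is timezone naive.
naive_utc = aware.astimezone(timezone.utc).replace(tzinfo=None)

# A naive datetime.datetime is accepted without any timezone handling.
stamp = np.datetime64(naive_utc)
print(stamp)
```

The result names the same instant that the deprecated string form produces, 08:00 on 2000-01-01 in UTC, without relying on NumPy to interpret the offset.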

As a corollary to this change, we no longer prohibit casting between datetimes
with date units and datetimes with time units. With timezone naive datetimes,
the rule for casting from dates to times is no longer ambiguous.
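
The relaxed casting rule can be illustrated as follows. This is a sketch assuming NumPy 1.11 or later; the dtype strings ``M8[D]`` and ``M8[s]`` are ordinary NumPy shorthand for day and second resolution, and the variable names are illustrative:

```python
import numpy as np

# A date-only string can be parsed directly into a time unit;
# the time of day is taken to be midnight.
midnight = np.datetime64('2003-12-25', 's')
print(midnight)

# A date and the corresponding midnight timestamp compare equal.
same = np.datetime64('2003-12-25') == np.datetime64('2003-12-25T00:00')

day, sec = np.dtype('M8[D]'), np.dtype('M8[s]')

# 'safe' casting is only allowed toward more precise units.
day_to_sec_safe = np.can_cast(day, sec, casting='safe')
sec_to_day_safe = np.can_cast(sec, day, casting='safe')

# 'same_kind' casting is allowed between date units and time units.
sec_to_day_same_kind = np.can_cast(sec, day, casting='same_kind')

print(same, day_to_sec_safe, sec_to_day_safe, sec_to_day_same_kind)
```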

.. _pandas: http://pandas.pydata.org

DeprecationWarning to error
~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -163,6 +200,10 @@ extended to ``@``, ``numpy.dot``, ``numpy.inner``, and ``numpy.matmul``.

**Note:** Requires the transposed and non-transposed matrices to share data.

*np.testing.assert_warns* can now be used as a context manager
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This matches the behavior of ``assert_raises``.
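
A minimal sketch of the context-manager form follows. ``legacy_call`` is a hypothetical function invented for this illustration, not part of NumPy; only ``np.testing.assert_warns`` comes from the library:

```python
import warnings

import numpy as np


def legacy_call():
    # Hypothetical function that emits a warning when called.
    warnings.warn("legacy_call is deprecated", DeprecationWarning)
    return 42


# The block passes only if a DeprecationWarning is raised inside it.
with np.testing.assert_warns(DeprecationWarning):
    result = legacy_call()

print(result)
```

If the body of the ``with`` block does not emit the expected warning, an ``AssertionError`` is raised when the block exits.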

Changes
=======
Pyrex support was removed from ``numpy.distutils``. The method
62 changes: 37 additions & 25 deletions doc/source/reference/arrays.datetime.rst
@@ -45,16 +45,10 @@ some additional SI-prefix seconds-based units.
     >>> np.datetime64('2005-02', 'D')
     numpy.datetime64('2005-02-01')
 
-    Using UTC "Zulu" time:
-
-    >>> np.datetime64('2005-02-25T03:30Z')
-    numpy.datetime64('2005-02-24T21:30-0600')
-
-    ISO 8601 specifies to use the local time zone
-    if none is explicitly given:
+    From a date and time:
 
     >>> np.datetime64('2005-02-25T03:30')
-    numpy.datetime64('2005-02-25T03:30-0600')
+    numpy.datetime64('2005-02-25T03:30')
 
 When creating an array of datetimes from a string, it is still possible
 to automatically select the unit from the inputs, by using the
@@ -100,23 +94,6 @@ because the moment of time is still being represented exactly.
     >>> np.datetime64('2010-03-14T15Z') == np.datetime64('2010-03-14T15:00:00.00Z')
     True
 
-An important exception to this rule is between datetimes with
-:ref:`date units <arrays.dtypes.dateunits>` and datetimes with
-:ref:`time units <arrays.dtypes.timeunits>`. This is because this kind
-of conversion generally requires a choice of timezone and
-particular time of day on the given date.
-
-.. admonition:: Example
-
-    >>> np.datetime64('2003-12-25', 's')
-    Traceback (most recent call last):
-        File "<stdin>", line 1, in <module>
-    TypeError: Cannot parse "2003-12-25" as unit 's' using casting rule 'same_kind'
-
-    >>> np.datetime64('2003-12-25') == np.datetime64('2003-12-25T00Z')
-    False
-
-
 Datetime and Timedelta Arithmetic
 =================================
 
@@ -353,6 +330,41 @@ Some examples::
# any amount of whitespace is allowed; abbreviations are case-sensitive.
weekmask = "MonTue Wed Thu\tFri"

Changes with NumPy 1.11
=======================

In prior versions of NumPy, the datetime64 type always stored
times in UTC. By default, creating a datetime64 object from a string or
printing it would convert from or to local time::

    # old behavior
    >>> np.datetime64('2000-01-01T00:00:00')
    numpy.datetime64('2000-01-01T00:00:00-0800')  # note the timezone offset -08:00

A consensus of datetime64 users agreed that this behavior is undesirable
and at odds with how datetime64 is usually used (e.g., by pandas_). For
most use cases, a timezone naive datetime type is preferred, similar to the
``datetime.datetime`` type in the Python standard library. Accordingly,
datetime64 no longer assumes that input is in local time, nor does it print
local times::

    >>> np.datetime64('2000-01-01T00:00:00')
    numpy.datetime64('2000-01-01T00:00:00')

For backwards compatibility, datetime64 still parses timezone offsets, which
it handles by converting to UTC. However, the resulting datetime is timezone
naive::

    >>> np.datetime64('2000-01-01T00:00:00-08')
    DeprecationWarning: parsing timezone aware datetimes is deprecated; this will raise an error in the future
    numpy.datetime64('2000-01-01T08:00:00')

As a corollary to this change, we no longer prohibit casting between datetimes
with date units and datetimes with time units. With timezone naive datetimes,
the rule for casting from dates to times is no longer ambiguous.

.. _pandas: http://pandas.pydata.org

Differences Between 1.6 and 1.7 Datetimes
=========================================

14 changes: 4 additions & 10 deletions numpy/core/arrayprint.py
@@ -708,25 +708,19 @@ def __call__(self, x):
             i = i + 'j'
         return r + i
 
+
 class DatetimeFormat(object):
-    def __init__(self, x, unit=None,
-                 timezone=None, casting='same_kind'):
+    def __init__(self, x, unit=None, timezone=None, casting='same_kind'):
         # Get the unit from the dtype
         if unit is None:
             if x.dtype.kind == 'M':
                 unit = datetime_data(x.dtype)[0]
             else:
                 unit = 's'
 
-        # If timezone is default, make it 'local' or 'UTC' based on the unit
         if timezone is None:
-            # Date units -> UTC, time units -> local
-            if unit in ('Y', 'M', 'W', 'D'):
-                self.timezone = 'UTC'
-            else:
-                self.timezone = 'local'
-        else:
-            self.timezone = timezone
+            timezone = 'naive'
+        self.timezone = timezone
         self.unit = unit
         self.casting = casting

32 changes: 16 additions & 16 deletions numpy/core/src/multiarray/datetime.c
@@ -1316,9 +1316,6 @@ datetime_metadata_divides(
 
 /*
  * This provides the casting rules for the DATETIME data type units.
- *
- * Notably, there is a barrier between 'date units' and 'time units'
- * for all but 'unsafe' casting.
  */
 NPY_NO_EXPORT npy_bool
 can_cast_datetime64_units(NPY_DATETIMEUNIT src_unit,
@@ -1331,31 +1328,26 @@ can_cast_datetime64_units(NPY_DATETIMEUNIT src_unit,
             return 1;
 
         /*
-         * Only enforce the 'date units' vs 'time units' barrier with
-         * 'same_kind' casting.
+         * Can cast between all units with 'same_kind' casting.
          */
         case NPY_SAME_KIND_CASTING:
             if (src_unit == NPY_FR_GENERIC || dst_unit == NPY_FR_GENERIC) {
                 return src_unit == NPY_FR_GENERIC;
             }
             else {
-                return (src_unit <= NPY_FR_D && dst_unit <= NPY_FR_D) ||
-                       (src_unit > NPY_FR_D && dst_unit > NPY_FR_D);
+                return 1;
             }
 
         /*
-         * Enforce the 'date units' vs 'time units' barrier and that
-         * casting is only allowed towards more precise units with
-         * 'safe' casting.
+         * Casting is only allowed towards more precise units with 'safe'
+         * casting.
          */
         case NPY_SAFE_CASTING:
             if (src_unit == NPY_FR_GENERIC || dst_unit == NPY_FR_GENERIC) {
                 return src_unit == NPY_FR_GENERIC;
             }
             else {
-                return (src_unit <= dst_unit) &&
-                       ((src_unit <= NPY_FR_D && dst_unit <= NPY_FR_D) ||
-                        (src_unit > NPY_FR_D && dst_unit > NPY_FR_D));
+                return (src_unit <= dst_unit);
             }
 
         /* Enforce equality with 'no' or 'equiv' casting */
@@ -2254,6 +2246,14 @@ convert_pydatetime_to_datetimestruct(PyObject *obj, npy_datetimestruct *out,
         PyObject *offset;
         int seconds_offset, minutes_offset;
 
+        /* 2016-01-14, 1.11 */
+        PyErr_Clear();
+        if (DEPRECATE(
+                "parsing timezone aware datetimes is deprecated; "
+                "this will raise an error in the future") < 0) {
+            return -1;
+        }
+
         /* The utcoffset function should return a timedelta */
         offset = PyObject_CallMethod(tmp, "utcoffset", "O", obj);
         if (offset == NULL) {
@@ -2386,7 +2386,7 @@ convert_pyobject_to_datetime(PyArray_DatetimeMetaData *meta, PyObject *obj,
 
         /* Parse the ISO date */
         if (parse_iso_8601_datetime(str, len, meta->base, casting,
-                                &dts, NULL, &bestunit, NULL) < 0) {
+                                &dts, &bestunit, NULL) < 0) {
             Py_DECREF(bytes);
             return -1;
         }
@@ -3500,7 +3500,7 @@ find_string_array_datetime64_type(PyArrayObject *arr,
 
                 tmp_meta.base = -1;
                 if (parse_iso_8601_datetime(tmp_buffer, maxlen, -1,
-                                    NPY_UNSAFE_CASTING, &dts, NULL,
+                                    NPY_UNSAFE_CASTING, &dts,
                                     &tmp_meta.base, NULL) < 0) {
                     goto fail;
                 }
@@ -3509,7 +3509,7 @@ find_string_array_datetime64_type(PyArrayObject *arr,
             else {
                 tmp_meta.base = -1;
                 if (parse_iso_8601_datetime(data, tmp - data, -1,
-                                    NPY_UNSAFE_CASTING, &dts, NULL,
+                                    NPY_UNSAFE_CASTING, &dts,
                                     &tmp_meta.base, NULL) < 0) {
                     goto fail;
                 }