Description
Analogous to (#18808)
Adapted from tests.indexes.timedeltas.test_arithmetic.TestTimedeltaIndexArithmetic.test_timedelta_ops_with_missing_values
>>> import pandas as pd
>>> tdi = pd.TimedeltaIndex(['00:00:01'])
>>> ser = pd.Series(tdi)
>>> tdi + pd.NaT
DatetimeIndex(['NaT'], dtype='datetime64[ns]', freq=None)
>>> tdi - pd.NaT
DatetimeIndex(['NaT'], dtype='datetime64[ns]', freq=None)
>>> ser + pd.NaT
0   NaT
dtype: timedelta64[ns]
>>> ser - pd.NaT
0   NaT
dtype: timedelta64[ns]
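That is, Series[timedelta64] treats pd.NaT as a timedelta, while TimedeltaIndex treats pd.NaT as a datetime. The datetime interpretation makes sense for addition (timedelta + datetime is a datetime) but not for subtraction (timedelta - datetime is not a meaningful operation).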
Expected Behavior
Internal consistency in the sense that:
(tdi - pd.NaT).dtype == (ser - pd.NaT).dtype
(tdi + pd.NaT).dtype == (ser + pd.NaT).dtype
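The two conditions above can be checked with a small script. This is only a sketch: the helper name nat_op_dtypes is hypothetical (it is not part of pandas), and the dtypes it reports depend on the installed pandas version, so no particular output is claimed here.

```python
import pandas as pd


def nat_op_dtypes(values):
    """Collect the result dtypes of +/- pd.NaT for a TimedeltaIndex
    and for the Series built from it. (Hypothetical helper.)"""
    tdi = pd.TimedeltaIndex(values)
    ser = pd.Series(tdi)
    return {
        "index_add": (tdi + pd.NaT).dtype,
        "index_sub": (tdi - pd.NaT).dtype,
        "series_add": (ser + pd.NaT).dtype,
        "series_sub": (ser - pd.NaT).dtype,
    }


dtypes = nat_op_dtypes(['00:00:01'])

# The expected behavior is that both comparisons are True.
consistent_add = dtypes["index_add"] == dtypes["series_add"]
consistent_sub = dtypes["index_sub"] == dtypes["series_sub"]

print(dtypes)
print("addition consistent:", consistent_add)
print("subtraction consistent:", consistent_sub)
```

On a version showing the behavior reported above, both flags would print False, since the index results are datetime64 and the Series results are timedelta64.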
For tdi - pd.NaT it only makes sense for the result to be timedelta64, so the TimedeltaIndex behavior is wrong.
For tdi + pd.NaT a reasonable argument can be made for the result being either timedelta64 or datetime64. Returning timedelta64 maintains consistency with TimedeltaIndex.__sub__, while datetime64 maintains consistency with TimedeltaIndex.__radd__. All in all I think keeping addition commutative is the more important of the two.
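The commutativity concern can be probed in the same spirit. The snippet below is a rough sketch that compares the dtype of tdi + pd.NaT with that of pd.NaT + tdi; the variable names are illustrative, and which dtype comes back depends on the pandas version in use.

```python
import pandas as pd

tdi = pd.TimedeltaIndex(['00:00:01'])

# Forward addition: handled by TimedeltaIndex.__add__.
left = tdi + pd.NaT

# Reversed addition: Python tries NaT.__add__ first, which may defer
# to TimedeltaIndex.__radd__.
right = pd.NaT + tdi

# Addition is commutative with respect to dtype if these agree.
commutative = left.dtype == right.dtype

print(left.dtype, right.dtype, commutative)
```

If the two dtypes differ, then whichever choice is made for __add__, one of the two consistency arguments above is being given up.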