-
-
Notifications
You must be signed in to change notification settings - Fork 19.4k
CLN: Make ufunc works for Index #10638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
yep I think you could take them out. as |
|
No, we need this. Otherwise It's not documented, but |
|
ok let's add some tests and clean up the code eg to support the example above |
|
IIRC there is another issue where for example (#9966 with a PR #9974) : which is wrong, this needs to be created w/o slightly differently, e.g. so that the attributes get propogates AND the correct type gets inferred (e.g. this would return a maybe you want to supersede #9974 |
|
Thanks. I'll fix https://github.com/pydata/pandas/blob/master/pandas/core/index.py#L253 Am I correct that defining http://docs.scipy.org/doc/numpy/reference/arrays.classes.html |
|
This is correct -- array_wraps is what we need. On Tue, Jul 21, 2015 at 5:44 AM, Sinhrks notifications@github.com wrote:
|
|
Let me confirm 2 points:
|
|
@sinhrks both of those points look right |
|
Yes, agreed On Thu, Jul 23, 2015 at 4:14 AM, jreback notifications@github.com wrote:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this need to go in tseries/base, e.g. ufuncs are not valid for any datetimelike?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timedelta64 and datetime64 are real numpy dtypes, so ufuncs will already choke on them:
In [1]: import pandas as pd
In [2]: i = pd.date_range('2000-01-01', periods=5)
In [3]: import numpy as np
In [4]: np.square(i)
TypeError: ufunc 'square' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
In [5]: np.square(i.to_period())
Out[5]:
PeriodIndex(['330671-10-02', '330731-10-03', '330791-10-05', '330851-10-09',
'330911-10-16'],
dtype='int64', freq='D')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that tests for these other index types would be a good idea, though.
Also, I think we should probably prohibit all ufuncs on PeriodIndex, not just those that return float. For example, it's not valid to square periods, even though that returns integers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raising TypeError in __array_wrap__ affects to arithmetic using np.array. Currently, it works:
pd.PeriodIndex(['2011', '2012'], freq='A') + np.array([1, 2])
# PeriodIndex(['2012', '2014'], dtype='int64', freq='A-DEC')
So it can't be prohibited?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeriodIndex + integer ndarray should not be expected to work -- the array
does not have the right type!
On Mon, Aug 17, 2015 at 3:35 PM, Sinhrks notifications@github.com wrote:
In pandas/tseries/period.py
#10638 (comment):@@ -281,6 +281,16 @@ def contains(self, key):
return False
return key.ordinal in self._engine
- def array_wrap(self, result, context=None):
Raising TypeError in array_wrap affects to arithmetic using np.array.
Currently, it works:pd.PeriodIndex(['2011', '2012'], freq='A') + np.array([1, 2])
PeriodIndex(['2012', '2014'], dtype='int64', freq='A-DEC')
So it can't be prohibited?
—
Reply to this email directly or view it on GitHub
https://github.com/pydata/pandas/pull/10638/files#r37245206.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually I disagree
add integers to period works and makes sense as its a freq op
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because this is a valid freq shift, np.array should work as the same.
pd.PeriodIndex(['2011', '2012'], freq='A') + 1
# PeriodIndex(['2012', '2013'], dtype='int64', freq='A-DEC')
And #10744 has been done expecting arithmetic using np.array works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reconsidered this, and defining _add_ndarray for PeriodIndex may be an alternative. I think options are:
- Always raise
TypeErrorin__array_wraps__. Supportndarrayops in another methods (PeriodIndexmust be lhs). - Raise
TypeErrorif__array_wraps__gets non-integers. Someufuncwhich returnintoutputs meaningless results (likesquare)
NOTE: Maybe not intentional, PeriodIndex.shift works for ndarray. Thus we can use shift for ndarray ops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sinhrks remember that ATM PeriodIndex is an odd duck. Its a real int64 dtyped array, that happens to have boxing. So it allows any integer-like ops (as opposed to DatetimeIndex which prohibits a lot more because its M8[ns]).
So its needs handling to error on prohibited ops. So on on raising in __array_wraps__ for prohibited ops, but this might require some dispatching mechanism to the sub-class to determine what is allowed. E.g. we do it similarly like this in the _add_numeric_ops and like routines.
1050f88 to
2757884
Compare
|
@sinhrks how's this coming along? |
4ae3ff3 to
0455ccd
Compare
pandas/tseries/period.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be TypeError, but cannot be raised from here because numpy catches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shoyer you know any better way to override the ufunc calling scheme? (e.g. as __numpy_ufunc__ doesn't exist yet). or is this best way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this sort of thing is the best we can do for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok let's add a note that eventually we could replace this with numpy_ufunc
c511a09 to
27c7e8a
Compare
|
Updated and now green. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the different exception behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TypeError raised from __array_wrap__ seems to be catchrd by numpy and return unintended result.
|
this doesn't seeem to close #9966 ? |
27c7e8a to
6a15d78
Compare
|
@jreback Standard arithmetic is handled by |
fb824f0 to
e585781
Compare
|
Failed in |
|
@sinhrks hmm I think you need to set the |
e585781 to
bd7ba08
Compare
bd7ba08 to
e244bdd
Compare
CLN: Make ufunc works for Index
|
@sinhrks thanks! |
closes #9966, #9974 (PR)
I understand these are never used because
Indexis no longer the subclass ofnp.array. Correct?allalmost ufuncsCategoricalIndex: Needs to be done separately to fixCategorical, because number of categories can be changed.TypeErrororAttributeError, as ufuncs are performed to array of tuples.