Description
numpy has defined various protocol extensions over the years, and has consistently
named them with an array prefix, and "dunder" notation:
- __array__
- __array_prepare__, __array_wrap__ (Subclassing ndarray)
- __array_ufunc__ (NEP 13)
- __array_function__ (NEP 18)
pandas hasn't been as diligent. The one example I've been dealing with is _reduce. Originally an internal Series method (not a protocol), Series._reduce now dispatches to the array's own _reduce when the underlying data is a subclass of ExtensionArray.
Lines 3743 to 3745 in baa77c3
    elif isinstance(delegate, ExtensionArray):
        # dispatch to ExtensionArray interface
        return delegate._reduce(name, skipna=skipna, **kwds)
When reading the code for a new EA project, it's hard to pick out that _reduce is actually an override of a parent-class method that pandas calls, rather than just an internal function written by the EA author. A name like __pandas_reduce__ would have made this clearer.
Due to inexperience with EA, I lost a bit of time figuring out why s.sum() wasn't invoking the EA implementation I thought it would.
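To illustrate the readability problem, here is a minimal sketch of the dispatch described above. It does not use pandas: the ExtensionArray and Series classes below are toy stand-ins that only mimic the quoted dispatch lines, and CentsArray and its _valid helper are hypothetical names invented for the example.

```python
# Toy stand-ins for illustration only; these are NOT the pandas classes.
class ExtensionArray:
    def _reduce(self, name, skipna=True, **kwds):
        # Base-class hook: subclasses override this to support reductions.
        raise TypeError(f"cannot perform {name} with type {type(self).__name__}")


class Series:
    def __init__(self, values):
        self._values = values

    def _reduce(self, name, skipna=True, **kwds):
        delegate = self._values
        if isinstance(delegate, ExtensionArray):
            # dispatch to ExtensionArray interface
            return delegate._reduce(name, skipna=skipna, **kwds)
        raise NotImplementedError("only the ExtensionArray path is sketched here")

    def sum(self, skipna=True):
        return self._reduce("sum", skipna=skipna)


class CentsArray(ExtensionArray):  # hypothetical third-party EA
    def __init__(self, cents):
        self._cents = list(cents)

    def _valid(self):
        # A genuinely private helper written by the EA author.
        return [c for c in self._cents if c is not None]

    def _reduce(self, name, skipna=True, **kwds):
        # Named like a private helper, but this is the override Series calls.
        if name != "sum":
            return super()._reduce(name, skipna=skipna, **kwds)
        if not skipna and any(c is None for c in self._cents):
            return None
        return sum(self._valid())


total = Series(CentsArray([100, 250, None])).sum()
```

In CentsArray, nothing in the names distinguishes _valid (private to the author) from _reduce (an entry point called from Series), which is the confusion described above.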
Granted, it's not exactly a protocol. For the foreseeable future, the delegate will always be an ExtensionArray subclass rather than a duck-typed object. But for readability, and while EA is experimental, this might be the right time to clean it up.
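One possible shape for such a cleanup, sketched under assumptions: the dispatcher looks for the proposed __pandas_reduce__ first and falls back to _reduce with a deprecation warning. Nothing here is existing pandas API; dispatch_reduce, NewStyleArray, and OldStyleArray are hypothetical names used only for illustration.

```python
import warnings


def dispatch_reduce(array, name, skipna=True, **kwds):
    """Hypothetical dispatcher: prefer __pandas_reduce__, fall back to _reduce."""
    hook = getattr(type(array), "__pandas_reduce__", None)
    if hook is not None:
        return hook(array, name, skipna=skipna, **kwds)
    legacy = getattr(type(array), "_reduce", None)
    if legacy is not None:
        # Assumed transition policy: keep old arrays working, but warn.
        warnings.warn(
            "_reduce is deprecated as an override point; "
            "define __pandas_reduce__ instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return legacy(array, name, skipna=skipna, **kwds)
    raise TypeError(f"cannot perform {name} with type {type(array).__name__}")


class NewStyleArray:  # hypothetical EA using the proposed name
    def __init__(self, data):
        self.data = list(data)

    def __pandas_reduce__(self, name, skipna=True, **kwds):
        if name == "sum":
            return sum(self.data)
        raise TypeError(f"unsupported reduction: {name}")


class OldStyleArray:  # hypothetical EA still using the current name
    def __init__(self, data):
        self.data = list(data)

    def _reduce(self, name, skipna=True, **kwds):
        if name == "sum":
            return sum(self.data)
        raise TypeError(f"unsupported reduction: {name}")


new_total = dispatch_reduce(NewStyleArray([1, 2, 3]), "sum")
```

With the dunder name, a reader of NewStyleArray can tell at a glance that the method is a hook called by the library rather than an internal helper.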