-
-
Notifications
You must be signed in to change notification settings - Fork 19.4k
Description
pd.NaT is an instance of NaTType. NaTType inherits from _NaT, which inherits from _Timestamp, which inherits from datetime.
How would people feel about taking _Timestamp out of that chain? Note that _Timestamp is not exported, so no users should be checking e.g. isinstance(x, _Timestamp).
The motivation: if we can take _Timestamp out of the inheritance chain, then the entire NaT hierarchy along with the instance itself can be defined in its own dedicated module, with no intra-pandas dependencies other than util. This comes out to a little over 500 lines.
This opens up opportunities for simplification elsewhere, since there are a bunch of modules that import NaT but don't care about most of the rest of tslib (the same applies to the timezones functions in #17274). Of particular interest are _libs.lib, _libs.period, tseries.frequencies, and tseries.offsets.
The other upside is that with few dependencies, the module becomes much easier to test/benchmark/maintain.
Without _NaT inheriting from _Timestamp, a methods in _Timestamp become eligible for the @cython.final decorator. My (limited) understanding is that this provides a small performance improvement.
The main downside I see is that the docstring pinning done by _make_error_func, _make_nat_func, and _make_nan_func is a bit of a hassle.