While working through #244, a number of inconsistencies became apparent between the subclasses of the base `AbstractDataSource` class. The most noticeable are:
- the `index_dimension` and `value_dimension` are not provided by all subclasses, and even then are not used consistently. `ImageData` lacks them and uses a single `dimension` trait instead, and `MultiArrayDataSource` instead uses them to indicate whether the data is row or column based.
- the `get_data` methods do not have a consistent signature. The right approach here is perhaps to have some further specialization of abstract data source subclasses.
- the behaviour of bounds with respect to `nan` values is inconsistent.
- some methods (particularly mask methods) are not implemented by all classes. These methods should probably assume an appropriate mask of all `True` values.
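
As a rough illustration of the last two points, here is a minimal sketch of what shared defaults could look like. It is not Chaco's actual `AbstractDataSource`: `SketchDataSource` and `ListDataSource` are hypothetical names, the method names are only assumed to mirror the existing ones, plain Python lists stand in for arrays to keep the example dependency-free, and ignoring `nan` in the bounds (with `(0.0, 0.0)` for empty data) is just one possible convention.

```python
import math
from abc import ABC, abstractmethod


class SketchDataSource(ABC):
    """Hypothetical base class, NOT Chaco's AbstractDataSource."""

    @abstractmethod
    def get_data(self):
        """Return the data as a flat sequence of floats."""

    def is_masked(self):
        # Default: a source without mask support reports no masking.
        return False

    def get_data_mask(self):
        # Default: an all-True mask, so callers never need to
        # special-case a source that has no mask of its own.
        data = self.get_data()
        return data, [True] * len(data)

    def get_bounds(self):
        # Assumed convention: skip masked-out and nan values, and
        # return (0.0, 0.0) when nothing usable remains.
        data, mask = self.get_data_mask()
        values = [v for v, keep in zip(data, mask)
                  if keep and not math.isnan(v)]
        if not values:
            return (0.0, 0.0)
        return (min(values), max(values))


class ListDataSource(SketchDataSource):
    """Hypothetical concrete source that only implements get_data."""

    def __init__(self, values):
        self._values = [float(v) for v in values]

    def get_data(self):
        return self._values
```

With defaults like these, a subclass only has to supply `get_data`, and the mask and bounds behaviour is the same everywhere unless a subclass deliberately overrides it.
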

The end result of this is that it is very unclear how to define an abstract data source subclass for something like a Pandas dataframe that can be used interchangeably with the standard array-based data sources.