TimeAndDims does not implement equals/hashcode #2692
navis wants to merge 1 commit into apache:master
|
👍 |
Force-pushed from 8140683 to f7bf058
|
Rebased. This PR adds equals/hashCode for TimeAndDims; the other changes are just refactorings.
Why does this need to extend IncrementalIndices? It seems to contain only static members that IncrementalIndex needs to reference.
This seems odd to me too. It would make more sense to me if things were referenced statically, like IncrementalIndices.STRING_TRANSFORMER.
|
@navis any chance of addressing these comments? |
|
I am also in favor of removing the inheritance between IncrementalIndex and IncrementalIndices. 👍 aside from that |
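The suggestion in these comments, referencing the shared members statically instead of inheriting from the holder class, could look roughly like the sketch below. IndexConstants and IndexSketch are hypothetical stand-ins for IncrementalIndices and IncrementalIndex, and the type of STRING_TRANSFORMER is an assumption made only for illustration.

```java
import java.util.function.Function;

// Hypothetical stand-in for IncrementalIndices: a holder for static members only.
final class IndexConstants {
  private IndexConstants() {
    // Not instantiable and not meant to be extended.
  }

  // Assumed shape of the transformer; the real field's type may differ.
  static final Function<Object, String> STRING_TRANSFORMER =
      o -> o == null ? null : String.valueOf(o);
}

// Hypothetical stand-in for IncrementalIndex: no "extends", just a static reference.
class IndexSketch {
  String transform(Object value) {
    return IndexConstants.STRING_TRANSFORMER.apply(value);
  }
}
```

With this layout the index class keeps its own inheritance hierarchy free, and every use of a shared member shows where it comes from.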
Force-pushed from f7bf058 to bf2c93d
|
@fjy done |
Two TimeAndDims might have dims of different lengths, so I think this could cause an out-of-bounds array access on that.dims[i]. This also doesn't check types at all, whereas the Comparator for sorted facts does.
The equals impl could simply return dimsComparator.compare(this, that) == 0, although that does some needless name lookups.
I think the type comparison in TimeAndDimsComp is actually unnecessary (the dims will have the same types up to numComparisons since the same dimension ordering is maintained), I shouldn't have added that in.
I guess the current equals/hashcode() in this PR doesn't trigger an exception since it's being used only by "no sort" IncrementalIndex during GroupBy merging where the dimension count will be the same across rows?
I think it'd be fine to reuse TimeAndDimsComp as @gianm suggested or to add a similar dims length check to equals(), if you feel like avoiding the name lookups.
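A minimal sketch of the second option, an equals() with an explicit dims length check, is shown below. TimeAndDimsSketch is a hypothetical, simplified stand-in: the String[][] type for dims is an assumption, and the real TimeAndDims fields may differ.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for TimeAndDims.
// dims[i] holds the values of the i-th dimension (assumed to be String[]).
final class TimeAndDimsSketch {
  private final long timestamp;
  private final String[][] dims;

  TimeAndDimsSketch(long timestamp, String[][] dims) {
    this.timestamp = timestamp;
    this.dims = dims;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    TimeAndDimsSketch that = (TimeAndDimsSketch) o;
    if (timestamp != that.timestamp) {
      return false;
    }
    // Compare lengths first so that.dims[i] can never go out of bounds.
    if (dims.length != that.dims.length) {
      return false;
    }
    for (int i = 0; i < dims.length; i++) {
      if (!Arrays.equals(dims[i], that.dims[i])) {
        return false;
      }
    }
    return true;
  }

  @Override
  public int hashCode() {
    // deepHashCode hashes each inner array by content, matching equals().
    return 31 * Long.hashCode(timestamp) + Arrays.deepHashCode(dims);
  }
}
```

This avoids the name lookups of the comparator-based approach, at the cost of keeping a second definition of equality that has to stay in step with TimeAndDimsComp.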
|
Let me delay my previous 👍 until the comment re: comparator/equals() is addressed |
Force-pushed from bf2c93d to d8f7228
|
    if (timestamp != that.timestamp) {
      return false;
    }
Should this consider two TimeAndDims equal if their dims have different length, but the longer one is all nulls? That could happen if a dimension was added later on, but not all rows actually have that dimension. Those two rows should roll up together (I think IndexMerger actually will do this but maybe it'd be nice to do it here too).
I believe currently, the behavior you have here is equivalent to what TimeAndDimsComp does. So if we want to change that here then we should probably change it in the TimeAndDimsComp too.
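For illustration only, the alternative raised in this comment (treating a row with trailing all-null dimensions as equal to the shorter row) could be sketched as follows. NullPaddedKey is a hypothetical class, not the PR's code, and it assumes a missing dimension is represented by a null entry in dims.

```java
import java.util.Arrays;

// Hypothetical key that ignores trailing null dimensions when comparing rows.
final class NullPaddedKey {
  private final long timestamp;
  private final String[][] dims;

  NullPaddedKey(long timestamp, String[][] dims) {
    this.timestamp = timestamp;
    this.dims = dims;
  }

  // Length of dims once trailing null entries are dropped.
  private static int effectiveLength(String[][] dims) {
    int n = dims.length;
    while (n > 0 && dims[n - 1] == null) {
      n--;
    }
    return n;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    NullPaddedKey that = (NullPaddedKey) o;
    if (timestamp != that.timestamp) {
      return false;
    }
    int n = effectiveLength(dims);
    if (n != effectiveLength(that.dims)) {
      return false;
    }
    for (int i = 0; i < n; i++) {
      if (!Arrays.equals(dims[i], that.dims[i])) {
        return false;
      }
    }
    return true;
  }

  @Override
  public int hashCode() {
    // Hash only the effective prefix so padded and unpadded rows collide.
    int h = Long.hashCode(timestamp);
    int n = effectiveLength(dims);
    for (int i = 0; i < n; i++) {
      h = 31 * h + Arrays.hashCode(dims[i]);
    }
    return h;
  }
}
```

As the comment notes, adopting this would only be consistent if TimeAndDimsComp applied the same rule, since sorted and unsorted facts would otherwise roll up differently.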
|
@navis could you please move the refactoring to a different PR if it is not necessary for the bug fix? |
|
@navis could you please adjust this PR to fix merge conflicts & just include the bug fix? |
Adapted from apache#2692, thanks @navis for the original implementation.
While testing #2670, I found that #2571 expects TimeAndDims to be comparable (equals/hashCode), but it is not. This apparently produces invalid results, but there are no proper test cases covering it.
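The failure mode can be sketched as follows, assuming rows are rolled up in a hash-based map keyed by timestamp and dimensions. IdentityKey, ValueKey and RollupDemo are hypothetical names used only for this illustration and are not Druid classes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Key without equals/hashCode: falls back to Object identity.
final class IdentityKey {
  final long timestamp;
  final String[] dims;

  IdentityKey(long timestamp, String[] dims) {
    this.timestamp = timestamp;
    this.dims = dims;
  }
}

// Same fields, but with value-based equals/hashCode.
final class ValueKey {
  final long timestamp;
  final String[] dims;

  ValueKey(long timestamp, String[] dims) {
    this.timestamp = timestamp;
    this.dims = dims;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ValueKey)) {
      return false;
    }
    ValueKey that = (ValueKey) o;
    return timestamp == that.timestamp && Arrays.equals(dims, that.dims);
  }

  @Override
  public int hashCode() {
    return 31 * Long.hashCode(timestamp) + Arrays.hashCode(dims);
  }
}

final class RollupDemo {
  // Two identical rows stay separate because the keys are never equal.
  static int entriesWithoutEquals() {
    Map<IdentityKey, Long> facts = new HashMap<>();
    facts.merge(new IdentityKey(1L, new String[]{"a"}), 1L, Long::sum);
    facts.merge(new IdentityKey(1L, new String[]{"a"}), 1L, Long::sum);
    return facts.size();
  }

  // Two identical rows roll up into a single entry.
  static int entriesWithEquals() {
    Map<ValueKey, Long> facts = new HashMap<>();
    facts.merge(new ValueKey(1L, new String[]{"a"}), 1L, Long::sum);
    facts.merge(new ValueKey(1L, new String[]{"a"}), 1L, Long::sum);
    return facts.size();
  }
}
```

In this sketch the identity-based key leaves two entries for what should be one rolled-up row, which is the kind of invalid result the description refers to.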