fix: support system columns in dataset.take* operations #5722
jackye1995 merged 17 commits into lance-format:main from
Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
ACTION NEEDED: The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error, please inspect the "PR Title Check" action.
Looks like there are quite a few CI failures, could you fix those?
westonpace left a comment
This looks good but can you add a few tests? Preferably python tests.
westonpace left a comment
Awesome. Thanks for doing this. I'll merge when green.
I've added #5823 which describes some of the stuff we talked about externally.
Looks like some legitimate test failures in the new python tests.
…#5722)

Previously, "take*" operations did not support `_rowid`, `_rowoffset`, `_row_created_at_version`, and `_row_last_updated_at_version`. In this PR we add support for all of these columns.

We preserve these system columns through the initial schema projection so that they can be used to populate the correct flags when building the `ProjectionPlan` and `PhysicalProjection` structs.

- `_rowid` / `_rowaddr`: persisting these through to `ProjectionPlan` fields was enough to make them work
- `_rowoffset`: required additionally (1) stripping `ROW_OFFSET` field from `ProjectionPlan` `requested_output_expr` and (2) manually injecting column using `AddRowOffsetExec` (after exposing some methods publicly)
- `_row_created_at_version` / `_row_last_updated_at_version`: required piping through flags to `Fragment` readers.

Closes lance-format#5615.

---------

Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
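To make the description above more concrete, here is a toy Python sketch of the general idea: system columns survive the initial projection, most of them become flags handed to the reader, and `_rowoffset` is stripped from the reader's output and injected afterwards. This is not Lance's implementation. `PlanSketch`, `build_plan`, `read_fragment`, and `take` are hypothetical names invented for illustration, and the in-memory `rows` list only stands in for a single fragment.

```python
from dataclasses import dataclass, field

ROW_ID = "_rowid"
ROW_ADDR = "_rowaddr"
ROW_OFFSET = "_rowoffset"
CREATED = "_row_created_at_version"
UPDATED = "_row_last_updated_at_version"
SYSTEM_COLUMNS = {ROW_ID, ROW_ADDR, ROW_OFFSET, CREATED, UPDATED}


@dataclass
class PlanSketch:
    """Hypothetical stand-in for a projection plan; not Lance's real struct."""
    data_columns: list = field(default_factory=list)
    reader_flags: set = field(default_factory=set)  # system columns the reader emits
    inject_row_offset: bool = False                 # added after the read
    output_order: list = field(default_factory=list)


def build_plan(requested):
    """Split requested columns into data columns, reader flags, and _rowoffset."""
    plan = PlanSketch(output_order=list(requested))
    for name in requested:
        if name == ROW_OFFSET:
            # Kept out of the reader's output; injected in a later step.
            plan.inject_row_offset = True
        elif name in SYSTEM_COLUMNS:
            plan.reader_flags.add(name)
        else:
            plan.data_columns.append(name)
    return plan


def read_fragment(rows, offsets, plan):
    """Toy reader: returns data columns plus any flagged system columns."""
    batch = []
    for off in offsets:
        src = rows[off]
        row = {c: src[c] for c in plan.data_columns}
        for flag in plan.reader_flags:
            row[flag] = src[flag]
        batch.append(row)
    return batch


def take(rows, offsets, requested):
    plan = build_plan(requested)
    batch = read_fragment(rows, offsets, plan)
    if plan.inject_row_offset:
        # Assumption of this toy model: the offset is the position in `rows`.
        for row, off in zip(batch, offsets):
            row[ROW_OFFSET] = off
    # Restore the caller's requested column order.
    return [{c: row[c] for c in plan.output_order} for row in batch]


# Made-up sample data standing in for one fragment.
rows = [
    {"x": 10, ROW_ID: 100, ROW_ADDR: 0, CREATED: 1, UPDATED: 1},
    {"x": 20, ROW_ID: 101, ROW_ADDR: 1, CREATED: 1, UPDATED: 2},
    {"x": 30, ROW_ID: 102, ROW_ADDR: 2, CREATED: 2, UPDATED: 3},
]
requested = ["x", ROW_ID, ROW_OFFSET, UPDATED]
result = take(rows, [2, 0], requested)
```

The sketch treats `_rowoffset` separately from the other system columns only to mirror the two-step handling described in the PR; in a real dataset the system values would come from fragment metadata rather than being stored next to the data.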