
[data] Project Operator for select_columns#48635

Merged
richardliaw merged 1 commit into ray-project:master from richardliaw:project-api on Nov 8, 2024

@richardliaw
Contributor

Why are these changes needed?

Add Project operator to select_columns.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Comment on lines +90 to +94
    def fn(batch: "pa.Table") -> "pa.Table":
        try:
            return batch.select(columns)
        except Exception as e:
            _handle_debugger_exception(e)

Contributor

This will break for Pandas, right?

We'd need to add support for it here (until we drop it completely as an internal block format).

Contributor Author

@richardliaw richardliaw Nov 7, 2024

No, it won't fail. The batch format passed into this function is coerced to pyarrow earlier in the same change: https://github.com/ray-project/ray/pull/48635/files#diff-7aaeab5974c738dac5f137b1ba54dbb32d9fac4cfa200fe7e429ec1294863e72R251

Contributor Author

@richardliaw richardliaw Nov 7, 2024

What's the code sample that you think may fail?

Member

This is a batch format, so it shouldn't matter what the internal block format is, right?
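
To illustrate the point made in this thread, here is a minimal, library-free sketch of why coercing the batch before the projection makes the projection independent of the underlying block layout. It is not Ray's implementation: `ColumnarBatch`, `RowBlock`, `to_columnar`, `select_columns`, and `project` are hypothetical names, a dict of lists stands in for a pyarrow table, and a list of row dicts stands in for any other block format. The sketch also assumes every row carries the same keys.

```python
from typing import Dict, List, Sequence, Union

# Hypothetical stand-in for a columnar batch (e.g. a pyarrow table):
# column name -> list of values.
ColumnarBatch = Dict[str, List[object]]
# Hypothetical stand-in for some other block layout: a list of row dicts.
RowBlock = List[Dict[str, object]]


def to_columnar(block: RowBlock) -> ColumnarBatch:
    """Coerce a row-oriented block into the columnar batch format.

    Assumes every row has the same keys; ragged rows are not handled.
    """
    columns: ColumnarBatch = {}
    for row in block:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def select_columns(batch: ColumnarBatch, columns: Sequence[str]) -> ColumnarBatch:
    """Project a columnar batch onto `columns`, in the requested order."""
    missing = [name for name in columns if name not in batch]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return {name: batch[name] for name in columns}


def project(block: Union[ColumnarBatch, RowBlock], columns: Sequence[str]) -> ColumnarBatch:
    """Coerce first, then project.

    The projection only ever sees the columnar format, so the original
    layout of `block` does not affect it.
    """
    batch = block if isinstance(block, dict) else to_columnar(block)
    return select_columns(batch, columns)
```

Calling `project` with either layout goes through the same `select_columns` path, which mirrors the argument above: once the batch format is fixed, the internal block format should not matter to the projection function.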

@aslonnie aslonnie added the "go" label (add ONLY when ready to merge, run all tests) on Nov 8, 2024
@richardliaw richardliaw merged commit 31b8b41 into ray-project:master Nov 8, 2024
JP-sDEV pushed a commit to JP-sDEV/ray that referenced this pull request Nov 14, 2024
Labels

community-backlog, go (add ONLY when ready to merge, run all tests)

5 participants