[Core][nightly-test] better way of calculating num features by scv119 · Pull Request #22158 · ray-project/ray

scv119 · 2022-02-07T02:22:43Z

Why are these changes needed?

Previously, we used num_columns - 2 to calculate the feature size, assuming that both the "label" column and the Arrow internal column __index_level_0__ exist in the schema.

However, __index_level_0__ might not exist in the dataset's schema, in which case num_columns - 2 undercounts the features by one and the check fails.
In this PR, we instead filter out the Arrow internal column and the label column by name and count the columns that remain.
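The idea of filtering by name rather than subtracting a fixed count can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the helper names `feature_columns` and `num_features` are hypothetical, and matching internal columns by the `__index_level_` prefix is an assumption based on the `__index_level_0__` name mentioned above.

```python
from typing import Iterable, List

LABEL_COLUMN = "label"
# Assumption: Arrow internal index columns are named like "__index_level_0__".
INTERNAL_COLUMN_PREFIX = "__index_level_"


def feature_columns(
    column_names: Iterable[str], label_column: str = LABEL_COLUMN
) -> List[str]:
    """Return the columns that are neither the label nor an internal index column."""
    return [
        name
        for name in column_names
        if name != label_column and not name.startswith(INTERNAL_COLUMN_PREFIX)
    ]


def num_features(
    column_names: Iterable[str], label_column: str = LABEL_COLUMN
) -> int:
    """Count feature columns without assuming the internal column is present."""
    return len(feature_columns(column_names, label_column))


# Both schemas have two features, whether or not the internal column exists.
with_index = ["feature_a", "feature_b", "label", "__index_level_0__"]
without_index = ["feature_a", "feature_b", "label"]
assert num_features(with_index) == num_features(without_index) == 2
```

With a fixed `num_columns - 2`, the second schema would report one feature instead of two, which is the failure described above.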

Related issue number

Checks

- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests https://console.anyscale.com/o/anyscale-internal/projects/prj_2xR6uT6t7jJuu1aCwWMsle/clusters/ses_xhpjT94JnGDwf9PnUjw78DJe
  - This PR is not tested :(

release/nightly_tests/dataset/ray_sgd_training.py

better filter of column length

0295be8

scv119 assigned rkooo567, matthewdeng and jjyao Feb 7, 2022

matthewdeng approved these changes Feb 7, 2022

address comments

cc03abd

rkooo567 approved these changes Feb 7, 2022

more

8a8eebb

scv119 added the tests-ok label ("The tagger certifies test failures are unrelated and assumes personal liability.") Feb 7, 2022

scv119 merged commit 1381930 into ray-project:master Feb 7, 2022

scv119 mentioned this pull request Feb 17, 2022

[Bug] datasets_ingest_train_infer is flaky in the master #21812

Closed

2 tasks

simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Feb 27, 2022

[Core][nightly-test] better way of calculating num features (ray-proj…

287168f

…ect#22158) * better filter of column length * address comments * more

scv119 merged 3 commits into ray-project:master from scv119:better-filter

4 participants