[Data][CI] Stop running all ML tests on Data premerge #60066
aslonnie merged 8 commits into ray-project:master
Code Review
This pull request aims to reduce CI time for Data-only PRs by preventing ML/train tests from running unnecessarily. The approach involves modifying the test rules to trigger a new, smaller data test group for data-related file changes, and tagging specific tests as data_integration. While the overall strategy is sound, I've found a critical issue in the implementation of the new Buildkite CI step that will cause test failures. My review includes a detailed explanation and a suggested fix for this issue.
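The rule change described in this review can be pictured as a mapping from changed paths to the test groups they trigger. The following is only a minimal sketch under stated assumptions: the rule table, the tag names (`data`, `ml_data_integration`, `ml`, `train`, `always`), and the `tags_for_changes` helper are all hypothetical and do not reproduce Ray's actual test rules.

```python
# Hypothetical rule table: path prefix -> CI test groups to trigger.
# Prefixes and tag names are illustrative, not Ray's real configuration.
RULES = [
    ("python/ray/data/", {"data", "ml_data_integration"}),
    ("python/ray/train/", {"ml", "train"}),
]

# Hypothetical tags emitted for files that match no rule.
FALLBACK_TAGS = {"always"}


def tags_for_changes(changed_files):
    """Return the union of test groups triggered by the changed files."""
    tags = set()
    for path in changed_files:
        for prefix, rule_tags in RULES:
            if path.startswith(prefix):
                tags |= rule_tags
                break  # first matching rule wins
        else:
            # No rule matched this path.
            tags |= FALLBACK_TAGS
    return tags


# A Data-only change triggers the small integration group, not all of "ml".
data_only = tags_for_changes(["python/ray/data/dataset.py"])
```

In this sketch a Data-only change selects the smaller integration group and leaves the full `ml` and `train` groups untouched, which is the effect the PR is after.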
Force-pushed: 8567f7d to 9346f9b
bveeramani left a comment
Sorry for the delay! Just left a few comments
Force-pushed: 005ca71 to 579fbab
It's fine. I have now effected changes based on your feedback, kindly review.
@DeborahOlaboye looks like there are some conflicts. Would you mind resolving them?
Signed-off-by: DeborahOlaboye <deboraholaboye@gmail.com>
Force-pushed: 579fbab to 6c7e03a
Conflicts have been resolved. Thank you for pointing that out.
LGTM, ty
    --except-tags data_integration
    depends_on: [ "mlgpubuild-multipy", "forge" ]

    - label: ":train: ml: data integration tests"
do we need to split by cpu/gpu?
We originally did, but I recommended collapsing them into a single step for now for simplicity since there are only 5-ish tests.
There are some other tests that use Ray Data (e.g. test_xgboost_trainer). By not including the tag, are you OK with the tradeoff of those tests only running in postmerge, @bveeramani?
Yeah. I ran the proposal by my team and we're aligned.
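To make the tradeoff in this thread concrete, here is a minimal sketch of tag-based selection. It assumes that `--except-tags` drops any test carrying a listed tag and that a counterpart "only tags" filter keeps just the tests carrying one. The `Target` class, the `select` helper, and the test name `test_data_ingest_example` are hypothetical, and `test_xgboost_trainer` is modeled as untagged only because the comment above describes it that way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A test target with its set of tags (illustrative model only)."""
    name: str
    tags: frozenset = frozenset()


def select(targets, only_tags=(), except_tags=()):
    """Keep targets matching only_tags (if given) and none of except_tags."""
    only, excluded = set(only_tags), set(except_tags)
    chosen = []
    for target in targets:
        if only and not (target.tags & only):
            continue  # an "only" filter is set and this target lacks the tag
        if target.tags & excluded:
            continue  # target carries an excluded tag
        chosen.append(target.name)
    return chosen


targets = [
    Target("test_data_ingest_example", frozenset({"data_integration"})),
    Target("test_xgboost_trainer"),  # uses Ray Data but is left untagged
]

# Small step run for Data changes: only the tagged integration tests.
data_integration_step = select(targets, only_tags=["data_integration"])

# Main ML step: everything except the tagged integration tests.
main_ml_step = select(targets, except_tags=["data_integration"])
```

Under these assumptions the untagged `test_xgboost_trainer` appears only in the main ML step, so a Data-only change would not exercise it before merge. That is the tradeoff accepted in the reply above.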
Signed-off-by: DeborahOlaboye <deboraholaboye@gmail.com>
Co-authored-by: Balaji Veeramani <balaji@anyscale.com>
Co-authored-by: Lonnie Liu <95255098+aslonnie@users.noreply.github.com>
Signed-off-by: jinbum-kim <jinbum9958@gmail.com>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: peterxcli <peterxcli@gmail.com>
## Description

This PR reduces CI time for Data-only PRs by ensuring that changes to `python/ray/data/` no longer trigger all ML/train tests unnecessarily.

## Related issues

Closes #59780

Contribution by Gittensor, learn more at https://gittensor.io/