avoid triggering nightly tests until builds are complete by jameslamb · Pull Request #408 · NVIDIA/cuopt

jameslamb · 2025-09-22T19:54:00Z

Description

Replaces #359 (my more-complicated earlier attempt at this)

This project runs nightly builds and tests on a cron schedule:

.github/workflows/nightly.yaml, lines 1 to 6 in 36a6a1c

    name: Trigger Nightly cuOpt Pipeline

    on:
      workflow_dispatch:
      schedule:
        - cron: "0 5 * * *" # 5am UTC / 1am EST

Tests need to wait for builds to finish, and that's currently done with some shell scripts that hit the GitHub API, using a mix of sleep and polling.

This has sometimes resulted in nightly failures (network errors, timeouts, etc.). This PR proposes reducing the risk of such failures by moving that logic into GitHub Actions configuration directly, specifically:

making build.yaml trigger test.yaml with the GitHub CLI only after all package builds and publishing have finished
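
As a rough illustration of the approach described above, the sketch below shows one way a final job in `build.yaml` could dispatch `test.yaml` through the GitHub CLI once the package jobs succeed. This is not the PR's actual diff: the job names (`build-and-publish-packages`, `build-images`, `trigger-tests`), the placeholder steps, and the `TARGET_REPO` / `TARGET_REF` variables are hypothetical. Only `gh workflow run`, its `--repo` and `--ref` flags, and the `GH_TOKEN` environment variable are real GitHub CLI features.

```yaml
# Hypothetical excerpt of build.yaml. Job names and placeholder steps are
# illustrative assumptions, not copied from the PR's diff.
name: build

on:
  workflow_dispatch:

jobs:
  build-and-publish-packages:   # stand-in for the real package build + publish jobs
    runs-on: ubuntu-latest
    steps:
      - run: echo "build and upload packages"

  build-images:                 # stand-in for the docker image builds
    needs: [build-and-publish-packages]   # assumed dependency
    runs-on: ubuntu-latest
    steps:
      - run: echo "build docker images"

  trigger-tests:
    # Depends only on the package jobs, so it does not wait for build-images.
    needs: [build-and-publish-packages]
    runs-on: ubuntu-latest
    permissions:
      actions: write            # lets the job's token dispatch another workflow
    env:
      GH_TOKEN: ${{ github.token }}        # the GitHub CLI reads its token from GH_TOKEN
      TARGET_REPO: ${{ github.repository }}
      TARGET_REF: ${{ github.ref_name }}
    steps:
      - name: Trigger test.yaml
        run: gh workflow run test.yaml --repo "${TARGET_REPO}" --ref "${TARGET_REF}"
```

For a dispatch like this to work, `test.yaml` has to declare a `workflow_dispatch` trigger. Because `needs:` only lets `trigger-tests` start after the listed jobs succeed, the ordering is enforced by GitHub Actions itself rather than by a script that sleeps and polls the API.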

Issue

Contributes to #122

Notes for Reviewers

How I tested this

I manually triggered this run of the "Trigger Nightly cuOpt Pipeline": https://github.com/NVIDIA/cuopt/actions/runs/17935159871

Which triggered this build run: https://github.com/NVIDIA/cuopt/actions/runs/17935161536

Which triggered this test run: https://github.com/NVIDIA/cuopt/actions/runs/17936474025

Things look ok to me!

The test run was not triggered until after all the relevant package builds and uploads were done, and BEFORE the docker image builds were done (as intended, so that tests are not delayed waiting on them).

There are some test failures from artifact-downloading, like this:

[rapids-github-run-id] Querying the GitHub API to determine relevant run of 'build.yaml'.
Downloading and decompressing cuopt_wheel_python_cuopt_server_cu12_py312_x86_64 from Run ID 17936253863 into /tmp/tmp.pqrBXIhMlP

But I think they'll be fixed by merging #409

And the naming changes for the image builds look good 😁

copy-pr-bot · 2025-09-22T19:54:03Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

jameslamb · 2025-09-22T19:54:07Z

/ok to test

rgsl888prabhu · 2025-09-23T15:38:50Z

If you still see the issue in doc, can you please update the link to https://docs.nvidia.com/ngc/latest/ngc-private-registry-user-guide.html#generating-a-personal-api-key

jameslamb · 2025-09-23T15:53:11Z

Ok yep will do!

And I think this new error:

ImportError: /opt/conda/envs/docs/lib/python3.13/site-packages/pylibcudf/../../../libcudf.so: undefined symbol: _ZN3rmm16cuda_stream_poolC1Em

(docs-build link)

Is a result of rapidsai/rmm#2036

It should be fixed by other RAPIDS packages being rebuilt, which @bdice triggered here: https://github.com/rapidsai/workflows/actions/runs/17949845053

jameslamb · 2025-09-23T18:05:50Z

It should be fixed by other RAPIDS packages being rebuilt

Looks like that was not enough, probably for the reasons being discussed in rapidsai/build-planning#218

This should hopefully be resolved later today when the RAPIDS Ops team deletes some nightly packages to allow new ones to be published.

jameslamb · 2025-09-24T03:28:53Z

If you still see the issue in doc, can you please update the link to https://docs.nvidia.com/ngc/latest/ngc-private-registry-user-guide.html#generating-a-personal-api-key

This did fail again 😭

https://github.com/NVIDIA/cuopt/actions/runs/17962030366/job/51090641582?pr=408

Updated those links in the way you suggested: 386dad2

jameslamb · 2025-09-24T15:11:53Z

/merge

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Ramakrishnap (https://github.com/rgsl888prabhu)

URL: NVIDIA#408

jameslamb added 3 commits September 22, 2025 14:51

avoid triggering nightly tests until builds are complete

c012a91

GH_TOKEN

00b0cec

test with cheaper workflows

c7a9841

jameslamb added the "do not merge" label (Do not merge if this flag is set) Sep 22, 2025

jameslamb added 6 commits September 22, 2025 14:58

fix flag

0378ff1

pass --ref

63c499f

fix syntax (YAML list not mapping)

71494c1

fix trigger test check

4519625

switch branch to branch-25.10

1c6a44a

missing brace

4f47bcf

jameslamb added the "non-breaking" (Introduces a non-breaking change) and "improvement" (Improves an existing functionality) labels and removed the "do not merge" label (Do not merge if this flag is set) Sep 23, 2025

jameslamb added 2 commits September 22, 2025 22:41

fix branch references

f0dee4f

Merge branch 'branch-25.10' of github.com:NVIDIA/cuopt into trigger-nightly-tests-after-builds

ec336de

jameslamb mentioned this pull request Sep 23, 2025

WIP: stabilize nightly builds + tests #359

Closed

8 tasks

jameslamb requested a review from rgsl888prabhu September 23, 2025 05:28

jameslamb changed the title ~~WIP: avoid triggering nightly tests until builds are complete~~ avoid triggering nightly tests until builds are complete Sep 23, 2025

jameslamb marked this pull request as ready for review September 23, 2025 05:29

jameslamb requested a review from a team as a code owner September 23, 2025 05:29

rgsl888prabhu approved these changes Sep 23, 2025

Merge branch 'branch-25.10' into trigger-nightly-tests-after-builds

75fd254

jameslamb mentioned this pull request Sep 23, 2025

cuopt-server: update dependencies (drop httpx, add psutil) #413

Merged

8 tasks

jameslamb added 2 commits September 23, 2025 18:51

empty commit to fully retrigger CI

5e39dd0

fix links

386dad2

jameslamb requested a review from a team as a code owner September 24, 2025 03:27

jameslamb requested a review from Iroy30 September 24, 2025 03:27

rapids-bot bot merged commit 0c69099 into branch-25.10 Sep 24, 2025
173 of 174 checks passed

jameslamb deleted the trigger-nightly-tests-after-builds branch September 24, 2025 15:12

jameslamb mentioned this pull request Sep 24, 2025

[BUG] Nightly build and testing is failing #122

Closed

rapids-bot[bot] merged 14 commits into branch-25.10 from trigger-nightly-tests-after-builds