Skip to content

Conversation

@bstadlbauer
Copy link
Contributor

@bstadlbauer bstadlbauer commented Jan 3, 2023

Sets lifetime-stagger CLI default to None (instead of 0 seconds), to allow overrides through config file or environment variables.

Closes #7444

  • Tests added / passed
  • Passes pre-commit run --all-files

@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

@bstadlbauer bstadlbauer changed the title sSet lifetime-stagger default value to None Set lifetime-stagger default value to None Jan 3, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Jan 3, 2023

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

       22 files  ±  0         22 suites  ±0   10h 22m 40s ⏱️ + 57m 38s
  3 293 tests +  2    3 207 ✔️ +  3       85 💤  - 1  1 ±0 
36 154 runs  +22  34 592 ✔️ +23  1 561 💤  - 1  1 ±0 

For more details on these failures, see this check.

Results for commit 4ce53df. ± Comparison against base commit 9d8e18c.

♻️ This comment has been updated with latest results.

@fjetter
Copy link
Member

fjetter commented Jan 4, 2023

Thanks @bstadlbauer . I noticed the same problem is affecting lifetime.restart as well. Could you do the same over there?

I also would appreciate a test. For inspiration see test_dask_worker.py::test_preload_config which is setting environment variables and is passing it to the CLI process. The test assertion could then look similar to what you wrote in #7444 (comment)
See also test_worker.py::test_lifetime_stagger as an example how to assert the stagger (e.g. with a client.run)

@bstadlbauer bstadlbauer force-pushed the fix-lifetime-stagger branch from 71b4332 to d634b8a Compare January 5, 2023 09:47
@bstadlbauer bstadlbauer force-pushed the fix-lifetime-stagger branch from d634b8a to afacf2e Compare January 5, 2023 09:54
@bstadlbauer
Copy link
Contributor Author

@fjetter Thanks for checking! I've added tests (62738aa) and made sure they failed before fixing things. I've also addressed the --lifetime-restart issue.

Do you happen to know whether the currently failing tests could be related to these changes or whether those are flaky?

@bstadlbauer bstadlbauer force-pushed the fix-lifetime-stagger branch from 2c992ae to 4ce53df Compare January 5, 2023 13:06
@fjetter
Copy link
Member

fjetter commented Jan 5, 2023

We're unfortunately struggling with flaky tests frequently. We build a dashboard that helps us to see if a given test is particularly flaky, see https://dask.github.io/distributed/test_report.html and https://dask.github.io/distributed/test_short_report.html for a shorter version (see badges in README)

I see four failures on your PR

  • test_set_lifetime_stagger_via_env_var seems to be a genuine bug in your test code
  • Two unrelated failures in test_semaphore (hopefully fixed byhttps://github.com/Speed up test_semaphore.py #5885
  • test_worker_memory which is also known

I assume the commit you just pushed fixes test_set_lifetime_stagger_via_env_var in which case we're good to go once CI finishes again

@bstadlbauer
Copy link
Contributor Author

@fjetter Thank you for the quick turnaround here! 4ce53df should have fixed the test race condition I introduced, so from my end, things would be ready to merge :-)

Copy link
Member

@fjetter fjetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this and congrats for your first commit @bstadlbauer!

@fjetter fjetter merged commit 34f2c13 into dask:main Jan 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Setting worker lifetime stagger only possible via CLI

3 participants