[fix][test] Fix flaky test deleteNamespaceGracefully #18220
7cb7b20 to 05212ce
for (String tenant : admin.tenants().getTenants()) {
    for (String namespace : admin.namespaces().getNamespaces(tenant)) {
        deleteNamespaceGraceFullyByMultiPulsars(namespace, true, admin, pulsar,
        deleteNamespaceGraceFully(namespace, true, admin, pulsar,
I have a question: can we use `admin.namespaces().deleteNamespace()` instead of this?
`deleteNamespaceGraceFully` looks like a hack.
When `admin.namespaces().deleteNamespace()` and the automatic creation of the topic `__change_events` execute concurrently, the problem described in #17070 occurs. `deleteNamespaceGraceFully` is used to work around that problem.
I tried to fix the flaky test this way: delete the namespace only after `__change_events` has been created successfully. But there is still another race condition: the asynchronous creation of `__change_events`/`__compaction` races with the deletion of `__change_events` by the namespace delete. Therefore, I will disable the system topic in the method `testDeleteTenant` to solve this flaky test.
Is this what you mean?
If so, I suggest we fix this first.
In real-world use, the probability of this happening is almost zero. Even if it does happen, the user can simply run the namespace deletion again. Solving the problem fundamentally, however, would require a big change.
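The "run the delete again" idea above can be sketched as a small retry helper. This is only an illustration in plain Java, not the code from this PR: the class `RetrySketch`, the method `withRetry`, and its parameters are hypothetical names. In a Pulsar test, the action passed in would presumably wrap a call such as `admin.namespaces().deleteNamespace(namespace)`.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    /**
     * Runs the action up to maxAttempts times and returns its first successful result.
     * If every attempt throws, the exception from the last attempt is rethrown.
     * (Hypothetical helper for illustration; not part of Pulsar.)
     */
    public static <T> T withRetry(Callable<T> action, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry after an interrupt; restore the flag and propagate.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Give a concurrent operation (e.g. topic creation) time to finish.
                    Thread.sleep(backoffMillis);
                }
            }
        }
        throw last;
    }
}
```

Bounding the number of attempts keeps a persistent failure visible as a test failure instead of hiding it behind an endless loop.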
05212ce to dce5599
dce5599 to d0d9f3f
8dc781c to d740815
}

/**
 * Wait until system topic "__change_event" and subscription "__compaction" are created, and then delete the namespace.
Does the description need to be modified?
Already fixed.
    canPausedNamespaceService.pause();
}

Awaitility.await().until(() -> {
Maybe we can add a maximum wait time.
Already fixed.
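For context on the maximum wait time discussed above: Awaitility lets a caller bound a wait with `atMost(...)` before `until(...)`. The sketch below shows the same idea in plain Java so that it runs without dependencies. It is an illustration only, not the code from this PR; `BoundedWaitSketch`, `waitUntil`, and the parameters are hypothetical names.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

public class BoundedWaitSketch {
    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met in time, false otherwise.
     * (Hypothetical helper for illustration; not part of Pulsar or Awaitility.)
     */
    public static boolean waitUntil(BooleanSupplier condition, Duration timeout,
                                    Duration pollInterval) throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Compare via subtraction so the check stays correct if nanoTime overflows.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }
}
```

With an explicit upper bound, a condition that never becomes true makes the test fail quickly with a clear cause instead of hanging until the CI job times out.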
ebec5a7 to 882cd56
@poorbarcode Could you fix these conflicts?
882cd56 to d442374
Codecov Report
@@ Coverage Diff @@
## master #18220 +/- ##
============================================
- Coverage 47.39% 47.35% -0.05%
- Complexity 10479 10483 +4
============================================
Files 698 698
Lines 68070 68070
Branches 7279 7279
============================================
- Hits 32264 32235 -29
- Misses 32228 32255 +27
- Partials 3578 3580 +2
/pulsarbot rerun-failure-checks
Already fixed.
(cherry picked from commit c544ea3)
This reverts commit a3e593a.
Fixes #18232
Motivation
https://github.com/poorbarcode/pulsar/actions/runs/3267034877/jobs/5371836130
https://github.com/apache/pulsar/actions/runs/3242148385/jobs/5326408439
The original implementation decides whether to wait for the creation of `__change_events` based on whether a broker has taken ownership of the bundle. However, the test cases often run operations such as 'set namespace policy', 'unload namespace', 'delete namespace', and 'split bundle'. When the 'bundle unload' and the 'bundle check' execute concurrently, the method `deleteNamespaceGraceFully` becomes unstable.

Modifications
- Rename `deleteNamespaceGraceFully` -> `deleteNamespaceWithRetry`.
- `deleteNamespaceGraceFully` was defined in class `BrokerTestBase`, and `MockedPulsarServiceBaseTest` called `BrokerTestBase.deleteNamespaceGraceFully()`. But `MockedPulsarServiceBaseTest` is the superclass of `BrokerTestBase`, so move the method `deleteNamespaceGraceFully` to `MockedPulsarServiceBaseTest`.

Documentation
- doc
- doc-required
- doc-not-needed
- doc-complete

Matching PR in forked repository
PR in forked repository: