Skip to content

MINOR: Less restrictive assertion in flaky BufferPool test#5799

Merged
ijuma merged 4 commits intoapache:trunkfrom
colinhicks:fix-buffer-pool-test-flakiness
Oct 21, 2018
Merged

MINOR: Less restrictive assertion in flaky BufferPool test#5799
ijuma merged 4 commits intoapache:trunkfrom
colinhicks:fix-buffer-pool-test-flakiness

Conversation

@colinhicks
Copy link
Copy Markdown
Contributor

@colinhicks colinhicks commented Oct 15, 2018

We are routinely seeing CI failures from an assertion in testBlockTimeout(), which relies on the interaction between (de)allocations against a producer BufferPool, across multiple threads. I reproduced the failed assertion in my local build at a rate of 2% (N=100):

assertTrue("available memory" + pool.availableMemory(), 
    pool.availableMemory() >= 9 && pool.availableMemory() <= 10);

Here's a summary of the test history:

This patch decreases the lower bound for expected available memory, as thread scheduling entails that a variable amount of deallocation happens by the point of assertion.

The patch also makes minor clarifications to test logic and comments. On my machine, it has a 100% passing rate (N=500).

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Oct 16, 2018

@smccauliff @lindong28 @becketqin can one of you review this?

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Oct 16, 2018

Thanks for the PR @colinhicks. What is the actual value of availableMemory when the test fails?

@colinhicks
Copy link
Copy Markdown
Contributor Author

When the test fails, the value of availableMemory is 8.

Copy link
Copy Markdown
Member

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, left a couple of comments.

assertTrue("Allocation should finish not much later than maxBlockTimeMs", endTimeMs - beginTimeMs < maxBlockTimeMs + 1000);
long durationMs = Time.SYSTEM.milliseconds() - beginTimeMs;
assertTrue("TimeoutException should not throw before maxBlockTimeMs", durationMs >= maxBlockTimeMs);
assertTrue("TimeoutException should throw soon after maxBlockTimeMs", durationMs < maxBlockTimeMs + 100);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You made this stricter (+100 instead of +1000), I'd not do that given Jenkins variability.

} catch (TimeoutException e) {
// this is good
}
assertTrue("available memory" + pool.availableMemory(), pool.availableMemory() >= 9 && pool.availableMemory() <= 10);
Copy link
Copy Markdown
Member

@ijuma ijuma Oct 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of removing this, we should probably just say assert that it can be from 8 to 10 with a comment saying that thread scheduling sometimes means that no deallocation happens by the time we get here.

@colinhicks colinhicks changed the title MINOR: remove non-deterministic assertion in flaky BufferPool test. MINOR: Less restrictive assertion in flaky BufferPool test. Oct 20, 2018
assertTrue("available memory" + pool.availableMemory(), pool.availableMemory() >= 9 && pool.availableMemory() <= 10);
long endTimeMs = Time.SYSTEM.milliseconds();
assertTrue("Allocation should finish not much later than maxBlockTimeMs", endTimeMs - beginTimeMs < maxBlockTimeMs + 1000);
// Thread scheduling sometimes means that no deallocation happens by this point
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my bad, but we should probably say that deallocation varies instead of "no deallocation". 8 implies that at least one of them happened.

Copy link
Copy Markdown
Member

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@colinhicks
Copy link
Copy Markdown
Contributor Author

Thanks for the review and comments @ijuma. I also updated the PR title and description to match the changes you suggested.

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Oct 20, 2018

Perfect, thanks.

@lindong28
Copy link
Copy Markdown
Member

@ijuma Sorry, I didn't notice your message previously.

@ijuma ijuma changed the title MINOR: Less restrictive assertion in flaky BufferPool test. MINOR: Less restrictive assertion in flaky BufferPool test Oct 21, 2018
@ijuma ijuma merged commit 74f686d into apache:trunk Oct 21, 2018
ijuma pushed a commit that referenced this pull request Oct 21, 2018
Decrease the lower bound for expected available memory, as thread
scheduling entails that a variable amount of deallocation happens by
the point of assertion.

Also make minor clarifications to test logic and comments.

The passing rate improved from 98% to 100% locally after these
changes (100+ runs).

Reviewers: Ismael Juma <ismael@juma.me.uk>

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
ijuma pushed a commit that referenced this pull request Oct 21, 2018
Decrease the lower bound for expected available memory, as thread
scheduling entails that a variable amount of deallocation happens by
the point of assertion.

Also make minor clarifications to test logic and comments.

The passing rate improved from 98% to 100% locally after these
changes (100+ runs).

Reviewers: Ismael Juma <ismael@juma.me.uk>

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
@ijuma
Copy link
Copy Markdown
Member

ijuma commented Oct 21, 2018

Merged to trunk, 2.1 and 2.0 branches.

@colinhicks colinhicks deleted the fix-buffer-pool-test-flakiness branch October 21, 2018 21:04
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
Decrease the lower bound for expected available memory, as thread
scheduling entails that a variable amount of deallocation happens by
the point of assertion.

Also make minor clarifications to test logic and comments.

The passing rate improved from 98% to 100% locally after these
changes (100+ runs).

Reviewers: Ismael Juma <ismael@juma.me.uk>

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants