Skip to content

KAFKA-12503: cache size set by the thread#10356

Merged
ableegoldman merged 2 commits intoapache:trunkfrom
wcarlson5:cache-size-set-by-thread
Mar 19, 2021
Merged

KAFKA-12503: cache size set by the thread#10356
ableegoldman merged 2 commits intoapache:trunkfrom
wcarlson5:cache-size-set-by-thread

Conversation

@wcarlson5
Copy link
Copy Markdown
Contributor

Make is so threads do not directly resize other threads caches

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@wcarlson5 wcarlson5 changed the title cache size set by the thread HOTFIX: cache size set by the thread Mar 18, 2021
@wcarlson5 wcarlson5 changed the title HOTFIX: cache size set by the thread KAFKA-12503: cache size set by the thread Mar 18, 2021
@wcarlson5
Copy link
Copy Markdown
Contributor Author

@guozhangwang @ableegoldman Can you take a look?

@vvcephei I think this needs to be picked to 2.8 as well

Copy link
Copy Markdown
Member

@ableegoldman ableegoldman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Can you think of any way to test this out? Concurrency bugs are always a long-shot to reproduce reliably, but we should at least take that shot. Maybe start up a Streams app with ~5 threads and a large-ish cache, give it a MAX_VALUE commit interval so it won't flush, and feed it enough data to fill up the cache? There should be an integration test util that lets you wait until the consumer lag is zero so you know all the data is there -- then remove a thread

(I know we're a bit crunched for time so it's ok with me to merge this PR as-is, but please follow up after with a test)

@ableegoldman
Copy link
Copy Markdown
Member

One unrelated test failure: kafka.api.SaslGssapiSslEndToEndAuthorizationTest.testNoConsumeWithDescribeAclViaAssign

@wcarlson5
Copy link
Copy Markdown
Contributor Author

I will don't know about testing this via an integration test. I worry about introducing just another flaky test, these types of issue get caught mostly in soak. Even with the optimal conditions for it to go wrong it took a couple days.

I will think about how to improve the tests but that will happen in a different PR for sure

@ableegoldman
Copy link
Copy Markdown
Member

ableegoldman commented Mar 19, 2021

I don't think we should aim to catch things in soak, although I'm not on the side of "we should never catch anything in soak/everything must have an integration test" either. It's just that this particular issue, and this feature in general really, calls for an integration test -- we should be flexing the cache resize codepath in a realistic scenario. I'm happy to leave that as followup work as long as we do get around to it.

This looks good and the build passed so I'll merge this to unblock the release

@ableegoldman ableegoldman merged commit 367eca0 into apache:trunk Mar 19, 2021
ableegoldman pushed a commit that referenced this pull request Mar 19, 2021
… for them (#10356)

Make it so threads do not directly resize other thread's caches

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
@ableegoldman
Copy link
Copy Markdown
Member

Merged and cherrypicked to 2.8 cc @vvcephei

ijuma added a commit to confluentinc/kafka that referenced this pull request Mar 19, 2021
Conflicts:
* build.gradle: keep `dependencySubstitution` Confluent addition in
`resolutionStrategy` and take upstream changes.

Commits:
* apache-github/trunk:
  KAFKA-12503: inform threads to resize their cache instead of doing so for them (apache#10356)
  KAFKA-10697: Remove ProduceResponse.responses (apache#10332)
  MINOR: Exclude KIP-500.md from rat check (apache#10354)
  MINOR: Move `configurations.all` to be a child of `allprojects` (apache#10349)
  MINOR: Remove use of `NoSuchElementException` in `KafkaMetadataLog` (apache#10344)
  MINOR: Start the broker-to-controller channel for request forwarding (apache#10340)
  KAFKA-12382: add a README for KIP-500 (apache#10227)
  MINOR: Fix BaseHashTable sizing (apache#10334)
  KAFKA-10357: Add setup method to internal topics (apache#10317)
  MINOR: remove redundant null check when testing specified type (apache#10314)
  KAFKA-12293: Remove JCenter from buildscript and delete buildscript.gradle
  KAFKA-12491: Make rocksdb an `api` dependency for `streams` (apache#10341)
  KAFKA-12454: Add ERROR logging on kafka-log-dirs when given brokerIds do not  exist in current kafka cluster (apache#10304)
  KAFKA-12459; Use property testing library for raft event simulation tests (apache#10323)
  MINOR: fix failing ZooKeeper system tests (apache#10297)
  MINOR: fix client_compatibility_features_test.py (apache#10292)
lct45 pushed a commit to confluentinc/kafka that referenced this pull request Mar 19, 2021
… for them (apache#10356)

Make it so threads do not directly resize other thread's caches

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
vvcephei pushed a commit to confluentinc/kafka that referenced this pull request Mar 20, 2021
… for them (apache#10356)

Make it so threads do not directly resize other thread's caches

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
@wcarlson5 wcarlson5 deleted the cache-size-set-by-thread branch April 3, 2021 01:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants