Skip to content

MINOR: Handle expandIsr in PartitionLockTest and ensure read threads not blocked on write#7973

Merged
rajinisivaram merged 2 commits intoapache:trunkfrom
rajinisivaram:MINOR-fix-partition-lock-test
Jan 17, 2020
Merged

MINOR: Handle expandIsr in PartitionLockTest and ensure read threads not blocked on write#7973
rajinisivaram merged 2 commits intoapache:trunkfrom
rajinisivaram:MINOR-fix-partition-lock-test

Conversation

@rajinisivaram
Copy link
Copy Markdown
Contributor

Noticed a PR build failure in the new PartitionLockTest:

ERROR Exception during updateFollowerFetchState (kafka.cluster.PartitionLockTest:76)
scala.MatchError: null
	at kafka.cluster.Partition.maybeUpdateIsrAndVersion(Partition.scala:1193)
	at kafka.cluster.Partition.expandIsr(Partition.scala:1183)
	at kafka.cluster.Partition.$anonfun$maybeExpandIsr$2(Partition.scala:690)
	at kafka.cluster.Partition.maybeExpandIsr(Partition.scala:686)
	at kafka.cluster.Partition.updateFollowerFetchState(Partition.scala:609)
	at kafka.cluster.PartitionLockTest.$anonfun$updateFollowerFetchState$1(PartitionLockTest.scala:267)
	at kafka.cluster.PartitionLockTest.updateFollowerFetchState(PartitionLockTest.scala:257)

The mocked instance was not handling expandIsr, so updated the test to handle this. Also updated the test to use the offset from the batch to trigger expandIsr more often. With this change, the test may try to acquire write lock while appends are blocked on read lock, so separated out the test using write lock to make it safer.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rajinisivaram. This looks good. Left some minor comments.

stateUpdateFutures.foreach(_.get(15, TimeUnit.SECONDS))

appendSemaphore.release(1)
scheduleUpdateFollowers(1).foreach(_.get(15, TimeUnit.SECONDS))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just ensuring there is one call to updateFollowers with every append?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it isn't really necessary for the test, but kept it anyway and added a comment.

* Verify that follower state updates complete even though an append holding read lock is in progress.
* Then release the permit for the final append and verify that all appends and follower updates complete.
*/
private def concurrentProduceFetchWithReadLockOnly(appendSemaphore: Semaphore): Unit = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be nice to be consistent on the usage of class fields. This wouldn't work if the passed semaphore wasn't the class field since that is what we used when building the log. Maybe we can drop the parameter?


assertFalse(stateUpdateFutures.exists(_.isDone))
shrinkIsrSemaphore.release()
appendSemaphore.release(numProducers * numRecordsPerProducer)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking my understanding, but it seems there's no need to do this after the shrink semaphore is released. If we did it before, then we could assert that the append futures are blocked just like the update follower futures.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, updated.

@rajinisivaram
Copy link
Copy Markdown
Contributor Author

@hachikuji Thanks for the review, addressed the comments.

@rajinisivaram
Copy link
Copy Markdown
Contributor Author

retest this please

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hachikuji
Copy link
Copy Markdown
Contributor

retest this please

@rajinisivaram
Copy link
Copy Markdown
Contributor Author

@hachikuji Thanks for the review, merging to trunk.

@rajinisivaram rajinisivaram merged commit eb8e2a8 into apache:trunk Jan 17, 2020
ijuma added a commit to confluentinc/kafka that referenced this pull request Jan 21, 2020
Conflicts or compilation errors due to the fact that we temporarily
reverted the commit that removes Scala 2.11 support:

* AclCommand.scala: take upstream changes.
* AclCommandTest.scala: take upstream changes.
* TransactionCoordinatorTest.scala: don't use SAMs, but adjust
mock call to putTransactionStateIfNotExists given new signature.
* TransactionStateManagerTest: use Runnable instead of SAMs.
* PartitionLockTest: use Runnable instead of SAMs.
* docs/upgrade.html: take upstream changes excluding line that
states that Scala 2.11 support has been removed.

* apache-github/trunk: (28 commits)
  KAFKA-9457; Fix flaky test org.apache.kafka.common.network.SelectorTest.testGracefulClose (apache#7989)
  MINOR: Update AclCommand help message to match implementation (apache#7990)
  MINOR: Update introduction page in Kafka documentation
  MINOR: Use Math.min for StreamsPartitionAssignor#updateMinReceivedVersion method (apache#7954)
  KAFKA-9338; Fetch session should cache request leader epoch (apache#7970)
  KAFKA-9329; KafkaController::replicasAreValid should return error message (apache#7865)
  KAFKA-9449; Adds support for closing the producer's BufferPool. (apache#7967)
  MINOR: Handle expandIsr in PartitionLockTest and ensure read threads not blocked on write (apache#7973)
  MINOR: Fix typo in connect integration test class name (apache#7976)
  KAFKA-9218: MirrorMaker 2 can fail to create topics (apache#7745)
  KAFKA-8847; Deprecate and remove usage of supporting classes in kafka.security.auth (apache#7966)
  MINOR: Suppress DescribeConfigs Denied log during CreateTopics (apache#7971)
  [MINOR]: Fix typo in Fetcher comment (apache#7934)
  MINOR: Remove unnecessary call to `super` in `MetricConfig` constructor (apache#7975)
  MINOR: fix flaky StreamsUpgradeTestIntegrationTest (apache#7974)
  KAFKA-9431: Expose API in KafkaStreams to fetch all local offset lags (apache#7961)
  KAFKA-9235; Ensure transaction coordinator is stopped after replica deletion (apache#7963)
  KAFKA-9410; Make groupId Optional in KafkaConsumer (apache#7943)
  MINOR: Removed accidental double negation in error message. (apache#7834)
  KAFKA-6144: IQ option to query standbys (apache#7962)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants