KAFKA-6647 KafkaStreams.cleanUp creates .lock file in directory its trying to clean (Windows OS) #6569
KAFKA-6647 KafkaStreams.cleanUp creates .lock file in directory its trying to clean (Windows OS) #6569jukkakarvanen wants to merge 9 commits intoapache:trunkfrom
Conversation
…e fixed with this unlock, another failure explained in comment.
|
Simplified the refactoring and reversed logic back to original relating to locking and condition evaluation order. |
|
I wonder what is the reason: the previous build failed and now only after update of comment, it is successful. |
|
Hi @jukkakarvanen and thanks for the contribution! We've discussed this PR and there is another PR (#5650) that aims to solve the same issue with a different approach that seems a little safer to implement. Would you consider reworking your current PR to fit the approach mapped out in #5650? Ping us here an let us know when this is ready for review. Thanks |
|
Hi @bbejeck , The most important thing is that this bug get fixed. I can close this and reopen after merged idea of #5650 |
Clarification, was the conclusion in #5650:
|
|
Yes. Should be |
|
I have been checking this KAFKA-6647 with two locking policy approach. |
This KAFKA-6647 KafkaStreams.cleanUp creates .lock file in directory its trying to clean (Windows OS) has been open for a while, and long conversation in previous Pull Requests. Anyway this problems is causing the TopologyTestDriver driver based unit test failing in Windows if not adding extra exception handling in there.
This PR version is trying to keep the general functionality as similar as earlier. Only add one extra retry of delete, if first failed due to DirectoryNotExmptyException. When added the retry logic only at the end of finally, caused checkstyle CyclomaticComplexity and NPathComplexity to go above threshold.
After it extracted cleanRemovedTaskDir and deleteTaskDir methods to avoid complexity.
Also time condition evaluation changed to be first before locking, so no need to lock if inner block is doing nothing.
No external functionality changed, so no additional test cases added.
Following five test in StateDirectoryTest failed earlier in Windows.
StateDirectoryTest.shouldCleanUpTaskStateDirectoriesThatAreNotCurrentlyLocked
StateDirectoryTest.shouldCleanupStateDirectoriesWhenLastModifiedIsLessThanNowMinusCleanupDelay
StateDirectoryTest.shouldNotLockStateDirLockedByAnotherThread
StateDirectoryTest.shouldNotUnLockStateDirLockedByAnotherThread
StateDirectoryTest.shouldCleanupAllTaskDirectoriesIncludingGlobalOne
Test shouldNotLockStateDirLockedByAnotherThread is still failing due to
there is no way to unlock task of already dead thread without adding extra code to StateDirectory class.
The test left as is, but explanatory comment added. The rest four of those five tests are now successful also in Windows.
Also shouldUseSerdesDefinedInMaterializedToConsumeGlobalTable test in StreamsBuilderTest failed in Windows before this change. After the change all the tests in StreamsBuilderTest are succesful.
Committer Checklist (excluded from commit message)