-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Avoid race while mocking compactor #7102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
There seems to be a race when mocking the compactor in MockedPulsarServiceBaseTest which yields errors like: ``` Caused by: java.lang.ClassCastException: org.apache.pulsar.compaction.TwoPhaseCompactor$MockitoMock$1141048386 cannot be cast to org.apache.pulsar.broker.service.BrokerService ``` This obviously causes tests to fail in strange and surprising ways. I've not been able to reproduce, but the issue seems to be with how we mock the compactor. We don't want to have to construct the whole compactor as we just want to spy on it, so we get the current compactor, wrap it in spy and tell the PulsarService instance to return it when getCompactor is called. However, we are doing this after we have already called PulsarService#start(). So there are threads already using the mock structures. The mock structures themselves are not thread safe, so modifying them while they are in use, is not a good idea. The fix is to finish mocking before invoking #start(). Do do this, I've broken out the Compactor construction method in PulsarService, so that alone can be mocked to wrap the object in a spy.
sijie
approved these changes
May 29, 2020
codelipenghui
approved these changes
May 30, 2020
aahmed-se
approved these changes
May 31, 2020
Huanli-Meng
pushed a commit
to Huanli-Meng/pulsar
that referenced
this pull request
Jun 1, 2020
There seems to be a race when mocking the compactor in MockedPulsarServiceBaseTest which yields errors like: ``` Caused by: java.lang.ClassCastException: org.apache.pulsar.compaction.TwoPhaseCompactor$MockitoMock$1141048386 cannot be cast to org.apache.pulsar.broker.service.BrokerService ``` This obviously causes tests to fail in strange and surprising ways. I've not been able to reproduce, but the issue seems to be with how we mock the compactor. We don't want to have to construct the whole compactor as we just want to spy on it, so we get the current compactor, wrap it in spy and tell the PulsarService instance to return it when getCompactor is called. However, we are doing this after we have already called PulsarService#start(). So there are threads already using the mock structures. The mock structures themselves are not thread safe, so modifying them while they are in use, is not a good idea. The fix is to finish mocking before invoking #start(). Do do this, I've broken out the Compactor construction method in PulsarService, so that alone can be mocked to wrap the object in a spy. Co-authored-by: Ivan Kelly <ikelly@splunk.com>
Huanli-Meng
pushed a commit
to Huanli-Meng/pulsar
that referenced
this pull request
Jun 1, 2020
There seems to be a race when mocking the compactor in MockedPulsarServiceBaseTest which yields errors like: ``` Caused by: java.lang.ClassCastException: org.apache.pulsar.compaction.TwoPhaseCompactor$MockitoMock$1141048386 cannot be cast to org.apache.pulsar.broker.service.BrokerService ``` This obviously causes tests to fail in strange and surprising ways. I've not been able to reproduce, but the issue seems to be with how we mock the compactor. We don't want to have to construct the whole compactor as we just want to spy on it, so we get the current compactor, wrap it in spy and tell the PulsarService instance to return it when getCompactor is called. However, we are doing this after we have already called PulsarService#start(). So there are threads already using the mock structures. The mock structures themselves are not thread safe, so modifying them while they are in use, is not a good idea. The fix is to finish mocking before invoking #start(). Do do this, I've broken out the Compactor construction method in PulsarService, so that alone can be mocked to wrap the object in a spy. Co-authored-by: Ivan Kelly <ikelly@splunk.com>
Huanli-Meng
pushed a commit
to Huanli-Meng/pulsar
that referenced
this pull request
Jun 12, 2020
There seems to be a race when mocking the compactor in MockedPulsarServiceBaseTest which yields errors like: ``` Caused by: java.lang.ClassCastException: org.apache.pulsar.compaction.TwoPhaseCompactor$MockitoMock$1141048386 cannot be cast to org.apache.pulsar.broker.service.BrokerService ``` This obviously causes tests to fail in strange and surprising ways. I've not been able to reproduce, but the issue seems to be with how we mock the compactor. We don't want to have to construct the whole compactor as we just want to spy on it, so we get the current compactor, wrap it in spy and tell the PulsarService instance to return it when getCompactor is called. However, we are doing this after we have already called PulsarService#start(). So there are threads already using the mock structures. The mock structures themselves are not thread safe, so modifying them while they are in use, is not a good idea. The fix is to finish mocking before invoking #start(). Do do this, I've broken out the Compactor construction method in PulsarService, so that alone can be mocked to wrap the object in a spy. Co-authored-by: Ivan Kelly <ikelly@splunk.com>
huangdx0726
pushed a commit
to huangdx0726/pulsar
that referenced
this pull request
Aug 24, 2020
There seems to be a race when mocking the compactor in MockedPulsarServiceBaseTest which yields errors like: ``` Caused by: java.lang.ClassCastException: org.apache.pulsar.compaction.TwoPhaseCompactor$MockitoMock$1141048386 cannot be cast to org.apache.pulsar.broker.service.BrokerService ``` This obviously causes tests to fail in strange and surprising ways. I've not been able to reproduce, but the issue seems to be with how we mock the compactor. We don't want to have to construct the whole compactor as we just want to spy on it, so we get the current compactor, wrap it in spy and tell the PulsarService instance to return it when getCompactor is called. However, we are doing this after we have already called PulsarService#start(). So there are threads already using the mock structures. The mock structures themselves are not thread safe, so modifying them while they are in use, is not a good idea. The fix is to finish mocking before invoking #start(). Do do this, I've broken out the Compactor construction method in PulsarService, so that alone can be mocked to wrap the object in a spy. Co-authored-by: Ivan Kelly <ikelly@splunk.com>
jiazhai
pushed a commit
to streamnative/kop
that referenced
this pull request
Jul 29, 2021
### Motivation The KoP CI tests take much more time than CI tests of branch-2.7.2. The main reason is the `cleanup()` phase takes a long time, each time a test is cleaned up, it will take over 10 seconds to complete. This behavior was introduced from apache/pulsar#10199, which made broker shutdown gracefully by default but it would take longer to shutdown. The other reason is caused by rebalance time. According to my observes, when a Kafka consumer subscribes a topic in KoP, it will take at least 3 seconds. Finally I found it's caused by the GroupInitialRebalanceDelayMs config, which has the same semantics with Kafka's [group.initial.rebalance.delay.ms](https://kafka.apache.org/documentation/#brokerconfigs_group.initial.rebalance.delay.ms). It makes Kafka server wait longer for `JOIN_GROUP` request for more consumers to join so that the rebalance count can reduce. However, it should be set zero in tests. After fixing these problems, sometimes the following error may happen and cause flakiness. ``` TwoPhaseCompactor$MockitoMock$1432102834 cannot be returned by getConfiguration() getConfiguration() should return ServiceConfiguration *** If you're unsure why you're getting above error read on. Due to the nature of the syntax above problem might occur because: 1. This exception *might* occur in wrongly written multi-threaded tests. Please refer to Mockito FAQ on limitations of concurrency testing. 2. A spy is stubbed using when(spy.foo()).then() syntax. It is safer to stub spies - - with doReturn|Throw() family of methods. More in javadocs for Mockito.spy() method. ``` It's because `PulsarService#newCompactor` is not mocked well, see apache/pulsar#7102 for detail. ### Modifications - Configure `GroupInitialRebalanceDelayMs` and `BrokerShutdownTimeoutMs` for each mocked `BrokerService`. - Fix the flakiness caused by mocking compactor. After the changes, the tests time has a significant improvement. For example, `GroupCoordinatorTest` takes only 3 minutes now but it could take 9 minutes before. Because the `cleanup()` method is marked as `@AfterMethod` and would be called each time a single test finished. Another example is that `BasicEndToEndKafkaTest` takes only 37 seconds now but it could take 56 seconds before. The `cleanup()` is marked as `@AfterClass` and only happens once, but many consumers will be created during the whole tests and each time a subscribe call can take 3 seconds. * Speed up tests and fix flakiness * Speed up tests that creat their own configs * Ignore golang-sarama tests
BewareMyPower
added a commit
to streamnative/kop
that referenced
this pull request
Aug 5, 2021
### Motivation The KoP CI tests take much more time than CI tests of branch-2.7.2. The main reason is the `cleanup()` phase takes a long time, each time a test is cleaned up, it will take over 10 seconds to complete. This behavior was introduced from apache/pulsar#10199, which made broker shutdown gracefully by default but it would take longer to shutdown. The other reason is caused by rebalance time. According to my observes, when a Kafka consumer subscribes a topic in KoP, it will take at least 3 seconds. Finally I found it's caused by the GroupInitialRebalanceDelayMs config, which has the same semantics with Kafka's [group.initial.rebalance.delay.ms](https://kafka.apache.org/documentation/#brokerconfigs_group.initial.rebalance.delay.ms). It makes Kafka server wait longer for `JOIN_GROUP` request for more consumers to join so that the rebalance count can reduce. However, it should be set zero in tests. After fixing these problems, sometimes the following error may happen and cause flakiness. ``` TwoPhaseCompactor$MockitoMock$1432102834 cannot be returned by getConfiguration() getConfiguration() should return ServiceConfiguration *** If you're unsure why you're getting above error read on. Due to the nature of the syntax above problem might occur because: 1. This exception *might* occur in wrongly written multi-threaded tests. Please refer to Mockito FAQ on limitations of concurrency testing. 2. A spy is stubbed using when(spy.foo()).then() syntax. It is safer to stub spies - - with doReturn|Throw() family of methods. More in javadocs for Mockito.spy() method. ``` It's because `PulsarService#newCompactor` is not mocked well, see apache/pulsar#7102 for detail. ### Modifications - Configure `GroupInitialRebalanceDelayMs` and `BrokerShutdownTimeoutMs` for each mocked `BrokerService`. - Fix the flakiness caused by mocking compactor. After the changes, the tests time has a significant improvement. For example, `GroupCoordinatorTest` takes only 3 minutes now but it could take 9 minutes before. Because the `cleanup()` method is marked as `@AfterMethod` and would be called each time a single test finished. Another example is that `BasicEndToEndKafkaTest` takes only 37 seconds now but it could take 56 seconds before. The `cleanup()` is marked as `@AfterClass` and only happens once, but many consumers will be created during the whole tests and each time a subscribe call can take 3 seconds. * Speed up tests and fix flakiness * Speed up tests that creat their own configs * Ignore golang-sarama tests
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
There seems to be a race when mocking the compactor in
MockedPulsarServiceBaseTest which yields errors like:
This obviously causes tests to fail in strange and surprising
ways. I've not been able to reproduce, but the issue seems to be with
how we mock the compactor.
We don't want to have to construct the whole compactor as we just want
to spy on it, so we get the current compactor, wrap it in spy and tell
the PulsarService instance to return it when getCompactor is called.
However, we are doing this after we have already called
PulsarService#start(). So there are threads already using the mock
structures. The mock structures themselves are not thread safe, so
modifying them while they are in use, is not a good idea.
The fix is to finish mocking before invoking #start(). Do do this,
I've broken out the Compactor construction method in PulsarService, so
that alone can be mocked to wrap the object in a spy.