fix injection failure of StorageLocationSelectorStrategy objects#10363
fix injection failure of StorageLocationSelectorStrategy objects#10363suneet-s merged 10 commits intoapache:masterfrom
Conversation
|
Tagged "Design Review" since it changes user-facing configurations. |
suneet-s
left a comment
There was a problem hiding this comment.
I have an alternate approach recommended that I think will be more robust.
There was a problem hiding this comment.
I think this approach is brittle and might break when a new implementation of StorageLocationSelectorStrategy is added. Instead, I think we should make List<StorageLocation> storageLocations injectable instead via a module, something like
@Provides
@Singleton
public List<StorageLocation> provideStorageLocations(SegmentLoaderConfig config)
{
this.locations = new ArrayList<>();
for (StorageLocationConfig locationConfig : config.getLocations()) {
locations.add(
new StorageLocation(
locationConfig.getPath(),
locationConfig.getMaxSize(),
locationConfig.getFreeSpacePercent()
)
);
return locations;
}
There was a problem hiding this comment.
Hi @suneet-s I check the code and find that this approach is a little bit complex for the existing code.
Storage selector strategy object is constructed during deserialization of SegmentLoaderConfig, and only after the construction of SegmentLoaderConfig, could it be possible to inject SegmentLoaderConfig to other objects to get list of StorageLocationConfig inside the object.
If we want to inject StorageLocation objects as the way you suggest, the strategy object must be separated from SegmentLoaderConfig into another new config class so that both SegmentLoaderConfig and this new config class can be injected into SegmentLocalCacheManager. There will be lots of test cases involved to change to meet the new ctor of SegmentLocalCacheManager. So I think these change involve more complexity.
Back to the concern you mentioned, I think there's no need to worry that new implementation would break the constraints. Because if it breaks, the problem would be easily detected during unit test or integrated test.
What do you think ?
There was a problem hiding this comment.
Is this a doc fix? I don't see an associated change in the code. Am I missing something?
There was a problem hiding this comment.
It's a fix. According to original code, annotation on StorageLocationSelectorStrategy indicates the name of json property is type. When using druid.segmentCache.locationSelectorStrategy, jackson tries to find ctor with a String parameter , which causes the issue.
|
Hi @jihoonson @suneet-s , here's a clarification for the change of user facing configuration. This PR has no code change with the configuration name, but a rectification of the doc from the wrong configuration item name of selector strategy named as When I first checked the issue, the doc also made me confused and took me sometime on the question which was the right name. Looking at the annotation of Based upon the code, the right configuration name should I don't know why the And the doc says the property name should be |
|
@jihoonson @suneet-s I come up with an idea that requires no change to the existing configuration.
The disadvantage is that all properties extended by implementations of What do u think ? |
|
@FrankChen021 I'll look through these changes over the next couple of days and get back to you. Thanks for the fix and your patience :) |
|
The latest CI reports that
I don't know how to handle it. Do you have any idea ? @suneet-s @asdf2014 |
This is because of a mismatch between the version used in Other CI failures look unrelated. I just restarted them. |
There was a problem hiding this comment.
I think the previous implementation is better since you can use the configuration name druid.segmentCache.locationSelectorStrategy without type. Is there a reason that locationSelectorStrategy cannot be here?
There was a problem hiding this comment.
This is because StorageLocationStrategy depends on SegmentLoaderConfig.locations. If StorageLocationStrategy is placed here, when SegmentLoaderConfig is being deserialized, locations required by StorageLocationStrategy can't be found and injected into it.
After moving this property out of SegmentLoaderConfig, both SegmentLoaderConfig and StorageLocationStrategy are deserialized during construction of SegmentLoaderLocalCacheManager, and jackson could find the locations objects required by strategy object through google guice framework to create strategy object correctly.
Using the configuration name without type I think is wrong. Please take a look at this configuration druid.coordinator.balancer.strategy, or the clarification on the change of configuration name I left above.
There was a problem hiding this comment.
Got it. Then, how about using a similar naming to the coordinator balancer? For example, we can bind StorageLocationSelectorStrategy to druid.segmentCache.locationSelector, and use a strategy property name for StorageLocationSelectorStrategy instead of type.
There was a problem hiding this comment.
Sorry for the late reply.
Your suggestion makes the naming more meaningful. I'll update this PR later this day.
jihoonson
left a comment
There was a problem hiding this comment.
LGTM. @FrankChen021 thank you!
|
@suneet-s do you have more comments? |
There was a problem hiding this comment.
Should this be druid.segmentCache.locationSelector instead?
There was a problem hiding this comment.
Oh sorry, this is a bug. I remember I tested it in my cluster, maybe I missed something then. The code has been updated in the latest commit.
There was a problem hiding this comment.
This binding is different from what actual module binds. I think this is why we missed the wrong binding. Can we use the same StorageNodeModule here? Or can we add a helper method which does the proper binding for both StorageNodeModule and tests?
There was a problem hiding this comment.
The reason why StorageModule is not used here is because there're some extra dependencies in StorageModule, which might introduce lots of injection code in test cases. A helper method is extracted to do the correct binding.
|
@FrankChen021 Can you describe how you tested this in your test cluster. What was the user-visible behavior before and after this change so that we can update the release notes if needed. I didn't see any integration tests added for this change, so it's possible that someone might break this behavior again in the future - would it be possible to add integration tests? |
|
Hi @suneet-s , in this PR a log is added in the ctor of
In this way, I know whether the configuration takes effect. As the test cases, I've added some unit test cases to test whether this configuration takes effect by setting |
|
Hi @jihoonson @suneet-s , CI reports that compaction integration test fails while all other checks are ok. I don't know why, could you help me take a look at it? |
@FrankChen021 It appears these tests are flaky. I re-triggered them and the job passed - could you file an issue for this flaky test please? |
jihoonson
left a comment
There was a problem hiding this comment.
LGTM. I tested the changed property working in my local machine.
|
Hi @suneet-s , do you have any further comments ? |
@FrankChen021 Sorry for the delay in this review. Since there are no integration tests, I'm trying to see what would happen on an upgrade, and whether or not this change would introduce a change in behavior. If you have integration tests that prove nothing has changed on the upgrade path, it would make this review a lot faster. Thanks for your patience |
|
@suneet-s Got it. I'm not familiar with IT. One thing I don't understand is what we should expect from the integration tests. To verify if the right type of selector object injected for corresponding configuration? |
@FrankChen021 I think the ITs should test that Druid uses the |
I see. It's an end-to-end test case. Usually it is tricky to set up an environment for such a case. I also studied some IT cases, I found that test cases send HTTP requests to nodes to verify whether they run successfully or not. But for historical nodes, there're no such interfaces exposed for us to observe the behavior of selector strategy. As you say, it's a little tricky. |
|
Hi @suneet-s , Is this PR ready to be merged ? or is there anything I need to do ? |
|
@FrankChen021 - I got pulled in to some other issues and haven't had the time to look at this yet. I will look at this over the weekend and get back to you. |
suneet-s
left a comment
There was a problem hiding this comment.
LGTM thanks for the tests and fix @FrankChen021 - sorry for the delay in reviewing this.
I'm happy to merge the fix as soon as the conflict is resolved.
….type to druid.segmentCache.locationSelector.strategy
46862d4 to
90dfa73
Compare
|
Hi @suneet-s The conflict has been resolved by rebasing the branch onto the latest master. But CI failed, I checked it, and it seems that it's not related to the changes in PR. Could you re-trigger it to see if it passes ? |
|
Thanks @FrankChen021 - looks like someone else beat me to the re-trigger. |
…che#10363) * fix to allow customer storage location selector strategy * add test cases to check instance of selector strategy * update doc * code format * resolve code review comments * inject StorageLocation * fix CI * fix mismatched license item reported by CI * change property path from druid.segmentCache.locationSelectorStrategy.type to druid.segmentCache.locationSelector.strategy * using a helper method to bind to correct property path
This PR fixes #10348 , which is caused by injection failure of storage selector strategy.
Description
Currently, all 4 implementations of
StorageLocationSelectorStrategyrequires a list ofStorageLocationobjects during construction. AndStorageLocationdiffers fromStorageLocationConfigdeserialized from configuration file, and the former is instantiated bySegmentLoaderLocalCacheManager. This also meansStorageLocationcould not be injected intoStorageLocationSelectorStrategywhen they are being constructed.In this PR,
the ctor of implementations of
StorageLocationSelectorStrategyare removed, instead, a setter of storage location method is provided in this interface so thatSegmentLoaderLocalCacheManagercould pass the objects to the strategy objectbased on the original code, configuration property should be
druid.segmentCache.locationSelectorStrategy.type, related docs are also updatedsome unit test cases are added to check whether these strategy objects are correctly instantiated from configuration properties.
This PR has: