feature-13324 contol loading lookups in peons#16266
feature-13324 contol loading lookups in peons#16266IgorBerman wants to merge 1 commit intoapache:masterfrom
Conversation
703dd32 to
7e83988
Compare
There was a problem hiding this comment.
@IgorBerman , thanks for your solution! It makes sense to optimize lookup loading, but we need to leave some room in this design for future improvements. Please read comments for details.
| ); | ||
| } | ||
|
|
||
| default boolean doesntNeedLookups() |
There was a problem hiding this comment.
The method should rather be called needsLookups() or loadLookups(). There is no point naming this method in the negative.
Also, since we want to extend this feature in the future to be able to load lookups on demand, I think we should return an enum here with possible values NONE, ALL, ON_DEMAND.
For the time-being, the behaviour for ALL and ON_DEMAND would be the same. But later, ON_DEMAND can be optimized to load only required lookups.
So, overall:
default LookupLoadingMode needsLookups()
{
return LookupLoadingMode.ALL;
}There was a problem hiding this comment.
thanks @kfaraz for the comments.
Makes sense to me.
I'll ping you when your suggestions will be addressed.
Also here is something that I've discovered lately and if you have some suggestions/ideas please share them: I've tried to start nano cluster with my changes and discovered that it's not that simple. TaskToolbox injects LookupNodeService, i.e. currently it's not possible to remove it without additional refactoring, so I'm trying to convert it to "Lazy" one by injecting Providing instead so that LookupSerdeModule could create this Provider as well
| } | ||
|
|
||
| if (task.doesntNeedLookups()) { | ||
| command.add("--doesntNeedLookups"); |
There was a problem hiding this comment.
The parameter should be called --loadLookups instead, and should take values all, none, onDemand.
4443e93 to
42fc8e3
Compare
42fc8e3 to
199e726
Compare
|
run nano cluster locally and verified that all services started, ingestion from kafka works and compaction also able to work. |
kfaraz
left a comment
There was a problem hiding this comment.
Thanks, @IgorBerman ! Overall, this looks like a neat approach to me. I have left some comments.
|
|
||
|
|
||
| /** | ||
| * If set to true then lookups won't be loaded |
There was a problem hiding this comment.
This javadoc needs to be updated.
| new InputSourceModule(), | ||
| new ChatHandlerServerModule(properties), | ||
| new LookupModule() | ||
| LookupLoadingMode.NONE == loadLookups ? new LookupSerdeModule() : new LookupModule() |
There was a problem hiding this comment.
Maybe put this logic in a private method getLookupModule() and add a javadoc there explaining what we are doing.
There was a problem hiding this comment.
A log line which indicates the chosen lookup loading mode and the expected effects would also be helpful.
There was a problem hiding this comment.
added log and private method. also added tests for it
| } | ||
|
|
||
| @Test | ||
| public void testTaskRun_doesnt_pass_loadLookups_Argument() throws Exception |
There was a problem hiding this comment.
The other tests in this class don't use the underscore style of naming tests. Please adhere to the existing style in this class if possible.
dc80221 to
fe1b2cb
Compare
|
@kfaraz Hi, I think I addressed all of your comments. Also hopefully fixed coverage. |
kfaraz
left a comment
There was a problem hiding this comment.
Thanks for promptly addressing the feedback! I have left a few more suggestions.
| Assert.assertTrue(peon.resolveLookupModule() instanceof LookupSerdeModule); | ||
| } | ||
|
|
||
| @Test(expected = IllegalArgumentException.class) |
There was a problem hiding this comment.
Please use Assert.assertThrows() instead.
There was a problem hiding this comment.
replaced with assertThrows
| { | ||
| JsonConfigProvider.bind(binder, LookupModule.PROPERTY_BASE, LookupConfig.class); | ||
| //Even though LookupSerdeModule will be used for processes that are not loading lookups, we need to bind LookupNodeService, | ||
| //so that TaskToolboxFactory and TaskToolbox will get their dependency. Binding it won't load lookups. |
There was a problem hiding this comment.
Hmm, I suppose we could have just made this dependency @Nullable in TaskToolboxFactory.
The only usage is to build a DruidDiscoveryNode where the LookupNodeService object is used to populate the services map.
From the point of service discovery, I think it would be more appropriate to just not add the lookup node service to the map rather than declaring a dummy node service.
There was a problem hiding this comment.
tried this and got following error for compaction task
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, see the following errors:
1) Could not find a suitable constructor in org.apache.druid.discovery.LookupNodeService. Classes must have either one (and only one) constructor annotated with @Inject or a zero-argument constructor that is not private.
at org.apache.druid.discovery.LookupNodeService.class(LookupNodeService.java:38)
while locating org.apache.druid.discovery.LookupNodeService
for the 26th parameter of org.apache.druid.indexing.common.TaskToolboxFactory.<init>(TaskToolboxFactory.java:165)
at org.apache.druid.cli.CliPeon.bindTaskConfigAndClients(CliPeon.java:506) (via modules: com.google.inject.util.Modules$OverrideModule -> com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)
1 error
at org.apache.druid.cli.CliPeon.run(CliPeon.java:421)
at org.apache.druid.cli.Main.main(Main.java:112)
Caused by: java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, see the following errors:
There was a problem hiding this comment.
I'm not too familiar with Guice(using spring boot at work) so might be there is other option to have Optional dependency?
There was a problem hiding this comment.
actually @nullable is the way to mark it optional...so I'm not sure why it doesn't work
There was a problem hiding this comment.
maybe in additional to @nullable need to add empty c'tor to LookupNodeService ?
There was a problem hiding this comment.
No problem, we can try to address this in a follow up.
b559004 to
92144e8
Compare
|
@kfaraz thanks for the review! I think I've addressed all your comments besides binding of Lookup service. Tried your idea but seems compaction peon fails with this approach. |
1de34f8 to
1355717
Compare
dd23158 to
57c765f
Compare
|
I see that 1 test failing: org.apache.druid.indexing.overlord.ForkingTaskRunnerTest#testInvalidTaskContextJavaOptsArray both of them updated at which doesn't happen when parsing of arguments failing here which throws exception and which is goes up to outer i.e. those static variables are not incremented as part of failure to parse arguments(maybe should be fixed) |
|
@zachjsh Hi, maybe you can advice regarding org.apache.druid.indexing.overlord.ForkingTaskRunnerTest#testInvalidTaskContextJavaOptsArray and updating FAILED_TASK_COUNT & SUCCESSFUL_TASK_COUNT? |
|
run same test on master branch and it's failing as well, which is strange |
Yeah, oddly enough, for me, the test fails only when I run it individually. When I run all the tests in this class from the IDE together, it passes. It is most likely because the failed/success task counts are static. A test that runs before this one must be setting the failed count to 1, which is being verified in this test too. |
|
@kfaraz can you tell me what I should do in addition for this PR to be merged? |
|
@IgorBerman , I have created #16323 to fix up the test for now so that the builds pass. Later, we can revisit the code and decide if we want to increment the failed task count in the case where the javaOptsArray is invalid. |
|
thanks @kfaraz , I left 1 comment there |
|
@IgorBerman , I have merged #16323 . You can try to merge the latest master into your branch to fix the errors. |
57c765f to
a120dfe
Compare
|
thanks @kfaraz , updated my branch with upstream |
|
@IgorBerman , FYI, there is another PR #16328 which tackles the same problem as this one. Let me know what you think about it. |
|
@kfaraz thanks :) I looked at it and it seems nice approach. I think the other PR is more generic since it permits to control which lookups will be loaded. I haven't understood if it supports configuring it as some context arguments for task. In our case it's always all or nothing, but I can imagine more involved environments. Additional diff that I can think of is that if we are not going to load lookups current PR uses different guice module which might create lighter runtime(i.e. less beans e.g., but I'm not sure this is the case since I'm not too familiar with differences of those 2 modules really). |
The author intends to create that as a follow up PR.
Thanks, @IgorBerman . |
Fixes #13324
Permits to disable lookups loading for some types of peons
for example compaction or kill tasks
this should improve memory footprint
e.g. for compaction we can use this memory for loading more rows into memory and thus improving throughput of compaction
I've followed same path as loadBroadcastSegments parameter works and reused LookupSerdeModule which already used in multiple services that doesn't need to load lookups
The fix adds doesntNeedLookups method to interface of task with default of false
Release note
Key changed/added classes in this PR
Task- added doesntNeedLookups with default implementation that returns false, i.e. most of peons will continue to load lookupsCompactionTask- overrides it with trueCliPeon- refers to the process parameter and loads relevant lookup module(which either loads lookups or not)This PR has: