[MRESOLVER-283] Shared executor service#213
@caiwei-ebay ping |
|
The PR looks good to me except for a few concerns noted in the comments. Thanks for the quick fix. Since the special param "aether.dependencyCollector.bf.threads" has to be removed when using SharedExecutor, could you add a few lines explaining the parallel download behavior to the param "aether.dependencyCollector.impl", where you introduced BF and DF? With that information, users may try BF and give feedback, so we can discuss whether BF could become the default based on that feedback. |
| { | ||
| command.run(); | ||
| } | ||
|
|
There was a problem hiding this comment.
Given that this is being deleted, I think it may introduce additional cost for DF.
DF downloads POMs one by one. Originally each POM was downloaded directly by the main thread; now it is downloaded in a pool thread while the main thread has to wait.
There was a problem hiding this comment.
Reintroduced the direct strategy as well; two out of three users of the resolver executor now rely on it.
| { | ||
| results.computeIfAbsent( ArtifactIdUtils.toId( artifact ), | ||
| key -> this.executorService.submit( callable ) ); | ||
| key -> this.executor.submit( callable ) ); |
There was a problem hiding this comment.
In the illustration below:
We submit a Callable here, and this callable eventually calls RepositoryConnector, which then submits a Collection (containing a single artifact) to the executor. So two different kinds of tasks run in one executor, and one of them depends on the other.
The original behavior is:
We submit a Callable here, and this callable eventually calls RepositoryConnector, which then submits a Collection (containing a single artifact) to the executor, but in this case the executor is a DirectExecutor, which runs the task in the calling thread. Please check the code below.
if ( maxThreads <= 1 )
{
return DirectExecutor.INSTANCE;
}
int tasks = safe( artifacts ).size() + safe( metadatas ).size();
if ( tasks <= 1 )
{
return DirectExecutor.INSTANCE;
}
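To make the point concrete, here is a minimal, self-contained sketch of that selection logic. It is not the resolver's actual code: `ExecutorSelectionSketch` and `select` are hypothetical names, and `DirectExecutor` is re-declared locally only so the example compiles on its own.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutorSelectionSketch {
    /** Local stand-in: runs each task synchronously on the calling thread. */
    enum DirectExecutor implements Executor {
        INSTANCE;

        @Override
        public void execute(Runnable command) {
            command.run();
        }
    }

    /** Hypothetical helper that mirrors the two checks quoted above. */
    static Executor select(int maxThreads, int tasks, ExecutorService pool) {
        if (maxThreads <= 1 || tasks <= 1) {
            return DirectExecutor.INSTANCE;
        }
        return pool;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            String caller = Thread.currentThread().getName();
            // A single task never reaches the pool, so it cannot wait for a pool thread.
            select(5, 1, pool).execute(() -> System.out.println(
                    "same thread: " + caller.equals(Thread.currentThread().getName())));
            // prints: same thread: true
        } finally {
            pool.shutdown();
        }
    }
}
```

With a single download the nested call stays on the thread that is already running the outer Callable, so it never competes for a second pool thread.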
There was a problem hiding this comment.
Hopefully I restored the original behaviour, please recheck (latest commit)
| @Override | ||
| public void submitOrDirect( Collection<Runnable> tasks ) | ||
| { | ||
| requireNonNull( tasks ); |
There was a problem hiding this comment.
When resolving a snapshot dependency or a dependency with a version range, Maven resolves metadata.xml from multiple repositories in parallel: https://github.com/apache/maven/blob/master/maven-resolver-provider/src/main/java/org/apache/maven/repository/internal/DefaultVersionRangeResolver.java#L158
With the original design, metadata.xml uses a separate pool, while POMs and JARs can use RepositoryConnector's pool; POM downloading is serial while JAR downloading is parallel:
- metadata.xml uses a separate pool to download metadata.xml from multiple repositories. Each download calls RepositoryConnector's get(null, Collections.singletonList(MetadataDownload)), so it actually uses DirectExecutor and does not use RepositoryConnector's thread pool;
- POM downloading directly calls RepositoryConnector's get(Collections.singletonList(ArtifactDownload), null), so it also uses DirectExecutor and does not use RepositoryConnector's thread pool;
- JAR downloading calls RepositoryConnector's get(List(ArtifactDownload), null), so it does use RepositoryConnector's thread pool.
With this approach, metadata, POM and JAR downloads all share one pool:
Suppose we have multiple repositories such as releases, snapshots, commercial, legacy, testing, etc. in our private Maven Nexus repository, and we need to resolve 5 dependencies (snapshots or with version ranges) in parallel.
When resolving with BFDependencyCollector, the 5 tasks that resolve each dependency use up all 5 threads. Because resolving each dependency (call it task A) needs to resolve metadata (task B), task B is also submitted to the same pool. A depends on B, but B has to wait for an available thread. I think this could lead to endless waiting, right?
Since it is better to use separate thread pools for separate kinds of tasks, maybe the SharedExecutor here should have a separate pool per kind of task? At the very least, POM/JAR downloading and BF collection (which also needs to download POMs, but POM downloading actually uses directlyExecute with your recent fix, so it is safe to use the same pool) could use one pool while metadata.xml downloading still uses another. Please advise.
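The A-waits-for-B scenario can be reproduced outside the resolver with a plain fixed pool. The sketch below is only an illustration under assumptions: `PoolExhaustionSketch` and `run` are hypothetical names, the tasks do no real downloading, and a timeout is used so the demo terminates instead of hanging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolExhaustionSketch {
    /** Returns how many outer tasks gave up waiting for their nested task. */
    static int run(int threads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch allStarted = new CountDownLatch(threads);
        List<Future<Boolean>> outer = new ArrayList<>();
        try {
            for (int i = 0; i < threads; i++) {
                // Task A: stands in for "resolve one dependency".
                outer.add(pool.submit(() -> {
                    allStarted.countDown();
                    allStarted.await(); // every pool thread is now occupied by a task A
                    // Task B: stands in for "resolve metadata", queued on the same pool.
                    Future<String> inner = pool.submit(() -> "metadata");
                    try {
                        inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                        return false;
                    } catch (TimeoutException e) {
                        inner.cancel(false);
                        return true; // B got no thread while A was blocking one
                    }
                }));
            }
            int starved = 0;
            for (Future<Boolean> f : outer) {
                if (f.get()) {
                    starved++;
                }
            }
            return starved;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // At least one task A times out; without the timeout it would wait forever.
        System.out.println("starved tasks: " + run(5, 500));
    }
}
```

No task B can start until some task A gives up its thread, so at least one task A always hits the timeout. With an untimed `get()` the pool would never make progress.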
There was a problem hiding this comment.
I'd also like to add: as I understand it, the artifact thread pools were per repo, and now there is one for all repos?
I think we need an overview of which executor services were created previously and which are created now, to compare whether this could create a bottleneck.
There was a problem hiding this comment.
Bottleneck... or deadlock caused by thread exhaustion.
|
Reworked to:
|
caiwei-ebay
left a comment
There was a problem hiding this comment.
The PR looks great to me. Thanks for tidying up the design.
|
Will try to review today |
| List<ChecksumAlgorithmFactory> checksumAlgorithmFactories = layout.getChecksumAlgorithmFactories(); | ||
| Collection<? extends MetadataDownload> mds = safe( metadataDownloads ); | ||
| Collection<? extends ArtifactDownload> ads = safe( artifactDownloads ); | ||
| ArrayList<Runnable> runnable = new ArrayList<>( mds.size() + ads.size() ); |
| this.providedChecksumsSources = providedChecksumsSources; | ||
| this.resolverExecutor = resolverExecutorService.getResolverExecutor( session, RepositoryConnector.class, | ||
| ConfigUtils.getInteger( session, CONFIG_PROP_THREADS_DEFAULT, CONFIG_PROP_THREADS, | ||
| "maven.artifact.threads" ) ); |
There was a problem hiding this comment.
We should also deprecate this property since it does not reflect reality. Metadata is not an artifact, but this connector handles both.
There was a problem hiding this comment.
Maybe we should just remove it? We are going for 1.9 after all....
| Executor executor = getExecutor( Math.min( tasks.size(), threads ) ); | ||
| try | ||
| RunnableErrorForwarder errorForwarder = new RunnableErrorForwarder(); | ||
| ArrayList<Runnable> runnable = new ArrayList<>( tasks.size() ); |
gnodet
left a comment
There was a problem hiding this comment.
Sharing executors must be done with caution.
Is there any way the executor could run out of threads to service a call? For example, resolving multiple files in parallel, each file being resolved in its own thread, then downloading dependencies and running out of threads?
I think using a more general ForkJoinPool which supports work stealing would be safer to avoid running into such problems.
More and more components in resolver do parallel processing (BF collector, MD resolver, basic connector), and they all create and maintain their own executor instance. Instead of this, create one shared service component and just reuse it across resolver. https://issues.apache.org/jira/browse/MRESOLVER-283
I kinda agree with @gnodet here, and will "tone down" this PR to NOT SHARE executors; it will merely introduce this facade, but instances will be handled as before (even now it is not 100% what it was before: the collector had a per arg instance, now it is shared). And I certainly cannot answer the question "Is there any way that the executor may be running out of threads to service a call". OTOH, I disagree with ForkJoinPool: all these threads will do (probably blocking, maybe infinite) HTTP IO, not to mention the refactoring needed to use them (most of the "client" code remained unchanged here, using plain Runnable and Callable). Hence, this PR will change to merely "centralize executor creation and (probably) use the ResolverExecutor shim instead of directly exposing the ExecutorService iface"... |
I think a single thread pool is a good idea, but it has to be implemented correctly to avoid any thread exhaustion, i.e. use a single fork/join pool and use the appropriate methods to dispatch all the work, which should ensure proper ordering of task execution. I've done a similar thing in apache/maven#803 (which I need to finish). |
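As a rough illustration of that idea (not the code from apache/maven#803, and with hypothetical class names), the sketch below runs dependent tasks on one `ForkJoinPool`. The outer task forks its nested task and joins it, so a worker that joins can run the nested task itself instead of parking while it sits in a queue. The blocking HTTP IO concern raised earlier in this thread is not addressed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSketch {
    /** Inner work, standing in for a nested metadata download. */
    static class Inner extends RecursiveTask<Integer> {
        private final int id;

        Inner(int id) {
            this.id = id;
        }

        @Override
        protected Integer compute() {
            return id * 10;
        }
    }

    /** Outer work that depends on an inner task dispatched to the same pool. */
    static class Outer extends RecursiveTask<Integer> {
        private final int id;

        Outer(int id) {
            this.id = id;
        }

        @Override
        protected Integer compute() {
            Inner inner = new Inner(id);
            inner.fork();            // pushed onto the current worker's queue
            return inner.join() + 1; // the joining worker may execute it itself
        }
    }

    /** Submits the given number of outer tasks and sums their results. */
    static int run(int parallelism, int tasks) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            List<ForkJoinTask<Integer>> outer = new ArrayList<>();
            for (int i = 1; i <= tasks; i++) {
                outer.add(pool.submit(new Outer(i)));
            }
            int sum = 0;
            for (ForkJoinTask<Integer> task : outer) {
                sum += task.join();
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Five dependent task pairs complete even with fewer workers than tasks.
        System.out.println(run(2, 5)); // prints 155
    }
}
```

The same five task pairs complete even with a parallelism of 1, which is the property a fixed pool with blocking `Future.get()` calls does not have.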
|
Ok, now I feel I should not push this anymore, as I agree now with use of forkJoinPool. Hence, will close this PR, as that change would be bigger, but I want 1.9.0 out as soon as possible.... |
|
Resolve #959 |