Replace Processing ExecutorService with QueryProcessingPool#11382
Replace Processing ExecutorService with QueryProcessingPool#11382abhishekagarwal87 merged 7 commits intoapache:masterfrom
Conversation
aa9be88 to
9440e79
Compare
| /** | ||
| * @return - Returns this pool as an executor service that can be used for other asynchronous operations. | ||
| */ | ||
| ListeningExecutorService asExecutorService(); |
There was a problem hiding this comment.
I think this method leaks the internals of the interface. For instance, some caller may call this method and then pass the executor in other parts of code unintentionally. Also, if someone has to implement a QueryProcessingPool which is composite and contains multiple pools inside it, it would become hard to implement this interface.
Would it be better to rather have the original ExecutorService as is, and then inject that executor service to the DefaultProcessingPool? The ExecutorService interface is richer and common. QueryProcessingPool can be used in cases where a PrioritizedQueryRunnerCallable submission is needed.
There was a problem hiding this comment.
something like this?
@ExtensionPoint
public interface QueryProcessingPool extends ListeningExecutorService
{
/**
* Submits the query execution unit task for asynchronous execution.
*
* @param task - Task to be submitted.
* @param <T> - Task result type
* @param <V> - Query runner sequence type
* @return - Future object for tracking the task completion.
*/
<T, V> ListenableFuture<T> submitQueryExecution(PrioritizedQueryRunnerCallable<T, V> task);
}
There was a problem hiding this comment.
Yes, I was thinking like the snippet you mentioned above. (probably the interface can be called something like QueryRunnerProcessingPool since it only allows submit for QueryRunners currently - the submitQueryExecution can also become submit).
If this interface looks ok, then the DefaultProcessingPool's constructor signature can be : DefaultProcessingPool(@Processing ExecutorService)
There was a problem hiding this comment.
I assume you want to make QueryProcessingPool compatible with ExecutorService because of ConcurrentGrouper?
There was a problem hiding this comment.
Just clarifying my stance here, @rohangarg's idea sounds good to me, but I don't have strong preference here.
There was a problem hiding this comment.
Sounds good. I do want to keep the method separate though. Either that or PrioritizedQueryRunnerCallable<T, V> task should not extend Callable. My reasoning is that then Implementations don't have to do instance of checks for differentiating between query execution tasks and other async tasks.
Yes
I assume you want to make QueryProcessingPool compatible with ExecutorService because of ConcurrentGrouper?
There was a problem hiding this comment.
@rohangarg @jon-wei @jihoonson What do you think?
There was a problem hiding this comment.
LGTM
maybe the naming could be more generic since this would be used in both ingestion and querying layer both.
There was a problem hiding this comment.
On real-time nodes too, the pool is used in query execution.
| public @interface Processing | ||
| public interface PrioritizedQueryRunnerCallable<T, V> extends PrioritizedCallable<T> | ||
| { | ||
| QueryRunner<V> getRunner(); |
There was a problem hiding this comment.
How will this method be used? I only see it used in a test in this PR.
There was a problem hiding this comment.
This method can be used by the extensions to get the runner that the given query execution task corresponds to. That in turn can be used to fetch any state associated with the QueryRunner such as the segment info for example.
There was a problem hiding this comment.
I see. Can you add it in the javadoc of this method?
There was a problem hiding this comment.
sure thing.
| public @interface Processing | ||
| public interface PrioritizedQueryRunnerCallable<T, V> extends PrioritizedCallable<T> | ||
| { | ||
| QueryRunner<V> getRunner(); |
There was a problem hiding this comment.
I see. Can you add it in the javadoc of this method?
| package org.apache.druid.query; | ||
|
|
||
| /** | ||
| * An implementation of {@link PrioritizedCallable} that also let's caller get access to associated {@link QueryRunner} |
There was a problem hiding this comment.
| * An implementation of {@link PrioritizedCallable} that also let's caller get access to associated {@link QueryRunner} | |
| * An implementation of {@link PrioritizedCallable} that also lets caller get access to associated {@link QueryRunner} |
| * @param <T> | ||
| * @param <V> |
There was a problem hiding this comment.
Can you please complete the javadoc for the parameters?
There was a problem hiding this comment.
done in next patch.
| /** | ||
| * @return - Returns this pool as an executor service that can be used for other asynchronous operations. | ||
| */ | ||
| ListeningExecutorService asExecutorService(); |
There was a problem hiding this comment.
I assume you want to make QueryProcessingPool compatible with ExecutorService because of ConcurrentGrouper?
jihoonson
left a comment
There was a problem hiding this comment.
I reviewed only the design of new interfaces, QueryProcessingPool and PrioritizedQueryRunnerCallable. Their design LGTM.
| package org.apache.druid.query; | ||
|
|
||
| /** | ||
| * An implementation of {@link PrioritizedCallable} that also let's caller get access to associated {@link QueryRunner} |
There was a problem hiding this comment.
| * An implementation of {@link PrioritizedCallable} that also let's caller get access to associated {@link QueryRunner} | |
| * An implementation of {@link PrioritizedCallable} that also lets caller get access to associated {@link QueryRunner} |
| /** | ||
| * @return - Returns this pool as an executor service that can be used for other asynchronous operations. | ||
| */ | ||
| ListeningExecutorService asExecutorService(); |
There was a problem hiding this comment.
I think making the processing a pool a kind of executor service makes sense, let's go with that if you agree
…1382) This PR refactors the code for QueryRunnerFactory#mergeRunners to accept a new interface called QueryProcessingPool instead of ExecutorService for concurrent execution of query runners. This interface will let custom extensions inject their own implementation for deciding which query-runner to prioritize first. The default implementation is the same as today that takes the priority of query into account. QueryProcessingPool can also be used as a regular executor service. It has a dedicated method for accepting query execution work so implementations can differentiate between regular async tasks and query execution tasks. This dedicated method also passes the QueryRunner object as part of the task information. This hook will let custom extensions carry any state from QuerySegmentWalker to QueryProcessingPool#mergeRunners which is not possible currently.
Description
This PR refactors the code for
QueryRunnerFactory#mergeRunnersto accept a new interface calledQueryProcessingPoolinstead ofExecutorServicefor concurrent execution of query runners. This interface will let custom extensions inject their own implementation for deciding which query-runner to prioritize first. The default implementation is the same as today that takes the priority of query into account.QueryProcessingPoolcan also be used as a regular executor service. It has a dedicated method for accepting query execution work so implementations can differentiate between regular async tasks and query execution tasks. This dedicated method also passes theQueryRunnerobject as part of the task information. This hook will let custom extensions carry any state fromQuerySegmentWalkertoQueryProcessingPool#mergeRunnerswhich is not possible currently.Key changed/added classes in this PR
QueryProcessingPoolQueryRunnerFactoryForwardingQueryProcessingPoolDirectQueryProcessingPoolPrioritizedQueryRunnerCallableAbstractPrioritizedQueryRunnerCallableThis PR has: