Skip to content

More efficient generation of ImmutableWorkerHolder from WorkerHolder.#14546

Merged
gianm merged 8 commits intoapache:masterfrom
gianm:faster-immutable-workers
Jul 13, 2023
Merged

More efficient generation of ImmutableWorkerHolder from WorkerHolder.#14546
gianm merged 8 commits intoapache:masterfrom
gianm:faster-immutable-workers

Conversation

@gianm
Copy link
Copy Markdown
Contributor

@gianm gianm commented Jul 7, 2023

Taking the work done in #12096 a little further:

  1. Applying a similar optimization to WorkerHolder (HttpRemoteTaskRunner).
    The original patch only helped with the ZkWorker (RemoteTaskRunner).

  2. Improve the ZkWorker version somewhat by avoiding multiple iterations
    through the task announcements map.

Taking the work done in apache#12096 a little further:

1) Applying a similar optimization to WorkerHolder (HttpRemoteTaskRunner).
   The original patch only helped with the ZkWorker (RemoteTaskRunner).

2) Improve the ZkWorker version somewhat by avoiding multiple iterations
   through the task announcements map.
for (final Map.Entry<String, TaskAnnouncement> entry : announcements.entrySet()) {
final TaskAnnouncement announcement = entry.getValue();

if (announcement.getStatus().isRunnable()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query: WorkerHolder was already doing this filtering but ZkWorkerHolder seems to be using all the announcements (unless the filtering happens there at some prior stage). Was it performing incorrect computations of capacity and the other parameters?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I didn't notice this behavioral difference til now. I think the filtering on isRunnable is the correct thing to do: tasks don't use slots when they aren't running, so they shouldn't be counted.

With this change, some RemoteTaskRunnerTest cases need to be adjusted, but I think that's OK. Note that in most cases, tasks get removed from the announcement set pretty quickly after they complete, so I don't expect a big practical difference in prod.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, filtering here makes sense.

@gianm gianm merged commit 450ecd6 into apache:master Jul 13, 2023
@gianm gianm deleted the faster-immutable-workers branch July 13, 2023 14:57
sergioferragut pushed a commit to sergioferragut/druid that referenced this pull request Jul 21, 2023
…apache#14546)

* More efficient generation of ImmutableWorkerHolder from WorkerHolder.

Taking the work done in apache#12096 a little further:

1) Applying a similar optimization to WorkerHolder (HttpRemoteTaskRunner).
   The original patch only helped with the ZkWorker (RemoteTaskRunner).

2) Improve the ZkWorker version somewhat by avoiding multiple iterations
   through the task announcements map.

* Pick better names and use better logic.

* Only runnable tasks.

* Fix test.

* Fix testBlacklistZKWorkers50Percent.
@LakshSingla LakshSingla added this to the 28.0 milestone Oct 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants