Conversation

@westonpace
Member

@westonpace westonpace commented May 6, 2021

Previously, a future's callbacks would always run synchronously, either as part of Future::MarkFinished or as part of Future::AddCallback. Executor::Transfer made it possible to schedule continuations on a new thread, but it only took effect if the transferred future's callbacks were added before the source future finished. Sometimes the desired behavior is to spawn a new thread task even if the source future has already finished.

This PR adds three scheduling options:

  • Never - The default (and existing) behavior, never spawn a new task
  • IfUnfinished - Spawn a new task only if the future isn't already finished when the callback is added
  • Always - Always spawn a new task, on both finished and unfinished futures, regardless of destination thread pool idleness.

The Never option doesn't make sense for transferring, so the transfer only has two choices (Always or IfUnfinished).
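
As a rough illustration of the three options described above, here is a minimal single-threaded sketch. It is not Arrow's implementation: `ToyFuture`, `QueueExecutor`, and `ShouldSchedule` are hypothetical names invented for this example, the "executor" only queues tasks instead of running them on another thread, and there is no locking.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is a sketch, not Arrow's API.
enum class ShouldSchedule { Never, IfUnfinished, Always };

// Stand-in executor: queues tasks so we can observe what was "spawned".
struct QueueExecutor {
  std::vector<std::function<void()>> tasks;

  void Spawn(std::function<void()> task) { tasks.push_back(std::move(task)); }

  void RunAll() {
    std::vector<std::function<void()>> pending;
    pending.swap(tasks);
    for (auto& task : pending) task();
  }
};

// Toy future with no result value and no thread safety.
class ToyFuture {
 public:
  // `executor` must be non-null unless `opt` is Never.
  void AddCallback(std::function<void()> cb,
                   ShouldSchedule opt = ShouldSchedule::Never,
                   QueueExecutor* executor = nullptr) {
    if (finished_) {
      // Already finished: only Always spawns a new task.
      if (opt == ShouldSchedule::Always) {
        executor->Spawn(std::move(cb));
      } else {
        cb();
      }
      return;
    }
    callbacks_.push_back({std::move(cb), opt, executor});
  }

  void MarkFinished() {
    finished_ = true;
    std::vector<Record> pending;
    pending.swap(callbacks_);
    for (auto& rec : pending) {
      // Added while unfinished: IfUnfinished and Always both spawn.
      if (rec.opt == ShouldSchedule::Never) {
        rec.cb();
      } else {
        rec.executor->Spawn(std::move(rec.cb));
      }
    }
  }

 private:
  struct Record {
    std::function<void()> cb;
    ShouldSchedule opt;
    QueueExecutor* executor;
  };

  bool finished_ = false;
  std::vector<Record> callbacks_;
};
```

In this sketch, adding an IfUnfinished callback to an already finished `ToyFuture` runs it inline, while an Always callback lands in the executor's queue and only runs once the executor gets to it.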

@github-actions

github-actions bot commented May 6, 2021

@westonpace
Member Author

CC @pitrou I think you referenced this in your latest execution engine PR.

@westonpace westonpace marked this pull request as draft May 6, 2021 09:35
@pitrou
Member

pitrou commented May 10, 2021

I'm not sure why you're suggesting to add so much sophistication. To me there are only two interesting options: "always" and "if unfinished". So we could have Transfer (transfer always) vs. TransferUnfinished.

@westonpace westonpace force-pushed the feature/ARROW-12560--c-investigate-utilizing-aggressive-thread-task branch from c95244a to e992bb6 May 24, 2021 22:05
@westonpace
Member Author

Since I'm working on work stealing at the thread pool level I agree that idle is no longer needed. I've cleaned this up and rebased. It's much simpler than it was before.

@westonpace westonpace marked this pull request as ready for review May 25, 2021 00:33
@westonpace
Member Author

Also, I ran into a bit of trouble with the future callback's weak reference to the future. Before, we could just assume it was valid, since all callbacks were completed before MarkFinished completed. Now it is possible for a future to schedule a callback and for that callback to far outlive the call to MarkFinished. So when a callback is scheduled (run on an executor), we make a copy of the FutureImpl's shared_ptr to keep it alive until that callback has had a chance to run.
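
The lifetime extension described above can be sketched roughly as follows. This is only an illustration under assumptions: `ToyFutureImpl`, `ScheduleCallback`, and `QueueExecutor` are hypothetical names, and the executor simply queues tasks rather than running them on another thread. The point is that the scheduled task captures a strong `shared_ptr` obtained from `shared_from_this`, so the impl stays alive until the task has run.

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in executor that queues tasks for later execution.
struct QueueExecutor {
  std::vector<std::function<void()>> tasks;

  void Spawn(std::function<void()> task) { tasks.push_back(std::move(task)); }

  void RunAll() {
    std::vector<std::function<void()>> pending;
    pending.swap(tasks);
    for (auto& task : pending) task();
  }
};

// Hypothetical future implementation; must be owned by a shared_ptr.
class ToyFutureImpl : public std::enable_shared_from_this<ToyFutureImpl> {
 public:
  int result = 0;

  void ScheduleCallback(QueueExecutor* executor,
                        std::function<void(const ToyFutureImpl&)> cb) {
    // Intentionally extend our own lifetime: the queued task holds a strong
    // reference, so *self stays valid even if every other owner goes away
    // before the executor runs the task.
    std::shared_ptr<ToyFutureImpl> self = shared_from_this();
    executor->Spawn([self, cb = std::move(cb)] { cb(*self); });
  }
};
```

If the last external owner drops its reference before the executor runs the task, the callback in this sketch still sees a valid object, and the object is destroyed once the task itself has been destroyed.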

Member

@pitrou pitrou left a comment

Sorry for the delay. This looks good to me in principle.

Member

nullptr

Member Author

Fixed.

Member

Why "a copy"? It's not clear to me where a copy is being made.

Member Author

The copy is a few lines down when we call shared_from_this. I'll move the comment and make it more explicit.

Member

If it's only the shared_ptr copy, then I'm not sure it's worth mentioning.

Member Author

I think the more important thing is that we are intentionally extending the lifetime of the future. I reworded the comment a bit and dropped the "copy". I can always remove it if we want.

Member

The coding conventions prohibit passing mutable lrefs. You could make this a CallbackRecord&&, for example.

Member Author

Done.
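
For readers unfamiliar with the convention discussed above, here is a small sketch of the rvalue-reference alternative. The `CallbackRecord` struct and the `RunOrQueue` function are hypothetical stand-ins, not the code under review; they only show how a `CallbackRecord&&` parameter makes the ownership transfer visible at the call site, where a mutable lvalue reference would hide it.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the record type mentioned in the review.
struct CallbackRecord {
  std::function<void()> callback;
};

// Taking CallbackRecord&& forces callers to write std::move(record) or pass
// a temporary, so it is obvious that the record may be consumed.
// The output parameter is a pointer, following the same convention.
inline void RunOrQueue(CallbackRecord&& record, bool finished,
                       std::vector<CallbackRecord>* queue) {
  if (finished) {
    record.callback();
  } else {
    queue->push_back(std::move(record));
  }
}
```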

Member

We should avoid using ALL_CAPS names, because of potential clashes with macros (this is a common issue with Windows headers, unfortunately).

Member Author

Hmm, technically the style guide prefers kAlways (https://google.github.io/styleguide/cppguide.html#Enumerator_Names), but I see Always used more often in Arrow, although some of the Gandiva code uses kAlways. Any preference?

Member

Always sounds fine to me.

Member Author

Ok. I'll make a PR to add this to the style guide docs as well.

Member

It's a bit weird to have this in a private test file, and the mock executor in a .h.

Member Author

My rationale was simply that DelayedExecutor is only used in future_test.cc, while MockExecutor is used in both future_test.cc and thread_pool_test.cc, but I see your point. I'll move this into test_common.h.

Member

NEVER is never tested?

Member Author

Added a test.

Member

Nit, but it would probably be nicer to be able to spell this as TransferAlways(fut).

Member Author

Done.

@westonpace
Member Author

@pitrou Don't worry about the delay, I've been plenty busy elsewhere. I have just a few follow-up questions and then I'll make the changes.

@westonpace westonpace force-pushed the feature/ARROW-12560--c-investigate-utilizing-aggressive-thread-task branch 3 times, most recently from c038210 to a4bcf0f June 4, 2021 22:41
@westonpace
Member Author

Ok, I've addressed the comments and this is ready for review again.

@westonpace westonpace requested a review from pitrou June 5, 2021 02:37
@pitrou pitrou changed the title from "ARROW-12560: [C++] Investigate utilizing aggressive thread task creation when adding callback to finished future" to "ARROW-12560: [C++] Add scheduling option for Future callbacks" Jun 7, 2021
@pitrou pitrou force-pushed the feature/ARROW-12560--c-investigate-utilizing-aggressive-thread-task branch from be5aa79 to 747c498 June 7, 2021 13:30
@pitrou
Member

pitrou commented Jun 7, 2021

Thanks for the update @westonpace . I'll merge once CI passes.

@pitrou pitrou closed this in e7b6c4a Jun 7, 2021
@westonpace westonpace deleted the feature/ARROW-12560--c-investigate-utilizing-aggressive-thread-task branch January 6, 2022 08:17