@tgroh tgroh commented Mar 22, 2016

Scheduling known-empty bundles can consume excessive resources,
especially with the default CachedThreadPool executor service.

Also removes an unnecessary synchronized block (the map is already
thread-safe).
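
For illustration, a minimal sketch of why a thread-safe map needs no surrounding synchronized block for single operations. It assumes the map is a `ConcurrentHashMap`, which the description does not state; `PendingBundles`, `record`, and `count` are hypothetical names, not taken from the PR.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical holder; not a class from the PR. */
class PendingBundles {
  // Assumption: the "already thread-safe" map behaves like a ConcurrentHashMap.
  private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

  /** merge() is atomic on ConcurrentHashMap, so no synchronized block is needed here. */
  void record(String key) {
    counts.merge(key, 1, Integer::sum);
  }

  /** Reads are safe without locking as well. */
  int count(String key) {
    return counts.getOrDefault(key, 0);
  }
}
```

A sequence of several dependent calls on the map can still need coordination; only each individual operation is atomic.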

tgroh commented Mar 22, 2016

R: @bjchambers

}
fireTimers();
mightNeedMoreWork();
if (!fireTimers()) {
Contributor

Add a comment here, since it's not immediately clear what "fireTimers" returns. E.g.:

if (!fireTimers()) {
  // If any timers fired, then we may have more work.
  ...
}

(Or is it if there are unfired timers? etc.)

Member Author

Done.

tgroh added 2 commits March 22, 2016 14:01
Add documentation around completion and maybe adding more work.

Refactor mightNeedMoreWork to move the "can't make progress" check
into a helper function

Rename mightNeedMoreWork -> addWorkIfNecessary
Comment on why we might decide that work can still be done.

TODO: test many many times
mightNeedMoreWork();
if (!fireTimers()) {
// If any timers have fired, they will add more work; We don't need to add more
addWorkIfNecessary();
Contributor

Now that I think about this, I'd propose:

boolean timersFired = fireTimers();
addWorkIfNecessary(timersFired);

This replaces the comment on what fireTimers() returns with a named variable, and it moves all the logic for deciding whether more work is needed into addWorkIfNecessary.
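
A minimal sketch of how that proposal could fit together, assuming (per the code comment in the diff) that fireTimers() returns true when at least one timer fired. `MonitorSketch`, `addTimer`, `runOnce`, and the work counter are hypothetical stand-ins, and the real addWorkIfNecessary also has to check whether progress can still be made.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-in for the executor's monitor; not the actual class in the PR. */
class MonitorSketch {
  private final Queue<String> pendingTimers = new ArrayDeque<>();
  private int workAdded = 0;

  void addTimer(String timer) {
    pendingTimers.add(timer);
  }

  /** Fires every pending timer and returns true if at least one timer fired. */
  boolean fireTimers() {
    boolean fired = !pendingTimers.isEmpty();
    pendingTimers.clear();
    return fired;
  }

  /** Fired timers schedule their own work, so extra work is added only when none fired. */
  void addWorkIfNecessary(boolean timersFired) {
    if (timersFired) {
      return;
    }
    workAdded++; // Simplification: stands in for scheduling more bundles.
  }

  /** One pass of the loop body, in the shape proposed in the review comment. */
  void runOnce() {
    boolean timersFired = fireTimers();
    addWorkIfNecessary(timersFired);
  }

  int workAdded() {
    return workAdded;
  }
}
```

Naming the result makes the call site self-explanatory, and the decision about adding work lives in one method.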

Member Author

Done.

tgroh added 2 commits March 23, 2016 13:10
Slightly cleaner method calls
add tests for isDone.

containsUnboundedPCollection should look at outputs rather than inputs
if (!watermarkManager
.getWatermarks(transform)
.getOutputWatermark()
.equals(BoundedWindow.TIMESTAMP_MAX_VALUE)) {
Contributor

Due to the risk of an NPE, it's generally better to do:
CONSTANT.equals(expression)
rather than
expression.equals(CONSTANT)
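
As an illustration of the difference, a small sketch using a String constant; `NullSafeEquals`, `MAX`, and both method names are hypothetical and unrelated to the watermark code under review.

```java
/** Hypothetical example class; not part of the PR. */
class NullSafeEquals {
  static final String MAX = "MAX"; // Stand-in constant, known to be non-null.

  /** Safe when value is null: equals() is invoked on the non-null constant. */
  static boolean constantFirst(String value) {
    return MAX.equals(value);
  }

  /** Throws NullPointerException when value is null. */
  static boolean expressionFirst(String value) {
    return value.equals(MAX);
  }
}
```

This only guards the final receiver. If an earlier call in a chained expression returns null, the chain still throws before equals() is reached.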

Member Author

Switched this to isBefore, which is much easier to read in this order.

NPEs should never occur from the watermark manager, as all of the watermarks are initialized at creation time. Switching the order decreases readability and also isn't actually particularly good protection against NPEs in this context.
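
A rough sketch of what the isBefore form of the check might look like. It uses `java.time.Instant` as a stand-in for the SDK's timestamp type so the snippet is self-contained; `WatermarkCheck`, `isIncomplete`, and the constant's value are assumptions for illustration, not the code that was committed.

```java
import java.time.Instant;

/** Hypothetical helper; the real check lives in the executor under review. */
class WatermarkCheck {
  // Stand-in for BoundedWindow.TIMESTAMP_MAX_VALUE; the value here is arbitrary.
  static final Instant TIMESTAMP_MAX_VALUE = Instant.ofEpochMilli(Long.MAX_VALUE / 1000);

  /** True while the output watermark has not yet reached the maximum timestamp. */
  static boolean isIncomplete(Instant outputWatermark) {
    return outputWatermark.isBefore(TIMESTAMP_MAX_VALUE);
  }
}
```

Written this way, the condition reads in the natural order: the watermark is before the maximum value.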

Improve isDone implementation
@bjchambers
Contributor

LGTM.

@asfgit asfgit closed this in 9247ad7 Mar 23, 2016
davorbonaci added a commit to GoogleCloudPlatform/DataflowJavaSDK that referenced this pull request Mar 25, 2016
hengfengli referenced this pull request in hengfengli/beam Mar 21, 2022
alnzng pushed a commit to alnzng/beam that referenced this pull request Jan 25, 2023