Skip to content

Conversation

@robertwb
Copy link
Contributor

DESCRIPTION HERE


Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand:
    • What the pull request does
    • Why it does it
    • How it does it
    • Why this approach
  • Each commit in the pull request should have a meaningful subject line and body.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

@robertwb
Copy link
Contributor Author

R: @udim

@robertwb robertwb changed the title Optimize reshuffle. Optimize reshuffle for globally windowed data. Mar 22, 2018
@udim
Copy link
Member

udim commented Mar 23, 2018

Thanks! Will review tomorrow.

_UnwindowedValues(windowed_values),
MIN_TIMESTAMP,
self.GLOBAL_WINDOW_TUPLE)
return [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that return makes more sense here.
Is there a quantifiable difference in performance between yielding a single value vs return a list with the same value in it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it turns out, no. Reverted.

else:
# The linter is confused.
# pylint: disable=abstract-class-instantiated
cls = hash(1) and _IdentityWindowFn
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this line? What's the purpose of hash(1)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise the linter was smart enough to identify _IdentityWindowFn was being instantiated, and still give the warning (which the pylint comment wasn't suppressing). Spent a good amount of time fighting the linter here--on the one hand too smart, on the other too dumb.


else:
# The linter is confused.
# pylint: disable=abstract-class-instantiated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyCharm says this class is missing to_runner_api_parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyCharm is confused too. It inherits this via the urn magic.

if windowing_saved.is_default():
# In this (common) case we can use a trivial trigger driver
# and avoid the (expensive) window param.
_globally_windowed = window.GlobalWindows.windowed_value(None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: why does this variable get a _ prefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

# sorted order.
if windowing.is_default() and is_batch:
driver = DefaultGlobalBatchTriggerDriver()
elif (windowing.windowfn == GlobalWindows()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this branch also apply to streaming?
If so, it might be a good idea to remove the word Batch from the driver's name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@udim udim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

class DefaultGlobalBatchTriggerDriver(TriggerDriver):
"""Breaks a bundles into window (pane)s according to the default triggering.
class DiscardingGlobalTriggerDriver(TriggerDriver):
"""Groups all received values togeather.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sp: togeather

@robertwb robertwb merged commit 3737548 into apache:master Mar 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants