@apilloud
Member

@apilloud apilloud commented Mar 7, 2018

The caching of metric results in the Dataflow runner makes some bad assumptions. It assumes queryMetrics will always be called with the same filter, which leads to wrong results in common cases, including the Nexmark benchmark.



@apilloud
Member Author

apilloud commented Mar 7, 2018

@bjchambers I hear you are an expert on Beam metrics and might be able to review this.

@apilloud
Member Author

apilloud commented Mar 8, 2018

Run Java PreCommit

@bjchambers
Contributor

If you look at the query being sent to the Dataflow service, it does not include the filters. The original intention, I believe, was to cache the unfiltered results from Dataflow, and perform filtering each time metrics are queried. This would give the correct results, while also preventing re-querying for unchanged metrics.
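A minimal sketch of that intent, not the actual Dataflow runner code: `MetricEntry`, `CachingMetricsSketch`, `fetchAll`, and `jobIsDone` are hypothetical names used only for illustration, and caching only once the job is done is an assumption about when metrics stop changing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Hypothetical stand-in for one metric result; not a Beam type. */
final class MetricEntry {
  final String step;
  final String name;
  final long value;

  MetricEntry(String step, String name, long value) {
    this.step = step;
    this.name = name;
    this.value = value;
  }
}

/** Sketch: cache the unfiltered results and apply the caller's filter on every query. */
final class CachingMetricsSketch {
  // Hypothetical service call; like the real query, it carries no filter.
  private final Supplier<List<MetricEntry>> fetchAll;
  // Assumption: results no longer change once this reports true.
  private final BooleanSupplier jobIsDone;
  // Unfiltered results; stays null until something has been cached.
  private List<MetricEntry> cachedAll;
  private int fetchCount;

  CachingMetricsSketch(Supplier<List<MetricEntry>> fetchAll, BooleanSupplier jobIsDone) {
    this.fetchAll = fetchAll;
    this.jobIsDone = jobIsDone;
  }

  List<MetricEntry> queryMetrics(Predicate<MetricEntry> filter) {
    List<MetricEntry> all = cachedAll;
    if (all == null) {
      // Read the job state before fetching, so a job that finishes
      // mid-fetch does not cause a possibly incomplete result to be cached.
      boolean done = jobIsDone.getAsBoolean();
      all = new ArrayList<>(fetchAll.get());
      fetchCount++;
      if (done) {
        cachedAll = all;
      }
    }
    // The filter is applied per call, never baked into the cache.
    return all.stream().filter(filter).collect(Collectors.toList());
  }

  int fetchCount() {
    return fetchCount;
  }

  public static void main(String[] args) {
    List<MetricEntry> served = new ArrayList<>();
    served.add(new MetricEntry("stepA", "elements", 3));
    served.add(new MetricEntry("stepB", "elements", 5));
    CachingMetricsSketch metrics = new CachingMetricsSketch(() -> served, () -> true);
    // Two different filters, answered from one cached fetch.
    System.out.println(metrics.queryMetrics(m -> m.step.equals("stepA")).size());
    System.out.println(metrics.queryMetrics(m -> m.step.equals("stepB")).size());
    System.out.println(metrics.fetchCount());
  }
}
```

Because the cache holds the unfiltered list, a second call with a different filter still sees every metric, which is the case the filtered cache got wrong.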

@apilloud apilloud changed the title from "[BEAM-3802] Remove broken cachedMetricResults" to "[BEAM-3802] Move metrics caching up a level" Mar 8, 2018
@apilloud
Member Author

apilloud commented Mar 8, 2018

My original concern was that metrics were eventually consistent. But if that were the case, I would be having lots of other problems. I've changed this to move the caching up a level.

@bjchambers
Contributor

LGTM

@apilloud
Member Author

apilloud commented Mar 9, 2018

Tests have passed.

@kennknowles kennknowles merged commit 66d6876 into apache:master Mar 9, 2018
@apilloud apilloud deleted the metrics branch March 9, 2018 21:28