Skip to content

KAFKA-7895: fix stream-time reckoning for Suppress (2.2)#6286

Merged
bbejeck merged 6 commits intoapache:2.2from
vvcephei:KAFKA-7895-fix-suppress-2.2
Feb 20, 2019
Merged

KAFKA-7895: fix stream-time reckoning for Suppress (2.2)#6286
bbejeck merged 6 commits intoapache:2.2from
vvcephei:KAFKA-7895-fix-suppress-2.2

Conversation

@vvcephei
Copy link
Copy Markdown
Contributor

  • Add suppress to system tests
  • Move stream-time reckoning from Task into Processor

Even within a Task, different Processors have different perceptions
of time, due to record caching on stores and in suppression itself,
and in general, due to any processor logic that may hold onto
records arbitrarily and emit them later. Thanks to this, we can't rely
on the whole task existing in the same "instant" of stream-time. The
solution is for each processor node that cares about stream-time to
track it independently.

See also #6278

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@vvcephei
Copy link
Copy Markdown
Contributor Author

Ping @mjsax @bbejeck for reviews.

This is effectively a cherry-pick of #6278, with the system test changeset adapted not to rely on all the other changes to the smoke test in trunk.

All the main-code changes should be identical to #6278.

@vvcephei vvcephei changed the title KAFKA-7895: fix stream-time reckoning for Suppress KAFKA-7895: fix stream-time reckoning for Suppress (2.2) Feb 18, 2019
@vvcephei
Copy link
Copy Markdown
Contributor Author

@mjsax
Copy link
Copy Markdown
Member

mjsax commented Feb 19, 2019

@vvcephei Jenkins failed with checkstyle errors.

@mjsax mjsax added the streams label Feb 19, 2019
@vvcephei
Copy link
Copy Markdown
Contributor Author

vvcephei commented Feb 19, 2019

Goodness... I had to wrestle with the driver a bit, since I didn't backport the test for the driver, and since I'm not able to run the system tests locally.

I got the driver system test to pass:
http://testing.confluent.io/confluent-kafka-branch-builder-system-test-results/?prefix=2019-02-19--001.1550609346--vvcephei--KAFKA-7895-fix-suppress-2.2--face6a8/

Waiting on the full suite of tests:
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2362/
Done. It looks like just the expected failures:
http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/2019-02-19--001.1550636661--vvcephei--KAFKA-7895-fix-suppress-2.2--face6a8/report.html

Copy link
Copy Markdown
Member

@mjsax mjsax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Waiting for system tests to finish before merging.

Copy link
Copy Markdown
Member

@bbejeck bbejeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, system test failures (broker upgrade-downgrade) are unrelated and are a known issue

@vvcephei
Copy link
Copy Markdown
Contributor Author

Thanks, @mjsax and @bbejeck !

The system tests had only expected failures (updated the comment above with the results).

This PR should hopefully cherry-pick cleanly to 2.1...

@bbejeck bbejeck merged commit 7517d4e into apache:2.2 Feb 20, 2019
@vvcephei vvcephei deleted the KAFKA-7895-fix-suppress-2.2 branch February 20, 2019 16:36
@guozhangwang
Copy link
Copy Markdown
Contributor

@vvcephei I tried to cherry-pick to 2.1 but did not succeed (@bbejeck cc), because the Segments code has been largely refactored since then. Could you file another PR for 2.1?

bbejeck pushed a commit that referenced this pull request Mar 5, 2019
Even within a Task, different Processors have different perceptions
of time, due to record caching on stores and in suppression itself,
and in general, due to any processor logic that may hold onto
records arbitrarily and emit them later. Thanks to this, we can't rely
on the whole task existing in the same "instant" of stream-time. The
solution is for each processor node that cares about stream-time to
track it independently.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants