[BEAM-306] Make sure PubsubUnboundedSource works with the InProcessPipelineRunner #388
LGTM
Why not just fix PubsubIO to relinquish ownership? (and on the other side, accept any valid checkpoint)
Because
On Wed, May 25, 2016 at 11:08 AM, Kenn Knowles notifications@github.com
Requiring that the checkpoint is finalized makes sense. Which sort of implies that finalize should return the true checkpoint...
I should more fully state what I mean here: I agree with the change to the …
Longer term, the direct runner should do all of these:
If we temporarily have to choose between 2 and 3, then 3 is better. It exercises the important deserialization code path. If deserialization is correct, then any divergence between 2 and 3 is a bug in the source, and missing those bugs is just the short-term cost of switching to 3.
…restore. Allows same checkpoint object to be passed to new reader without a serialize/deserialize step.
PTAL
LGTM. Will merge when green.
R: @dhalperi @tgroh
The PubsubUnboundedSource implementation asserts that the checkpoint used to instantiate a fresh reader was deserialized from an earlier finalized checkpoint. The in-process runner was reusing the checkpoint object directly, so the assertion failed. This change adds the serialize/deserialize step to the in-process runner, which I believe is the best solution since other UnboundedSources may be caught by the same issue. It also forces users to exercise their checkpoint coder.
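
To illustrate the failure mode and the fix described above, here is a minimal, self-contained sketch. It does not use the Beam API: `DemoCheckpoint`, `CheckpointRoundTrip`, `roundTrip`, and `openReader` are hypothetical names, and plain Java serialization stands in for the source's checkpoint coder. It assumes the source's assertion amounts to "the checkpoint must not still be attached to a live reader", which may differ from the actual check in PubsubUnboundedSource.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for an unbounded source's checkpoint mark. */
final class DemoCheckpoint implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Durable state that must survive the round trip. */
    final long lastAckedOffset;

    /** Link to the live reader; transient, so it is dropped on serialization. */
    transient Object owningReader;

    DemoCheckpoint(long lastAckedOffset, Object owningReader) {
        this.lastAckedOffset = lastAckedOffset;
        this.owningReader = owningReader;
    }
}

final class CheckpointRoundTrip {

    /** Encodes and decodes the checkpoint, as a runner would with the checkpoint coder. */
    static DemoCheckpoint roundTrip(DemoCheckpoint mark) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(mark);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (DemoCheckpoint) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("checkpoint could not be round-tripped", e);
        }
    }

    /** Hypothetical reader factory that rejects a checkpoint still owned by a live reader. */
    static String openReader(DemoCheckpoint restored) {
        if (restored != null && restored.owningReader != null) {
            throw new IllegalStateException("checkpoint was not restored via deserialization");
        }
        long offset = restored == null ? 0L : restored.lastAckedOffset;
        return "reader resuming after offset " + offset;
    }

    public static void main(String[] args) {
        DemoCheckpoint live = new DemoCheckpoint(42L, new Object());
        // Passing `live` directly to openReader would throw; the round trip makes it acceptable.
        System.out.println(openReader(roundTrip(live)));
    }
}
```

Reusing `live` directly reproduces the kind of assertion failure described in this PR, while the round-tripped copy is accepted. A side effect of the round trip is that a broken checkpoint encoding surfaces in the local runner rather than only on a distributed one.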