-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[BEAM-2806] Fix pipeline translation mode recognition #4558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
||
| @Override | ||
| public void visitPrimitiveTransform(TransformHierarchy.Node node) { | ||
| AppliedPTransform<?, ?, ?> appliedPTransform = node.toAppliedPTransform(pipeline); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tgroh Is there a better way of getting the AppliedPTransform from the node here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the pipeline field and call getPipeline(), which is available while traversing the pipeline if you're using PipelineVisitor.Defaults as the root of your visitor hierarchy. Otherwise, this is basically exactly how I'd do it.
| } | ||
| } | ||
|
|
||
| public static IsBounded.Enum isBounded(AppliedPTransform<?, ?, ?> transform) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tgroh Is this a good place to put such a method or is there already an easier method for determining whether a PTransform/PCollection is unbounded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is precisely sourceIsBounded for all well formed ReadPayloads (one method up)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, I thought that as well but it takes a ReadPayload and when traversing the pipeline we don't know which transform is a read.
Another alternative would be
try {
rawSource = ReadTranslation.unboundedSourceFromTransform(
(AppliedPTransform<PBegin, PCollection<T>, PTransform<PBegin, PCollection<T>>>)
node.toAppliedPTransform(getPipeline()));
// we have an unbounded source
} catch (IOException e) {
// no unbounded source
}
but I'm not sure I like the exceptions-as-control-flow thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah; I see. If we don't know that it's a read transform I'm generically opposed to routing it through ReadTranslation, plus I also don't really like the use of exceptions as control flow.
Each PCollection contains its boundedness, so you can just inspect the outputs of the transform (which are labelled as PValues, but I believe are all PCollections at this point; maybe worth the check regardless). They have the PCollection.isBounded() method, in both proto and java forms
|
retest this please |
1 similar comment
|
retest this please |
|
|
|
I run a test locally and it seems the error is still there, and the test doesn't cover |
|
@xumingmin Do you have that test as a nicely isolated thing that I could run so that we can debug this? |
|
@aljoscha it should be a false alarm. I move my test code(see below) to class |
|
Thanks a lot for working on this and coming up with the better solution. 👍 I'm merging... |
Follow this checklist to help us incorporate your contribution quickly and easily:
[BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replaceBEAM-XXXwith the appropriate JIRA issue.mvn clean verifyto make sure basic checks pass. A more thorough check will be performed on your pull request automatically.