[BEAM-3566] Replace apply_* hooks in DirectRunner with PTransformOverrides #4529
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,10 +22,13 @@ | |
|
|
||
| import hamcrest as hc | ||
|
|
||
| import apache_beam as beam | ||
| from apache_beam.io.gcp.pubsub import ReadStringsFromPubSub | ||
| from apache_beam.io.gcp.pubsub import WriteStringsToPubSub | ||
| from apache_beam.io.gcp.pubsub import _PubSubPayloadSink | ||
| from apache_beam.io.gcp.pubsub import _PubSubPayloadSource | ||
| from apache_beam.options.pipeline_options import StandardOptions | ||
| from apache_beam.runners.direct.direct_runner import _get_transform_overrides | ||
| from apache_beam.testing.test_pipeline import TestPipeline | ||
| from apache_beam.transforms.display import DisplayData | ||
| from apache_beam.transforms.display_test import DisplayDataItemMatcher | ||
|
|
@@ -40,28 +43,51 @@ | |
|
|
||
|
|
||
| @unittest.skipIf(pubsub is None, 'GCP dependencies are not installed') | ||
| class TestReadStringsFromPubSub(unittest.TestCase): | ||
| class TestReadStringsFromPubSubOverride(unittest.TestCase): | ||
| def test_expand_with_topic(self): | ||
| p = TestPipeline() | ||
| pcoll = p | ReadStringsFromPubSub('projects/fakeprj/topics/a_topic', | ||
| None, 'a_label') | ||
| # Ensure that the output type is str | ||
| p.options.view_as(StandardOptions).streaming = True | ||
| pcoll = (p | ||
| | ReadStringsFromPubSub('projects/fakeprj/topics/a_topic', | ||
| None, 'a_label') | ||
| | beam.Map(lambda x: x)) | ||
| # Ensure that the output type is str. | ||
| self.assertEqual(unicode, pcoll.element_type) | ||
|
|
||
| # Apply the necessary PTransformOverrides. | ||
| overrides = _get_transform_overrides(p.options) | ||
| p.replace_all(overrides) | ||
|
|
||
| # Note that the direct output of ReadStringsFromPubSub will be replaced | ||
|
Contributor
Author
Thanks, and I agree, but the issue is that there really isn't anything to test. The test here isn't a test of the transform; rather, it was (and still is) testing the behavior of replacing the transform with the correct DirectRunner replacement. The transform itself is just a wrapper with runner-specific overrides. I changed the test name to reflect the intended target of the test.
Contributor
Sorry, I meant https://beam.apache.org/contribute/ptransform-style-guide/#testing-transform-construction-and-validation If this is about the direct runner, we should put it into the direct runner tests. Ideally, we would create a mock/in-memory PubSub and make sure this works end-to-end (on any runner).
Contributor
But, as mentioned, fixing these existing tests should not block this PR. Please file a JIRA.
Contributor
Author
Thanks, filed https://issues.apache.org/jira/browse/BEAM-3619. |
||
| # by a PTransformOverride, so we use a no-op Map. | ||
| read_transform = pcoll.producer.inputs[0].producer.transform | ||
|
|
||
| # Ensure that the properties passed through correctly | ||
| source = pcoll.producer.transform._source | ||
| source = read_transform._source | ||
| self.assertEqual('a_topic', source.topic_name) | ||
| self.assertEqual('a_label', source.id_label) | ||
|
|
||
| def test_expand_with_subscription(self): | ||
| p = TestPipeline() | ||
| pcoll = p | ReadStringsFromPubSub( | ||
| None, 'projects/fakeprj/subscriptions/a_subscription', 'a_label') | ||
| p.options.view_as(StandardOptions).streaming = True | ||
| pcoll = (p | ||
| | ReadStringsFromPubSub( | ||
| None, 'projects/fakeprj/subscriptions/a_subscription', | ||
| 'a_label') | ||
| | beam.Map(lambda x: x)) | ||
| # Ensure that the output type is str | ||
| self.assertEqual(unicode, pcoll.element_type) | ||
|
|
||
| # Apply the necessary PTransformOverrides. | ||
| overrides = _get_transform_overrides(p.options) | ||
| p.replace_all(overrides) | ||
|
|
||
| # Note that the direct output of ReadStringsFromPubSub will be replaced | ||
| # by a PTransformOverride, so we use a no-op Map. | ||
| read_transform = pcoll.producer.inputs[0].producer.transform | ||
|
|
||
| # Ensure that the properties passed through correctly | ||
| source = pcoll.producer.transform._source | ||
| source = read_transform._source | ||
| self.assertEqual('a_subscription', source.subscription_name) | ||
| self.assertEqual('a_label', source.id_label) | ||
|
|
||
|
|
@@ -80,12 +106,22 @@ def test_expand_with_both_topic_and_subscription(self): | |
| class TestWriteStringsToPubSub(unittest.TestCase): | ||
| def test_expand(self): | ||
| p = TestPipeline() | ||
| pdone = (p | ||
| p.options.view_as(StandardOptions).streaming = True | ||
| pcoll = (p | ||
| | ReadStringsFromPubSub('projects/fakeprj/topics/baz') | ||
| | WriteStringsToPubSub('projects/fakeprj/topics/a_topic')) | ||
| | WriteStringsToPubSub('projects/fakeprj/topics/a_topic') | ||
| | beam.Map(lambda x: x)) | ||
|
|
||
| # Apply the necessary PTransformOverrides. | ||
| overrides = _get_transform_overrides(p.options) | ||
| p.replace_all(overrides) | ||
|
|
||
| # Note that the direct output of WriteStringsToPubSub will be replaced | ||
| # by a PTransformOverride, so we use a no-op Map. | ||
| write_transform = pcoll.producer.inputs[0].producer.transform | ||
|
|
||
| # Ensure that the properties passed through correctly | ||
| self.assertEqual('a_topic', pdone.producer.transform.dofn.topic_name) | ||
| self.assertEqual('a_topic', write_transform.dofn.topic_name) | ||
|
|
||
|
|
||
| @unittest.skipIf(pubsub is None, 'GCP dependencies are not installed') | ||
|
|
||
These tests seem rather brittle. Is there a better way to test this transform application than grabbing the internal source and verifying a couple of properties on it? See https://beam.apache.org/documentation/pipelines/test-your-pipeline/
Ping on this comment.
I've addressed this here: #4529 (comment)
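For readers following the thread, here is a minimal toy sketch of why the updated tests append a no-op Map and then walk back through `inputs[0]`. It is not Beam's implementation: `Transform`, `Applied`, `Override`, `DirectReadOverride`, and this `replace_all` are hypothetical stand-ins written for illustration only, assuming that an override swaps a new node in for the matched one, so a handle taken before the replacement goes stale.

```python
class Transform(object):
    """Hypothetical stand-in for a transform: a label plus properties."""
    def __init__(self, label, **props):
        self.label = label
        self.props = props


class Applied(object):
    """Hypothetical graph node: a transform and its upstream nodes."""
    def __init__(self, transform, inputs=()):
        self.transform = transform
        self.inputs = list(inputs)


class Override(object):
    """Hypothetical override: a matcher plus a replacement factory."""
    def matches(self, applied):
        raise NotImplementedError

    def get_replacement(self, transform):
        raise NotImplementedError


class DirectReadOverride(Override):
    """Swaps the generic read for a runner-specific one, keeping its props."""
    def matches(self, applied):
        return applied.transform.label == 'ReadStringsFromPubSub'

    def get_replacement(self, transform):
        return Transform('DirectRead', **transform.props)


def replace_all(node, overrides):
    """Walk upstream from node and swap in a new node for every match."""
    for i, upstream in enumerate(node.inputs):
        replace_all(upstream, overrides)
        for override in overrides:
            if override.matches(upstream):
                node.inputs[i] = Applied(
                    override.get_replacement(upstream.transform),
                    upstream.inputs)
                break


# Build read -> no-op Map, then apply the override from the downstream end.
read = Applied(Transform('ReadStringsFromPubSub',
                         topic_name='a_topic', id_label='a_label'))
noop = Applied(Transform('Map'), [read])
replace_all(noop, [DirectReadOverride()])

# The handle held before replacement still points at the old node;
# the replacement is reachable only by walking back from the Map.
replaced = noop.inputs[0].transform
```

In this toy model the properties survive the swap because the replacement copies them, which is the only thing the tests in this PR check. An end-to-end test against an in-memory PubSub, as suggested above, would not depend on the graph layout at all.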