[BEAM-7738] Add external transform support to PubsubIO #9268
Conversation
```java
SimpleFunction<PubsubMessage, T> parseFn =
    (SimpleFunction<PubsubMessage, T>) new IdentityMessageFn();
setParseFn(parseFn);
// FIXME: call setCoder(). need to use PubsubMessage proto to be compatible with python
```
In the case that an external SDK has requested needsAttributes, this transform needs to produce PubsubMessage instances rather than byte arrays. There's a wrinkle here: Python deserializes PubsubMessage using protobufs, whereas Java encodes with PubsubMessageWithAttributesCoder.
I believe we want to use protobufs from Java. Can I get confirmation of that, please?
I serialized the PubsubMessage using protobufs. Since there's no cross-language coder for PubsubMessage, and I assumed it would be overreach to add one, I used the bytes coder and then handled converting to and from protobufs in code that lives close to the transforms.
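For concreteness, here's a minimal sketch of that round trip, assuming the standard com.google.pubsub.v1.PubsubMessage proto and Beam's SDK PubsubMessage class (the helper class and method names are mine, not from this PR):

```java
import com.google.protobuf.ByteString;
import com.google.protobuf.InvalidProtocolBufferException;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;

/** Hypothetical helpers for the proto round trip at the cross-language boundary. */
class PubsubMessageProtoBytes {

  /** Beam PubsubMessage -> proto wire bytes, suitable for the plain bytes coder. */
  static byte[] toProtoBytes(PubsubMessage message) {
    return com.google.pubsub.v1.PubsubMessage.newBuilder()
        .setData(ByteString.copyFrom(message.getPayload()))
        .putAllAttributes(message.getAttributeMap())
        .build()
        .toByteArray();
  }

  /** Proto wire bytes (e.g. produced by the Python SDK) -> Beam PubsubMessage. */
  static PubsubMessage fromProtoBytes(byte[] bytes) throws InvalidProtocolBufferException {
    com.google.pubsub.v1.PubsubMessage proto =
        com.google.pubsub.v1.PubsubMessage.parseFrom(bytes);
    return new PubsubMessage(proto.getData().toByteArray(), proto.getAttributesMap());
  }
}
```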
```python
args['id_label'] = _encode_str(self.id_label)

# FIXME: how do we encode a bool so that Java can decode it?
# args['with_attributes'] = _encode_bool(self.with_attributes)
```
What's the right way to encode a bool so that Java can decode it correctly?
I encoded it as an int and handled the cast in Java.
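For illustration, a sketch of what the Java side of that workaround can look like, assuming the flag arrives as a var-int-encoded field (the helper name is hypothetical, not this PR's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import org.apache.beam.sdk.coders.VarIntCoder;

class BoolAsIntWorkaround {
  /** Decode the int that the Python SDK wrote and cast it back to a boolean. */
  static boolean decodeBoolField(byte[] encoded) throws IOException {
    return VarIntCoder.of().decode(new ByteArrayInputStream(encoded)) != 0;
  }
}
```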
R: @robertwb
Force-pushed 45f3637 to 8f3ede3.
Thanks @mxm. This is actually not working yet; I'm getting some mysterious errors deep within Flink related to serialization. I'm working on writing tests and gathering as much info as I can so that I can report back here. So far I'm pretty perplexed, but I'm new to Beam, Flink, and Java, so that's not a big surprise! Once I have more info, it would be great to have your input.
Here's some info on where I am with this. I could really use some help to push this over the finish line.

The expansion service runs correctly and sends the expanded transforms back to Python, but the job fails inside Java on Flink because it's trying to use the incorrect serializer. There's a good chance that I'm overlooking something very obvious.

Here's the stack trace:

Here's the serializer that

Here's the value that's being serialized:

Obviously ByteArrayCoder is not the right serializer for PubsubMessage. Note that I get the same error whether

So why is

The graph that's generated after expansion looks right to me. Here's a table of the primitive transforms and their PCollections (note the last row is a Python ParDo).

So as far as the Pipeline definition is concerned, it seems like the output of

Any ideas?

The last thing that's important to note is that I needed
Ok, I now know how the

In

I'm unclear why this coder swap is necessary. The Java part of this pipeline is a

What's the proper solution here? The last Java ParDo is the one that's ensuring we have a byte array for sending to Python, but evidently this needs to be happening in the Read itself?
Force-pushed 2b02422 to 6f41f12.
PubsubIO is officially working on Flink! Some notes:

There's one giant caveat: this actually does not work unless you have this special commit: chadrik@d12b990. It turns out the same is currently true for KafkaIO external transform support, which came as quite a surprise. The Jira issue for this problem is here: https://jira.apache.org/jira/browse/BEAM-7870. I'd love to see some movement on this. Happy to help where I can!
I think this PR is ready to merge, but I could use a little help understanding how to resolve the test failures.

Java PreCommit has no test failures, so I'm not sure why it's marked as failed here. Is it because of the warnings? Jenkins seems to imply that anything over 12 warnings is an error, but I don't see any warnings in files that I changed:

Likewise, Python PreCommit lists no failures in Jenkins, but is marked as failed here. And Portable_Python PreCommit just seems to have timed out.
Run Portable_Python PreCommit
Run Python PreCommit
Hi. Still hoping for a little help bringing this to a close.
Build results have expired: https://builds.apache.org/job/beam_PreCommit_Java_Commit/7551/
Retest this please
It will work as is, but it should be converted to the new API to provide more real-world examples. I'll do that today. If you could help me diagnose the Java test failure, it would be much appreciated!
Will take a look ASAP, on a low time budget at the moment :)
Seems like the Java PreCommit failure is for the new test? Stacktrace is:
Python failure seems to be a docs issue (:sdks:python:test-suites:tox:py2:docs). Could be a flake. Retrying.
Run Python PreCommit
Force-pushed 6f41f12 to fcbffb7.
@chadrik Seems like you fixed the issue. Java tests are passing now: https://builds.apache.org/job/beam_PreCommit_Java_Commit/7889/

The one test failure is unrelated to your changes:
Force-pushed 9f32963 to 53af6bf.
This is now waiting on the BooleanCoder in Python.
Force-pushed 53af6bf to 774373a.
Run Java PreCommit
Any update here?
We are doing some final testing in production today and then I'll hopefully give you the thumbs up.

-chad
Force-pushed 04f32a0 to 63cabe7.
Run Python PreCommit

3 similar comments
@mxm This is tested on-prem and ready to go! (With the caveat that the special commit to add dependencies is still required for end users, as is the case with Kafka: chadrik/beam@d12b990.)
mxm left a comment
Very nice. This is ready from my side, but I would follow up with doc strings in the Python API to list the limitations. This should also be done for ReadFromKafka/WriteToKafka. Perhaps we should even add the additional commits as part of Beam, at least until we have worked around the coder problems?
I think it would be great for this to work out of the box, but I'm not sure what the ramifications of adding those commits would be. They don't just modify the standard coders; they also change the standard Beam Java docker containers by adding additional dependencies. Is it safe to add those even if people aren't using pubsub/kafka? Is the only downside larger container sizes?

@TheNeuralBit How close are we to having schema coders ready to use? IIUC, we should be able to register a converter between Row -> PubsubMessage, right? What's the mechanism for that? It would be great to put that to use here as soon as the schema coders are ready.
@robertwb gave #9188 an LGTM; I just need to get CI passing and I think we can merge it. I'll work on that now.

Are you thinking you'd use beam:coder:row:v1 as the interface for the external transform, and the Java ExternalTransform implementations would handle the conversion of Row to/from PubsubMessage? There's no trivial way to register a converter between Row and PubsubMessage since the latter isn't structured, but of course on the Java side we could have code to serialize the Row to a variety of formats to put in the PubsubMessage payload: Avro, JSON, or the row serialization format itself (although I'm not sure we'd want to encourage using that outside of Beam) would be pretty simple to add. Maybe the format to use could be part of the external transform payload.
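As a sketch of that last option (the row serialization format itself), encoding each Row with Beam's RowCoder and using the resulting bytes as the message payload might look like this (names and schema handling are illustrative, not part of this PR):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.beam.sdk.coders.RowCoder;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.values.Row;

class RowPayloadSketch {
  /** Encode a Row with the row coder so it can serve as a Pub/Sub payload. */
  static byte[] rowToPayload(Row row, Schema schema) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    RowCoder.of(schema).encode(row, out);
    return out.toByteArray();
  }
}
```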
There are two places that I see beam:coder:row:v1 being useful:

Right now I'm mostly thinking about the latter, which is when

Maybe I'm thinking about this wrong, but I think the

```java
public class PubsubMessage {

  private byte[] message;
  private Map<String, String> attributes;
  private String messageId;

  /** Returns the main PubSub message. */
  public byte[] getPayload() {
    return message;
  }

  /** Returns the full map of attributes. This is an unmodifiable map. */
  public Map<String, String> getAttributeMap() {
    return attributes;
  }

  /** Returns the messageId of the message populated by Cloud Pub/Sub. */
  @Nullable
  public String getMessageId() {
    return messageId;
  }
}
```

I'm not a Java expert by any means, but this seems like a type that would work with AutoValue, we just need to rename

What are the requirements for registering a Row converter?
I think the payload is not a concern when it comes to portability of external transforms: it gets encoded/decoded by another transform, not PubsubRead/Write. We can just assume that's a byte array.

My grasp on the Java side is a bit tenuous, so I'd like for @mxm to confirm or deny what I've written here.
Agreed, that's what I'm thinking about too.

Ah ok, fair. I was referring specifically to the structure (or lack thereof) of the byte array payload, but you're right the (Python SDK) user can handle creating a byte array themselves, and the row coder can just encode

Java

There are a variety of ways to do it. If it were a class that we had control over we could use the

@reuvenlax may have a better suggestion.

Python

With my PR I think it could look like:

```python
# this is py3 syntax for clarity, but we'd probably
# need to use the TypedDict('PubsubMessage', ...) version
class PubsubMessage(TypedDict):
    message: ByteString
    attributes: Mapping[unicode, unicode]
    messageId: unicode

coders.registry.register_coder(PubsubMessage, coders.RowCoder)

pcoll
| 'make some messages' >> beam.Map(makeMessage).with_output_types(PubsubMessage)
| 'write to pubsub' >> beam.io.WriteToPubsub(project, topic)  # or something
```
It may reduce our overall technical debt if we just implement the full set of setters and getters on PubsubMessage.
D'oh, my bad, we do have control over PubsubMessage.
@mxm I marked the Kafka and pubsub external transforms as external. I think this is ready to merge.
@TheNeuralBit On the Python side, as with the Java SDK, there is a custom

In your example above, you registered the
It hasn't been designed, but that's definitely something I'd like to support. Would we actually need the |
```python
from apache_beam.transforms.external import ExternalTransform
from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder

ReadFromPubsubSchema = typing.NamedTuple(
```
Doesn't have to be done in this PR, but in the future it will be great if we just move external PubSub and Kafka transforms to io/gcp/pubsub.py and io/kafka.py and configure/branch based on pipeline options.
I don't love that idea. The two sets of xforms do not have the same arguments, in part because of the need for the expansion service endpoint, but also because the Java versions are not guaranteed to be the same as the Python versions (though I would like that to be the case). Even if the expansion service argument could somehow be passed through the options, the two sets of xform classes may have different base classes: ideally the ones in io.external inherit from ExternalTransform (they currently do not, but they should be able to once we resolve the issue we're working on with @TheNeuralBit).
```python
)


class ReadFromPubSub(beam.PTransform):
```
Should we rename these to avoid conflicts with ReadFromPubSub/WriteToPubSub in io/gcp/pubsub.py?
I think it's a feature that they are named the same. I can take a pipeline designed to run on Dataflow and simply change the imports to get it to run on Flink.
We would not need to use the PubsubMessage protobuf type any more (and thus we wouldn't need the conversion methods), but we may still want the Beam PubsubMessage class. The latter is yielded by
@TheNeuralBit Let's move the conversation about row coders over to https://jira.apache.org/jira/browse/BEAM-7870. This PR is ready to merge!
LGTM. Thanks. Looks like Max already approved, so I'll merge.
Whoohoo!
Congrats @chadrik! :) Also thanks for your help @chamikaramj and @TheNeuralBit. |
Add support for using PubsubIO as an external transform, so that it works from Python on portable runners like Flink.