From 8086bff7585646c5947e345f1bc61348a8b7acd2 Mon Sep 17 00:00:00 2001 From: Andy Le Date: Sun, 29 Mar 2020 21:39:29 +0700 Subject: [PATCH 1/6] AVRO-2785: Updated specs about how Unions are encoded --- doc/src/content/xdocs/spec.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/src/content/xdocs/spec.xml b/doc/src/content/xdocs/spec.xml index 947ae60854b..a8adf76cf22 100644 --- a/doc/src/content/xdocs/spec.xml +++ b/doc/src/content/xdocs/spec.xml @@ -546,7 +546,7 @@
Unions -

A union is encoded by first writing a long +

A union is encoded by first writing an int value indicating the zero-based position within the union of the schema of its value. The value is then encoded per the indicated schema within the union.
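
Review note: the wording change above from "long" to "int" should not alter the bytes on the wire, since Avro writes both int and long with the same variable-length zig-zag coding. The sketch below is a minimal illustration of the union rule under that assumption. The helper names (`zigzag_varint`, `encode_string`, `encode_union`) are hypothetical and are not taken from `avro.io` or any other Avro library.

```python
def zigzag_varint(n: int) -> bytes:
    """Encode a signed integer with zig-zag variable-length coding.

    Avro uses this coding for both int and long, so a union position
    encodes to the same bytes under either name.
    """
    z = (n << 1) ^ (n >> 63)  # zig-zag: 0, -1, 1, -2 -> 0, 1, 2, 3
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)  # final byte, continuation bit clear
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 byte length followed by the bytes."""
    data = s.encode("utf-8")
    return zigzag_varint(len(data)) + data


def encode_union(position: int, encoded_value: bytes) -> bytes:
    """Write the zero-based branch position, then the already-encoded value."""
    return zigzag_varint(position) + encoded_value


# Union schema assumed for illustration: ["null", "string"]
null_branch = encode_union(0, b"")                    # null has no payload
string_branch = encode_union(1, encode_string("a"))

print(null_branch.hex())    # 00
print(string_branch.hex())  # 020261
```

For the `["null","string"]` union, the string `"a"` comes out as `02 02 61`, matching the example bytes quoted in the spec hunk of patch 5.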

From d39f362339aa644082f349cde7586f08300129f0 Mon Sep 17 00:00:00 2001 From: Andy Le Date: Sun, 29 Mar 2020 21:40:14 +0700 Subject: [PATCH 2/6] AVRO-2785: Updated related docs for Perl & Python --- lang/perl/lib/Avro/BinaryDecoder.pm | 2 +- lang/perl/lib/Avro/BinaryEncoder.pm | 2 +- lang/py/avro/io.py | 4 ++-- lang/py3/avro/io.py | 4 ++-- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/lang/perl/lib/Avro/BinaryDecoder.pm b/lang/perl/lib/Avro/BinaryDecoder.pm index c5308f26970..fa12fcf0710 100644 --- a/lang/perl/lib/Avro/BinaryDecoder.pm +++ b/lang/perl/lib/Avro/BinaryDecoder.pm @@ -328,7 +328,7 @@ sub skip_union { $class->skip($union_schema, $reader); } -## 1.3.2 A union is encoded by first writing a long value indicating the +## 1.3.2 A union is encoded by first writing an int value indicating the ## zero-based position within the union of the schema of its value. The value ## is then encoded per the indicated schema within the union. sub decode_union { diff --git a/lang/perl/lib/Avro/BinaryEncoder.pm b/lang/perl/lib/Avro/BinaryEncoder.pm index f47abd10bd3..d476f4b4a2f 100644 --- a/lang/perl/lib/Avro/BinaryEncoder.pm +++ b/lang/perl/lib/Avro/BinaryEncoder.pm @@ -234,7 +234,7 @@ sub encode_map { $class->encode_long(undef, 0, $cb); } -## 1.3.2 A union is encoded by first writing a long value indicating the +## 1.3.2 A union is encoded by first writing an int value indicating the ## zero-based position within the union of the schema of its value. The value ## is then encoded per the indicated schema within the union. 
sub encode_union { diff --git a/lang/py/avro/io.py b/lang/py/avro/io.py index 52b631aaa5f..66474a56732 100644 --- a/lang/py/avro/io.py +++ b/lang/py/avro/io.py @@ -882,7 +882,7 @@ def skip_map(self, writers_schema, decoder): def read_union(self, writers_schema, readers_schema, decoder): """ - A union is encoded by first writing a long value indicating + A union is encoded by first writing an int value indicating the zero-based position within the union of the schema of its value. The value is then encoded per the indicated schema within the union. """ @@ -1129,7 +1129,7 @@ def write_map(self, writers_schema, datum, encoder): def write_union(self, writers_schema, datum, encoder): """ - A union is encoded by first writing a long value indicating + A union is encoded by first writing an int value indicating the zero-based position within the union of the schema of its value. The value is then encoded per the indicated schema within the union. """ diff --git a/lang/py3/avro/io.py b/lang/py3/avro/io.py index 51f5a13b013..31623a9e0ad 100644 --- a/lang/py3/avro/io.py +++ b/lang/py3/avro/io.py @@ -636,7 +636,7 @@ def skip_map(self, writer_schema, decoder): def read_union(self, writer_schema, reader_schema, decoder): """ - A union is encoded by first writing a long value indicating + A union is encoded by first writing an int value indicating the zero-based position within the union of the schema of its value. The value is then encoded per the indicated schema within the union. """ @@ -866,7 +866,7 @@ def write_map(self, writer_schema, datum, encoder): def write_union(self, writer_schema, datum, encoder): """ - A union is encoded by first writing a long value indicating + A union is encoded by first writing an int value indicating the zero-based position within the union of the schema of its value. The value is then encoded per the indicated schema within the union. 
""" From 1fd85bde892e3461054cbc8dcb128b025a85b06d Mon Sep 17 00:00:00 2001 From: Andy Le Date: Tue, 5 May 2020 07:00:38 +0700 Subject: [PATCH 3/6] AVRO-2785: add side notes for enum_encoding Thank you @kojiromike --- doc/src/content/xdocs/spec.xml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/doc/src/content/xdocs/spec.xml b/doc/src/content/xdocs/spec.xml index a8adf76cf22..372fa61db6f 100644 --- a/doc/src/content/xdocs/spec.xml +++ b/doc/src/content/xdocs/spec.xml @@ -491,6 +491,8 @@

This would be encoded by an int between zero and three, with zero indicating "A", and 3 indicating "D".

+

NOTE: Currently for C/C++ implementtions, the positions are practically an int, but theoretically a long. + In reality, we don't expect unions with 215M members

From af5196462d5770445cec4d907908fbbdcd7a91ce Mon Sep 17 00:00:00 2001 From: Andy Le Date: Tue, 5 May 2020 07:00:38 +0700 Subject: [PATCH 4/6] AVRO-2785: add side notes for union_encoding Thank you @kojiromike --- doc/src/content/xdocs/spec.xml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/doc/src/content/xdocs/spec.xml b/doc/src/content/xdocs/spec.xml index a8adf76cf22..372fa61db6f 100644 --- a/doc/src/content/xdocs/spec.xml +++ b/doc/src/content/xdocs/spec.xml @@ -491,6 +491,8 @@

This would be encoded by an int between zero and three, with zero indicating "A", and 3 indicating "D".

+

NOTE: Currently for C/C++ implementtions, the positions are practically an int, but theoretically a long. + In reality, we don't expect unions with 215M members

From dd8d5cf050835af7f6c0fe4909dda594e1a38dbe Mon Sep 17 00:00:00 2001 From: Andy Le Date: Tue, 5 May 2020 07:04:43 +0700 Subject: [PATCH 5/6] AVRO-2785: update side notes for union_encoding --- doc/src/content/xdocs/spec.xml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/src/content/xdocs/spec.xml b/doc/src/content/xdocs/spec.xml index 372fa61db6f..feb8e2bc0ee 100644 --- a/doc/src/content/xdocs/spec.xml +++ b/doc/src/content/xdocs/spec.xml @@ -491,8 +491,6 @@

This would be encoded by an int between zero and three, with zero indicating "A", and 3 indicating "D".

-

NOTE: Currently for C/C++ implementtions, the positions are practically an int, but theoretically a long. - In reality, we don't expect unions with 215M members

@@ -562,6 +560,8 @@ followed by the serialized string: 02 02 61 +

NOTE: Currently for C/C++ implementtions, the positions are practically an int, but theoretically a long. + In reality, we don't expect unions with 215M members

From 8788e6f9af3517e99d896dd4ec23cdbeb76f8105 Mon Sep 17 00:00:00 2001 From: Andy Le Date: Tue, 5 May 2020 07:07:31 +0700 Subject: [PATCH 6/6] AVRO-2785: remove invalid side note on enum_encoding --- doc/src/content/xdocs/spec.xml | 2 -- 1 file changed, 2 deletions(-) diff --git a/doc/src/content/xdocs/spec.xml b/doc/src/content/xdocs/spec.xml index 099c9c1a25f..feb8e2bc0ee 100644 --- a/doc/src/content/xdocs/spec.xml +++ b/doc/src/content/xdocs/spec.xml @@ -491,8 +491,6 @@

This would be encoded by an int between zero and three, with zero indicating "A", and 3 indicating "D".

-

NOTE: Currently for C/C++ implementtions, the positions are practically an int, but theoretically a long. - In reality, we don't expect unions with 215M members