From 5474730987147f7a309d50ffe4a4331830d7f60c Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Sun, 8 Jun 2025 13:15:05 -0400
Subject: [PATCH] Correct `primitive_null.value`

---
 variant/README.md            |   9 ++++++++-
 variant/primitive_null.value | Bin 0 -> 1 bytes
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/variant/README.md b/variant/README.md
index e335caf..418dfa7 100644
--- a/variant/README.md
+++ b/variant/README.md
@@ -49,7 +49,7 @@ The files in this directory were initially generated by running the [`regen.py`]
 script which used Apache Spark to generate the files. The files have been
 subsequently modified when necessary to ensure that they conform to the Parquet spec.
 
-### Modification 1: Created metadata for `primitive_null` as a single byte (`0x01`)
+### Modification 1: Created metadata and value for `primitive_null` as a single byte (`0x01`)
 
 Per , Spark did not generate any metadata for `null` and left
 `primitive_null.metadata` empty.
@@ -62,5 +62,12 @@ The metadata for `primitive_null` should be the same 3 bytes as other primitive
 cp primitive_int8.metadata primitive_null.metadata
 ```
 
+The value for a primitive should be a `value_header` and no `value_data`,
+resulting in a single `0` byte:
+
+```shell
+echo -n 'a' | tr a '\0' > primitive_null.value
+```
+
 [Variant]: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
 [primitive types listed in the spec]: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#value-data-for-primitive-type-basic_type0
diff --git a/variant/primitive_null.value b/variant/primitive_null.value
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..f76dd238ade08917e6712764a16a22005a50573d 100644
GIT binary patch
literal 1
IcmZPo000310RR91

literal 0
HcmV?d00001
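
As a cross-check on the new `primitive_null.value`, the sketch below writes the same single byte with `printf` and then inspects it. It is a hypothetical alternative to the `echo`/`tr` pipeline in the README, not what the patch itself runs. It assumes the Variant header layout of `(primitive_type_id << 2) | basic_type`, with both fields equal to 0 for `null`, which gives `0x00`.

```shell
# Hypothetical alternative to the echo/tr pipeline: write one NUL byte directly.
# Assumed layout: value_header = (primitive_type_id << 2) | basic_type
#                              = (0 << 2) | 0 = 0x00 for a primitive null.
printf '\000' > primitive_null.value

# Confirm the file holds exactly one byte and that the byte is 0x00.
wc -c < primitive_null.value
od -An -tx1 primitive_null.value
```

The octal escape `\000` is used because POSIX `printf` specifies it, whereas `echo -n` behaves differently across shells.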