diff --git a/variant/README.md b/variant/README.md
index e335caf..418dfa7 100644
--- a/variant/README.md
+++ b/variant/README.md
@@ -49,7 +49,7 @@ The files in this directory were initially generated by running the [`regen.py`]
 script which used Apache Spark to generate the files. The files have been
 subsequently modified when necessary to ensure that they conform to the Parquet spec.
 
-### Modification 1: Created metadata for `primitive_null` as a single byte (`0x01`)
+### Modification 1: Created metadata and value for `primitive_null` as a single byte (`0x01`)
 
 Per , Spark did not generate any metadata for `null` and left
 `primitive_null.metadata` empty.
@@ -62,5 +62,12 @@ The metadata for `primitive_null` should be the same 3 bytes as other primitive
 cp primitive_int8.metadata primitive_null.metadata
 ```
 
+The value for a primitive should be a `value_header` and no `value_data`,
+resulting in a single `0` byte:
+
+```shell
+echo -n 'a' | tr a '\0' > primitive_null.value
+```
+
 [Variant]: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
 [primitive types listed in the spec]: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#value-data-for-primitive-type-basic_type0
diff --git a/variant/primitive_null.value b/variant/primitive_null.value
index e69de29..f76dd23 100644
Binary files a/variant/primitive_null.value and b/variant/primitive_null.value differ
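
A note on the command this change adds to the README: the behavior of `echo -n` is not guaranteed to be the same in every shell, so the same one-byte file could also be produced with `printf`. The sketch below is an assumed equivalent, not part of this change. It writes to a temporary directory (a hypothetical scratch location) so that the checked-in `primitive_null.value` is left untouched, then reports the size and byte value of the result.

```shell
# Hypothetical scratch location, used so the real test file is not overwritten.
tmpdir=$(mktemp -d)
out="$tmpdir/primitive_null.value"

# Write exactly one NUL byte: a value_header with no value_data.
printf '\0' > "$out"

# Report the file size in bytes and its contents in hex.
size=$(wc -c < "$out" | tr -d ' ')
hex=$(od -An -tx1 "$out" | tr -d ' \n')
echo "size=$size hex=$hex"
```

If the file matches what the README describes, the reported size is one byte and the hex dump is a single zero byte.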