diff --git a/format/Layout.md b/format/Layout.md index c393163bf89..2d46ece606e 100644 --- a/format/Layout.md +++ b/format/Layout.md @@ -42,7 +42,7 @@ Base requirements * Capable of representing fully-materialized and decoded / decompressed Parquet data * All leaf nodes (primitive value arrays) use contiguous memory regions -* Each relative type can be nullable or non-nullable +* Any relative type can be have null slots * Arrays are immutable once created. Implementations can provide APIs to mutate an array, but applying mutations will require a new array data structure to be built. @@ -56,7 +56,7 @@ Base requirements * To describe relative types (physical value types and a preliminary set of nested types) sufficient for an unambiguous implementation * Memory layout and random access patterns for each relative type -* Null representation for nullable types +* Null value representation ## Non-goals (for this document @@ -79,28 +79,55 @@ Base requirements Any array has a known and fixed length, stored as a 32-bit signed integer, so a maximum of 2^31 - 1 elements. We choose a signed int32 for a couple reasons: -* Enhance compatibility with Java and client languages which may have varying quality of support for unsigned integers. +* Enhance compatibility with Java and client languages which may have varying + quality of support for unsigned integers. * To encourage developers to compose smaller arrays (each of which contains contiguous memory in its leaf nodes) to create larger array structures possibly exceeding 2^31 - 1 elements, as opposed to allocating very large contiguous memory blocks. -## Nullable and non-nullable arrays +## Null count -Any relative type can be nullable or non-nullable. +The number of null value slots is a property of the physical array and +considered part of the data structure. The null count is stored as a 32-bit +signed integer, as it may be as large as the array length. -Nullable arrays have a contiguous memory buffer, known as the null bitmask, -whose length is large enough to have 1 bit for each array slot. Whether any -array slot is null is encoded in the respective bits of this bitmask, i.e.: +## Null bitmaps + +Any relative type can have null value slots, whether primitive or nested type. + +An array with nulls must have a contiguous memory buffer, known as the null (or +validity) bitmap, whose length is a multiple of 8 bytes (to avoid +word-alignment concerns) and large enough to have at least 1 bit for each array +slot. + +Whether any array slot is valid (non-null) is encoded in the respective bits of +this bitmap. A 1 (set bit) for index `j` indicates that the value is not null, +while a 0 (bit not set) indicates that it is null. Bitmaps are to be +initialized to be all unset at allocation time. ``` -is_null[j] -> bitmask[j / 8] & (1 << (j % 8)) +is_valid[j] -> bitmap[j / 8] & (1 << (j % 8)) ``` -Physically, non-nullable (NN) arrays do not have a null bitmask. +We use [least-significant bit (LSB) numbering][1] (also known as +bit-endianness). This means that within a group of 8 bits, we read +right-to-left: -For nested types, if the top-level nested type is nullable, it has its own -bitmask regardless of whether the child types are nullable. +``` +values = [0, 1, null, 2, null, 3] + +bitmap +j mod 8 7 6 5 4 3 2 1 0 + 0 0 1 0 1 0 1 1 +``` + +Arrays having a 0 null count may choose to not allocate the null +bitmap. Implementations may choose to always allocate one anyway as a matter of +convenience, but this should be noted when memory is being shared. + +Nested type arrays have their own null bitmap and null count regardless of +the null count and null bits of their child arrays. ## Primitive value arrays @@ -112,9 +139,8 @@ Internally, the array contains a contiguous memory buffer whose total size is equal to the slot width multiplied by the array length. For bit-packed types, the size is rounded up to the nearest byte. -The associated null bitmask (for nullable types) is contiguously allocated (as -described above) but does not need to be adjacent in memory to the values -buffer. +The associated null bitmap is contiguously allocated (as described above) but +does not need to be adjacent in memory to the values buffer. (diagram not to scale) @@ -180,22 +206,22 @@ For example, the struct (field names shown here as strings for illustration purposes) ``` -Struct [nullable] < - name: String (= List) [nullable], - age: Int32 [not-nullable] +Struct < + name: String (= List), + age: Int32 > ``` -has two child arrays, one List array (layout as above) and one -non-nullable 4-byte physical value array having Int32 (not-null) logical -type. Here is a diagram showing the full physical layout of this struct: +has two child arrays, one List array (layout as above) and one 4-byte +physical value array having Int32 logical type. Here is a diagram showing the +full physical layout of this struct: While a struct does not have physical storage for each of its semantic slots (i.e. each scalar C-like struct), an entire struct slot can be set to null via -the bitmask. Whether each of the child field arrays can have null values -depends on whether or not the respective relative type is nullable. +the null bitmap. Any of the child field arrays can have null values according +to their respective independent null bitmaps. ## Dense union type @@ -233,8 +259,7 @@ Here is a diagram of an example dense union: A sparse union has the same structure as a dense union, with the omission of the offsets array. In this case, the child arrays are each equal in length to -the length of the union. This is analogous to a large struct in which all -fields are nullable. +the length of the union. While a sparse union may use significantly more space compared with a dense union, it has some advantages that may be desirable in certain use cases: @@ -251,3 +276,5 @@ the correct value. ## References Drill docs https://drill.apache.org/docs/value-vectors/ + +[1]: https://en.wikipedia.org/wiki/Bit_numbering \ No newline at end of file diff --git a/format/Message.fbs b/format/Message.fbs index 3ffd2033208..fc849eedf79 100644 --- a/format/Message.fbs +++ b/format/Message.fbs @@ -129,8 +129,8 @@ struct FieldNode { length: int; /// The number of observed nulls. Fields with null_count == 0 may choose not - /// to write their physical null bitmap out as a materialized buffer, instead - /// setting the length of the null buffer to 0. + /// to write their physical validity bitmap out as a materialized buffer, + /// instead setting the length of the bitmap buffer to 0. null_count: int; } @@ -148,9 +148,9 @@ table RecordBatch { /// Buffers correspond to the pre-ordered flattened buffer tree /// /// The number of buffers appended to this list depends on the schema. For - /// example, most primitive arrays will have 2 buffers, 1 for the null bitmap - /// and 1 for the values. For struct arrays, there will only be a single - /// buffer for the null bitmap + /// example, most primitive arrays will have 2 buffers, 1 for the validity + /// bitmap and 1 for the values. For struct arrays, there will only be a + /// single buffer for the validity (nulls) bitmap buffers: [Buffer]; } diff --git a/format/README.md b/format/README.md index 1120e6282a5..c84e00772c3 100644 --- a/format/README.md +++ b/format/README.md @@ -3,3 +3,20 @@ > **Work-in-progress specification documents**. These are discussion documents > created by the Arrow developers during late 2015 and in no way represents a > finalized specification. + +Currently, the Arrow specification consists of these pieces: + +- Physical memory layout specification (see Layout.md) +- Metadata serialized representation (see Message.fbs) + +The metadata currently uses Google's [flatbuffers library][1] for serializing a +couple related pieces of information: + +- Schemas for tables or record (row) batches. This contains the logical types, + field names, and other metadata. Schemas do not contain any information about + actual data. +- *Data headers* for record (row) batches. These must correspond to a known + schema, and enable a system to send and receive Arrow row batches in a form + that can be precisely disassembled or reconstructed. + +[1]: http://github.com/google/flatbuffers \ No newline at end of file diff --git a/format/diagrams/layout-list-of-struct.png b/format/diagrams/layout-list-of-struct.png index 00d6c6fa441..fb6f2a27e07 100644 Binary files a/format/diagrams/layout-list-of-struct.png and b/format/diagrams/layout-list-of-struct.png differ