Add more docs and examples for ListArray and OffsetsBuffer#4607
Add more docs and examples for ListArray and OffsetsBuffer#4607tustvold merged 17 commits intoapache:masterfrom
Conversation
izveigor
left a comment
There was a problem hiding this comment.
LGTM! I think it is essential information for new arrow-rs users.
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
…o alamb/list_array_docs
|
I'll merge this later in the week if there are no more comments cc @jayzhan211 @smiklos and @vincev who may also be interested |
|
Would it make sense to also document |
FixedSizeListArray stores value the same as ListArray but each row has the same size of array. |
This isn't entirely accurate, there is no set of offsets which means that all slots have the same width. In particular this includes nulls, which can cause some interesting challenges. |
arrow-array/src/array/list_array.rs
Outdated
| /// ┌─────────────┐ | ||
| /// │ [A,B,C] │ | ||
| /// ├─────────────┤ | ||
| /// │ [] (empty) │ |
There was a problem hiding this comment.
This is a little confusing to me.
(empty) is just the explanation to []? Showing that [] is empty array?
There was a problem hiding this comment.
I think just leaving [] might be better. Maybe an explanation in the comment!
There was a problem hiding this comment.
Good idea -- I will make that change.
Edit: Done in 8e6a56d
Based on the following array GenericListArray::values will look like this But FixedSizeArray::values would look like this |
Yes it absolutely would make sense. If you have time that would be great. Otherwise I'll try and make a diagram tomorrow |
arrow-array/src/array/list_array.rs
Outdated
| /// let mut builder = ListBuilder::new(values_builder); | ||
| /// | ||
| /// // [A, B, C] | ||
| /// builder.values().append_value("A"); |
There was a problem hiding this comment.
I was thinking that perhaps the example doc of using the builder pattern should go to GenericListBuilder and here we could use the from or try_new_from_array_data to construct the array directly from values array and nulls.
FixedSizeListArray ist documented like this, so I'm in between adding builder based code there or keep the two separated. We could just link to the builder based approach from Generic/Fixed list array
There was a problem hiding this comment.
I was torn on this, as keeping the example (that constructs the data in the diagram) I thought added value. However I think your idea of adding a link to the approach would be good.
I implemented this change in 151564d
I added this pr, largely based on the art in this pr. |
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Which issue does this PR close?
N/A
Rationale for this change
I personally find
ListArrayconfusing both for myself and to explain to others. Recently I made some pictures while reviewing and discussing with @izveigor in DataFusion apache/datafusion#6936 (comment) and want to add them to the docs so I can keep referring to themI also think it would be nice to be able to use ListArray without having to refer to the Arrow spec
What changes are included in this PR?
Add more docs and examples for ListArray and OffsetsBuffer
With these changes I think someone has a plausible chance of understanding how to use
ListArraywithout having to refer to the Arrow specAre there any user-facing changes?
Better docs
No API changes