Conversation
datafusion/src/scalar.rs
Outdated
| IntervalYearMonth(Option<i32>), | ||
| /// Interval with DayTime unit | ||
| IntervalDayTime(Option<i64>), | ||
|
|
There was a problem hiding this comment.
nitpick. Delete the blank line to keep the same as above.
alamb
left a comment
There was a problem hiding this comment.
Thank you @jonmmease -- this looks great and I am excited to extend DataFusion's type system support to include better handling of structs. @Igosuki and @houqp have been working to add the ability to select fields in https://github.com/apache/arrow-datafusion/pull/1006/files and it seems to me that this code sets the foundation for field access to struct types as well.
There appears to be a small rustfmt issue preventing CI from running on this PR: https://github.com/apache/arrow-datafusion/pull/1091/checks?check_run_id=3850128421
(I think it can be resolved by running cargo fmt and checking in the result)
Otherwise, if you could just add a test for a nested struct I think this PR is ready to go
datafusion/src/scalar.rs
Outdated
| let mut fields: Vec<Field> = Vec::new(); | ||
| let mut scalars: Vec<ScalarValue> = Vec::new(); | ||
|
|
||
| value.iter().for_each(|(name, scalar)| { | ||
| fields.push(Field::new(name, scalar.get_datatype(), false)); | ||
| scalars.push(scalar.clone()); | ||
| }); |
There was a problem hiding this comment.
You might be able to write this in a more functional style (and avoid mut) using unzip: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.unzip
Something like (untested)
let (fields, scalars) = value.iter()
.for_each(|(name, scalar)| {
(Field::new(name, scalar.get_datatype(), false),
scalar.clone())
})
.unzip();
There was a problem hiding this comment.
to be clear I don't think this change is needed for this PR, I just wanted to point it out -- it took me quite a while to get comfortable with the functional parts of rust -- at first I thought it was just style preference, but then the longer I work in rust I realize it is one of the key mechanisms rust uses to eliminate bounds checks during iteration
There was a problem hiding this comment.
Ahh, I hadn't realized Rust had an unzip. Thanks for the suggestion. Done in 459c029
| )), | ||
| None | ||
| ); | ||
|
|
There was a problem hiding this comment.
These are very nice and thorough tests.
The only thing I didn't see a test for was struct recursion -- like
A : {
B: "foo"
C: "bar"
}
I wonder if you could extend the A, B, C example below to include a nested struct as well?
|
Thanks for the feedback @alamb and @xudong963! I'll do a round of updates tomorrow |
Yeah, that took care of it. Thanks! |
alamb
left a comment
There was a problem hiding this comment.
Looks great @jonmmease -- thank you! I just kicked off the CI checks and if those succeed I'll plan to merge this in
|
Looks great -- thanks again for the contribution @jonmmease ! |
Which issue does this PR close?
Closes #602
Rationale for this change
DataFusion already works well, in many cases, with Arrow
StructArraycolumns. This PR adds a correspondingScalarValuevariant that can be used to represent a single element of aStructArray.What changes are included in this PR?
This is currently a fairly minimal PR that adds a new
ScalarValue::Structvariant, largely following the existing pattern for theScalarValue::Listvariant. TheStructvariant should support the same feature set asScalarValue::List.Let me know if there are uses of
ScalarValuethroughout the rest of DataFusion that should be updated to include struct support before these changes are merged. Thanks!Are there any user-facing changes?
Yes, a new
ScalarValueenumeration variant