Closed
Description
Currently, when we deserialize an Avro manifest, we get the PartitionSpec by deserializing partition-spec into a Vec<PartitionField>, then parsing partition-spec-id from its string form into the spec ID (defaulting to 0 when the key is absent), and then building a PartitionSpec out of these two parts:
iceberg-rust/crates/iceberg/src/spec/manifest.rs
Lines 744 to 773 in 854171d
    let partition_spec = {
        let fields = {
            let bs = meta.get("partition-spec").ok_or_else(|| {
                Error::new(
                    ErrorKind::DataInvalid,
                    "partition-spec is required in manifest metadata but not found",
                )
            })?;
            serde_json::from_slice::<Vec<PartitionField>>(bs).map_err(|err| {
                Error::new(
                    ErrorKind::DataInvalid,
                    "Fail to parse partition spec in manifest metadata",
                )
                .with_source(err)
            })?
        };
        let spec_id = meta
            .get("partition-spec-id")
            .map(|bs| {
                String::from_utf8_lossy(bs).parse().map_err(|err| {
                    Error::new(
                        ErrorKind::DataInvalid,
                        "Fail to parse partition spec id in manifest metadata",
                    )
                    .with_source(err)
                })
            })
            .transpose()?
            .unwrap_or(0);
        PartitionSpec { spec_id, fields }
But the Iceberg spec expects partition-spec to be encoded as an object like this: https://iceberg.apache.org/spec/?h=avro#partition-specs
In reality we need to deserialize partition-spec directly into a PartitionSpec (pub struct PartitionSpec) rather than into a Vec<PartitionField>.
I can submit a PR to fix. But, we might want to:
- First try to parse partition-spec into a PartitionSpec as per the spec,
- If that fails, revert to the current behaviour
How should we do this?
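One possible shape for the two-step approach, as a rough sketch rather than a proposal for the final code: try the object form first, and only if that fails fall back to the field-list form combined with the separately parsed spec ID. Everything here is a stand-in: the trimmed-down `PartitionField` and `PartitionSpec` types, the `parse_partition_spec` helper, and the `String` error type are hypothetical, and the two JSON parsers are passed in as closures so the sketch stays self-contained. In the real code they would presumably be `serde_json::from_slice::<PartitionSpec>` and `serde_json::from_slice::<Vec<PartitionField>>`.

```rust
// Hypothetical, trimmed-down stand-ins for the real types in the spec module.
#[derive(Debug, PartialEq)]
struct PartitionField {
    name: String,
}

#[derive(Debug, PartialEq)]
struct PartitionSpec {
    spec_id: i32,
    fields: Vec<PartitionField>,
}

/// Try to parse `bs` as a full spec object first; if that fails, fall back to
/// the current behaviour (a bare list of fields plus a separately parsed id).
///
/// `parse_object` and `parse_fields` are injected so this sketch does not
/// depend on any JSON library; errors are plain strings for the same reason.
fn parse_partition_spec<P, F>(
    bs: &[u8],
    fallback_spec_id: i32,
    parse_object: P,
    parse_fields: F,
) -> Result<PartitionSpec, String>
where
    P: Fn(&[u8]) -> Result<PartitionSpec, String>,
    F: Fn(&[u8]) -> Result<Vec<PartitionField>, String>,
{
    parse_object(bs).or_else(|object_err| {
        parse_fields(bs)
            .map(|fields| PartitionSpec {
                spec_id: fallback_spec_id,
                fields,
            })
            // Report both failures so neither cause is hidden from the caller.
            .map_err(|fields_err| {
                format!(
                    "partition-spec is neither a spec object ({object_err}) \
                     nor a field list ({fields_err})"
                )
            })
    })
}
```

With this shape, the spec ID embedded in the object wins whenever the object form parses, and partition-spec-id is only consulted on the fallback path. Whether that precedence is what we want is part of the question.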
Fokko