
[Variant] Variant::Object can contain two fields with the same field name #7730

@friendlymatthew

Describe the bug
Today, the builder lets users append multiple fields with the same field name to a single object, and the resulting Variant::Object contains both entries.

Per the Parquet Variant spec:

Field names are case-sensitive. Field names are required to be unique for each object. It is an error for an object to contain two fields with the same name, whether or not they have distinct dictionary IDs.
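
To make the rule concrete, here is a minimal sketch of a uniqueness check over a list of field names. It uses only the standard library; `first_duplicate_field_name` is a hypothetical helper for illustration and is not part of the parquet-variant crate. The comparison is exact, which matches the spec's case-sensitivity requirement.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not a parquet-variant API): returns the first field
/// name that appears more than once, or `None` if all names are unique.
/// Comparison is exact, so "name" and "Name" count as different fields.
fn first_duplicate_field_name<'a>(names: &[&'a str]) -> Option<&'a str> {
    let mut seen = HashSet::new();
    // `HashSet::insert` returns false when the value was already present.
    names.iter().copied().find(|name| !seen.insert(*name))
}
```

With this sketch, `["name", "age", "name"]` reports `"name"` as a duplicate, while `["name", "Name"]` is accepted because the two names differ in case.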

To Reproduce

#[test]
fn test_object_duplicate_field_names() {
    let mut builder = VariantBuilder::new();

    {
        let mut obj = builder.new_object();
        obj.append_value("name", "John");
        obj.append_value("name", "Alice");
        obj.finish();
    }

    let (metadata, value) = builder.finish();
    assert!(!metadata.is_empty());
    assert!(!value.is_empty());

    let Variant::Object(obj_variant) = Variant::try_new(&metadata, &value).unwrap() else {
        panic!("expected Variant::Object")
    };

    /*
    [parquet-variant/src/builder.rs:760:9] obj_variant.iter().collect::<Vec<_>>() = [
        (
            "name",
            ShortString(
                ShortString(
                    "John",
                ),
            ),
        ),
        (
            "name",
            ShortString(
                ShortString(
                    "Alice",
                ),
            ),
        ),
    ]
    */
    dbg!(obj_variant.iter().collect::<Vec<_>>());
}

Expected behavior
The second obj.append_value call should return an error instead of silently adding a duplicate field.
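
One possible shape for that behavior, sketched with a toy builder rather than the real one: `ObjectSketch` and `DuplicateFieldError` are hypothetical names invented for this example, and the actual `append_value` signature in parquet-variant may differ. The sketch assumes the builder tracks the names it has already seen and rejects a repeat before storing anything.

```rust
use std::collections::HashSet;
use std::fmt;

/// Hypothetical error type for this sketch; not part of parquet-variant.
#[derive(Debug, PartialEq)]
struct DuplicateFieldError(String);

impl fmt::Display for DuplicateFieldError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "duplicate field name: {}", self.0)
    }
}

impl std::error::Error for DuplicateFieldError {}

/// Toy stand-in for an object builder that stores string fields only.
#[derive(Default)]
struct ObjectSketch {
    seen: HashSet<String>,
    fields: Vec<(String, String)>,
}

impl ObjectSketch {
    /// Appends a field, or returns an error if `name` was already used.
    /// Nothing is stored when the name is a duplicate.
    fn append_value(&mut self, name: &str, value: &str) -> Result<(), DuplicateFieldError> {
        // `insert` returns false if the name is already in the set.
        if !self.seen.insert(name.to_string()) {
            return Err(DuplicateFieldError(name.to_string()));
        }
        self.fields.push((name.to_string(), value.to_string()));
        Ok(())
    }
}
```

In this sketch, appending "name" twice succeeds the first time and fails the second, leaving only the "John" entry in the object. Whether the real builder should return a `Result` from each append or defer the check to `finish` is a design choice this example does not settle.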

Metadata

Labels

bug, parquet (Changes to the parquet crate)
