Fuzzing Crash: Registry missing encoding 'vortex.zstd' during file deserialization #5614

@github-actions

Fuzzing Crash Report

Analysis

Crash Location: vortex-array/src/context.rs:try_from_registry

Error Message:

Registry missing encoding with id vortex.zstd

Stack Trace:

#16 vortex_error::VortexUnwrap::vortex_unwrap /home/runner/_work/vortex/vortex/vortex-error/src/lib.rs:328:14
#17 file_io::_::__libfuzzer_sys_run /home/runner/_work/vortex/vortex/fuzz/fuzz_targets/file_io.rs:84:10
#18 rust_fuzzer_test_input
#19 libfuzzer_sys::test_input_wrap
#20 LLVMFuzzerTestOneInput

Root Cause:

The fuzzer generated a Vortex file that references the vortex.zstd encoding, but the encoding registry used by SESSION does not have that encoding registered when the file is deserialized and scanned. The failure surfaces at file_io.rs:84, in the call to .scan().vortex_unwrap().

The error originates in vortex-array/src/context.rs:41, where VTableContext::try_from_registry cannot find the encoding ID in the registry and returns an error. The fuzzer appears to write files using encodings that are not available in the default session's registry when the file is read back.
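The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the Vortex implementation: the `Registry` type, its `register` method, the simplified `try_from_registry` signature, and the `example.plain` encoding id are all hypothetical stand-ins. Only the `vortex.zstd` id and the error text are taken from this report.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an encoding registry: maps an encoding id
/// to the name of the decoder that handles it. Not the real Vortex type.
#[derive(Default)]
struct Registry {
    encodings: HashMap<String, &'static str>,
}

impl Registry {
    /// Make an encoding id known to this registry.
    fn register(&mut self, id: &str, decoder: &'static str) {
        self.encodings.insert(id.to_string(), decoder);
    }
}

/// Simplified, hypothetical analogue of a registry lookup at read time:
/// resolve every encoding id referenced by a file, and fail on the first
/// id the reader's registry does not know about.
fn try_from_registry(
    registry: &Registry,
    ids: &[&str],
) -> Result<Vec<&'static str>, String> {
    ids.iter()
        .map(|id| {
            registry
                .encodings
                .get(*id)
                .copied()
                .ok_or_else(|| format!("Registry missing encoding with id {id}"))
        })
        .collect()
}
```

In this sketch, a reader registry that only knows `example.plain` returns an error when asked to resolve a file that lists `vortex.zstd`, and the same lookup succeeds once the id is registered. Under that assumption, either the reader's registry needs every encoding the writer may emit, or the writer must be restricted to encodings the reader knows.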

Debug Output
FuzzFileAction {
    array: ChunkedArray {
        dtype: Utf8(
            NonNullable,
        ),
        len: 1,
        chunk_offsets: PrimitiveArray {
            dtype: Primitive(
                U64,
                NonNullable,
            ),
            buffer: Buffer<u8> {
                length: 24,
                alignment: Alignment(
                    8,
                ),
                as_slice: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, ...],
            },
            validity: NonNullable,
            stats_set: ArrayStats {
                inner: RwLock {
                    data: StatsSet {
                        values: [],
                    },
                },
            },
        },
        chunks: [
            VarBinArray {
                dtype: Utf8(
                    NonNullable,
                ),
                bytes: Buffer<u8> {
                    length: 0,
                    alignment: Alignment(
                        1,
                    ),
                    as_slice: [],
                },
                offsets: PrimitiveArray {
                    dtype: Primitive(
                        U32,
                        NonNullable,
                    ),
                    buffer: Buffer<u8> {
                        length: 8,
                        alignment: Alignment(
                            4,
                        ),
                        as_slice: [0, 0, 0, 0, 0, 0, 0, 0],
                    },
                    validity: NonNullable,
                    stats_set: ArrayStats {
                        inner: RwLock {
                            data: StatsSet {
                                values: [
                                    (
                                        IsSorted,
                                        Exact(
                                            ScalarValue(
                                                Bool(
                                                    true,
                                                ),
                                            ),
                                        ),
                                    ),
                                ],
                            },
                        },
                    },
                },
                validity: NonNullable,
                stats_set: ArrayStats {
                    inner: RwLock {
                        data: StatsSet {
                            values: [],
                        },
                    },
                },
            },
            VarBinArray {
                dtype: Utf8(
                    NonNullable,
                ),
                bytes: Buffer<u8> {
                    length: 0,
                    alignment: Alignment(
                        1,
                    ),
                    as_slice: [],
                },
                offsets: PrimitiveArray {
                    dtype: Primitive(
                        U32,
                        NonNullable,
                    ),
                    buffer: Buffer<u8> {
                        length: 4,
                        alignment: Alignment(
                            4,
                        ),
                        as_slice: [0, 0, 0, 0],
                    },
                    validity: NonNullable,
                    stats_set: ArrayStats {
                        inner: RwLock {
                            data: StatsSet {
                                values: [
                                    (
                                        IsSorted,
                                        Exact(
                                            ScalarValue(
                                                Bool(
                                                    true,
                                                ),
                                            ),
                                        ),
                                    ),
                                ],
                            },
                        },
                    },
                },
                validity: NonNullable,
                stats_set: ArrayStats {
                    inner: RwLock {
                        data: StatsSet {
                            values: [],
                        },
                    },
                },
            },
        ],
        stats_set: ArrayStats {
            inner: RwLock {
                data: StatsSet {
                    values: [],
                },
            },
        },
    },
    projection_expr: None,
    filter_expr: None,
    compressor_strategy: Compact,
}

Reproduction

  1. Download the crash artifact:

  2. Reproduce locally:

     # The artifact contains file_io/crash-2265912211260523fbed4825c2a8583ef884a4c7
     cargo +nightly fuzz run --sanitizer=none file_io file_io/crash-2265912211260523fbed4825c2a8583ef884a4c7

  3. Get full backtrace:

     RUST_BACKTRACE=full cargo +nightly fuzz run --sanitizer=none file_io file_io/crash-2265912211260523fbed4825c2a8583ef884a4c7

Auto-created by fuzzing workflow with Claude analysis
