[native_datafusion] [Spark SQL Tests] _tmp_metadata_row_index not populated #3317

@andygrove

Summary

Two Spark SQL tests fail because the native_datafusion scan does not populate the _tmp_metadata_row_index metadata column.

Failing Tests

  • FileMetadataStructRowIndexSuite: "reading _tmp_metadata_row_index - not present in a table": returns 0 instead of the expected row indices
  • FileMetadataStructRowIndexSuite: "reading _tmp_metadata_row_index - present in a table": returns 0 instead of the expected row indices

Root Cause

The native_datafusion scan does not generate row index metadata. As a result, the count returned in both tests is 0 instead of the expected 100 rows.

Possible Fix

In CometScanRule.nativeDataFusionScan(), detect when row index metadata columns are requested in the schema and fall back to native_iceberg_compat.
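
The decision logic for that check could look roughly like the sketch below. It is written in Python only to keep it short and self-contained; it is not Comet's Scala code. `Field`, `requests_row_index`, and `choose_scan_impl` are hypothetical names, and the exact-match comparison on the column name is an assumption (the real rule may need to follow Spark's case-sensitivity settings).

```python
from dataclasses import dataclass
from typing import List

# Column name taken from this issue's title.
ROW_INDEX_TEMP_COLUMN = "_tmp_metadata_row_index"


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field; only the name matters here."""
    name: str


def requests_row_index(required_schema: List[Field]) -> bool:
    """Return True if the scan's required schema asks for the row index column.

    Assumption: an exact, case-sensitive name match is sufficient.
    """
    return any(f.name == ROW_INDEX_TEMP_COLUMN for f in required_schema)


def choose_scan_impl(required_schema: List[Field]) -> str:
    """Pick a scan implementation name for the given required schema.

    Falls back to native_iceberg_compat when row indices are requested,
    as proposed above; otherwise keeps native_datafusion.
    """
    if requests_row_index(required_schema):
        return "native_iceberg_compat"
    return "native_datafusion"
```

For example, `choose_scan_impl([Field("id"), Field("_tmp_metadata_row_index")])` selects the fallback, while a schema containing only `id` keeps native_datafusion.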

Related

Discovered in CI for #3307 (enable native_datafusion in auto scan mode).
