native_datafusion: case-insensitive mode doesn't detect duplicate/ambiguous Parquet fields #3760

## Description

When Spark SQL tests run with the `native_datafusion` scan, tests that expect an error for duplicate or ambiguous fields in case-insensitive mode fail, because DataFusion's Parquet reader does not enforce Spark's case-sensitivity validation rules.

## Affected Tests

### `Spark native readers should respect spark.sql.caseSensitive` (FileBasedDataSourceSuite)

Writes a Parquet file with columns `A`, `b`, `B`, then reads it with `caseSensitive=false`. Spark expects a `SparkException` when selecting `b` (ambiguous between `b` and `B`), but `native_datafusion` reads the file without error.

### `SPARK-25207: exception when duplicate fields in case-insensitive mode` (ParquetFilterSuite, V1 and V2)

Writes Parquet with columns `A`, `B`, `b`, then reads with `caseSensitive=false`. Spark expects a `SparkException` with cause `RuntimeException` containing `Found duplicate field(s) "B": [B, b]`. The native reader either doesn't detect the duplicate, or wraps the error with a different exception type/cause than expected.
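
For illustration, the validation both tests exercise could look roughly like the sketch below. It is not code from Comet, DataFusion, or Spark: `resolve_field` is a hypothetical helper, and the error text only mirrors the message quoted above, on the assumption that the matching fields are listed in file order. The exception type and cause that the tests check are not modeled here.

```rust
/// Hypothetical helper (not an existing Comet or DataFusion API): resolve a
/// requested column name against the physical field names of a Parquet file.
///
/// In case-sensitive mode only an exact match counts. In case-insensitive
/// mode, more than one match is reported as an error instead of silently
/// picking one of the candidates.
fn resolve_field<'a>(
    file_fields: &[&'a str],
    requested: &str,
    case_sensitive: bool,
) -> Result<Option<&'a str>, String> {
    if case_sensitive {
        return Ok(file_fields.iter().copied().find(|f| *f == requested));
    }

    // Collect every physical field whose lowercased name equals the request.
    let lowered = requested.to_lowercase();
    let matches: Vec<&str> = file_fields
        .iter()
        .copied()
        .filter(|f| f.to_lowercase() == lowered)
        .collect();

    match matches.len() {
        0 => Ok(None),
        1 => Ok(Some(matches[0])),
        // Assumed message format, modeled on the text the test expects.
        _ => Err(format!(
            "Found duplicate field(s) \"{}\": [{}]",
            requested,
            matches.join(", ")
        )),
    }
}
```

With file fields `A`, `B`, `b`, calling `resolve_field(&["A", "B", "b"], "B", false)` returns the duplicate-field error, while the same call with `case_sensitive` set to `true` resolves to `B` alone.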

## Context

PR #3687 added a fallback from `native_datafusion` for duplicate fields in case-insensitive mode, avoiding the test failures by falling back to the Spark reader. These tests remain ignored because the native reader itself doesn't implement the validation.

## Related

- #3687: fall back from native_datafusion for duplicate fields in case-insensitive mode
- Split from #3311
