Check `field_is_present` always passes for CSV/Parquet files

For CSV and Parquet files that are read through DuckDB (S3, GCS, Azure, and the local file system), the check `field_is_present` always passes, regardless of whether the column is actually present in the file.

**Cause:** In `create_view_with_schema_union(...)`, we create an empty table based on the data contract and later insert the actual data, e.g. from a CSV file. If a column is missing from the CSV file, it still exists in the checked table (filled with NULLs), so the presence check finds it.
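The mechanism can be reproduced in a few lines. This is only a minimal sketch: it uses Python's built-in `sqlite3` as a stand-in for DuckDB, and the table name `orders`, the contract fields, and the CSV content are made up for illustration. It does not reproduce the actual code of `create_view_with_schema_union(...)`.

```python
import csv
import io
import sqlite3

# Hypothetical contract with two fields; "email" is the optional one.
contract_fields = {"id": "INTEGER", "email": "TEXT"}

# Hypothetical CSV data that lacks the optional "email" column.
csv_text = "id\n1\n2\n"

con = sqlite3.connect(":memory:")

# Step 1: create an empty table from the contract, not from the file.
columns_sql = ", ".join(f'"{name}" {dtype}' for name, dtype in contract_fields.items())
con.execute(f"CREATE TABLE orders ({columns_sql})")

# Step 2: insert the file's rows by column name; absent columns stay NULL.
reader = csv.DictReader(io.StringIO(csv_text))
file_columns = reader.fieldnames
col_list = ", ".join(f'"{c}"' for c in file_columns)
placeholders = ", ".join("?" for _ in file_columns)
for row in reader:
    con.execute(
        f"INSERT INTO orders ({col_list}) VALUES ({placeholders})",
        [row[c] for c in file_columns],
    )

# Step 3: a presence check against the table sees the contract's columns,
# not the file's columns.
table_columns = [r[1] for r in con.execute("PRAGMA table_info(orders)")]
email_in_table = "email" in table_columns
email_in_file = "email" in file_columns
emails = [r[0] for r in con.execute("SELECT email FROM orders")]

print(email_in_table, email_in_file, emails)
```

The check against the table reports `email` as present even though the file header does not contain it, and every value in that column is NULL.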

**Possible solutions:**

1. Remove the `field_is_present` check (for non-required fields). However, it is useful to check for field presence as discussed in #1018.
2. Fix the check and enforce field presence in the header even for non-required fields. This would break, e.g., `test_csv_optional_field_missing_from_old_data()` in `test_test_schema_evolution.py`, which assumes that a CSV file lacking a non-required field should pass all tests.
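To make the trade-off between the two options concrete, here is a rough sketch of a presence check that looks at the CSV header itself rather than at the table built from the contract. The helper `missing_fields` and the `{name: required}` contract shape are hypothetical and not part of datacontract-cli. Reporting required and optional fields separately would let a caller fail on the former and only warn on the latter, but whether that is the desired behaviour is exactly the open question.

```python
import csv
import io


def missing_fields(csv_text, contract_fields):
    """Return (missing_required, missing_optional) for a CSV header.

    contract_fields maps field name -> True if the field is required.
    This is a hypothetical helper for illustration only.
    """
    # Read only the header row; an empty file yields an empty header.
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = set(header)
    missing_required = [
        name for name, required in contract_fields.items()
        if required and name not in present
    ]
    missing_optional = [
        name for name, required in contract_fields.items()
        if not required and name not in present
    ]
    return missing_required, missing_optional


# Hypothetical contract: "id" is required, "email" is optional.
contract = {"id": True, "email": False}
result = missing_fields("id\n1\n2\n", contract)
print(result)
```

With the old-data CSV above, the required list is empty and `email` shows up only in the optional list, so a strict reading of option 2 would fail this file while a lenient one would pass it with a warning.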

Check `field_is_present` always passes for CSV/Parquet files #1065