Skip to content

fix(python): crash when schema contains nested fixed_size_list or extension type#6107

Merged
Xuanwo merged 1 commit intolance-format:mainfrom
erandagan:fix/substrait-nested-fixed-size-list
Mar 6, 2026
Merged

fix(python): crash when schema contains nested fixed_size_list or extension type#6107
Xuanwo merged 1 commit intolance-format:mainfrom
erandagan:fix/substrait-nested-fixed-size-list

Conversation

@erandagan
Copy link
Copy Markdown
Contributor

@erandagan erandagan commented Mar 5, 2026

Closes #6106

AI Disclaimer: I have used Claude Code to help draft this PR and have manually reviewed its contents.

@github-actions github-actions Bot added bug Something isn't working python labels Mar 5, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 5, 2026

PR Review

Good bugfix — the recursive _needs_substrait_placeholder is a clean improvement over the previous flat is_fixed_size_list check, and the test covers the reported scenario well.

Two items to consider:

P1: Top-level field metadata not checked

_needs_substrait_placeholder checks f.metadata is not None for fields inside a struct, but the caller passes field.type (not the field itself):

for field in self.ds.schema:
    if _needs_substrait_placeholder(field.type):   # only inspects the DataType

If a top-level schema field itself has residual metadata={} (e.g. a top-level extension type column after a lance round-trip), this won't be caught. If substrait also rejects schema-level fields with non-None metadata, this would still crash. Worth verifying whether this is a real concern or if substrait only rejects metadata on struct sub-fields.

P1: Unrelated uv.lock churn

The lockfile diff is ~4300 lines of noise (requires-python changed from >=3.10 to >=3.9, revision bumped to 3, upload-time fields added everywhere). This appears to be an unrelated uv lock regeneration. Consider reverting the lockfile change and submitting it separately, or confirming this is intentional.

@erandagan erandagan force-pushed the fix/substrait-nested-fixed-size-list branch from 112b1cc to 72f40d6 Compare March 5, 2026 15:49
…nested types with placeholders

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@erandagan erandagan force-pushed the fix/substrait-nested-fixed-size-list branch from 72f40d6 to 48cf773 Compare March 5, 2026 15:50
Copy link
Copy Markdown
Collaborator

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this fix!

@Xuanwo Xuanwo merged commit 9314dc9 into lance-format:main Mar 6, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ScannerBuilder.filter crashes when schema contains extension types / fixed lists nested inside struct<...>

2 participants