@yshcz yshcz commented Dec 31, 2025

Closes #14926.

While implementing the spec, I looked into how other implementations detect the format version of manifest list files and found notable differences in approach:

| Implementation | Description |
| --- | --- |
| Java | Uses Avro schema resolution without a version-specific read path; always uses the same schema, which includes all fields up to V3+ |
| Python | Similar to Java; uses schema resolution with a fixed read schema |
| Rust | Uses the current version of the table metadata and branches on it, though a TODO comment notes this may not be needed |
| Go | Reads `format-version` from the Avro metadata and errors if it is missing, which does not strictly follow the spec since this field is not required for manifest lists |
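
To illustrate the schema-resolution approach attributed to Java and Python above: fields in the read schema that the writer did not produce are filled from defaults, and writer fields unknown to the reader are dropped. Below is a minimal sketch of that idea in plain Python, not tied to any Avro library. `READ_DEFAULTS`, `REQUIRED`, and `resolve_entry` are hypothetical names, the field list is abbreviated, and the defaults shown are assumptions for illustration.

```python
# Sentinel marking a field that has no default and must be present in the file.
REQUIRED = object()

# Hypothetical, abbreviated read schema: field name -> default value used when
# the writer schema does not contain the field.
READ_DEFAULTS = {
    "manifest_path": REQUIRED,
    "manifest_length": REQUIRED,
    "partition_spec_id": REQUIRED,
    "sequence_number": 0,    # assumed default for files written before v2
    "first_row_id": None,    # assumed to be absent before v3
}


def resolve_entry(written: dict) -> dict:
    """Project one decoded manifest list entry onto the single read schema."""
    resolved = {}
    for name, default in READ_DEFAULTS.items():
        if name in written:
            resolved[name] = written[name]
        elif default is REQUIRED:
            raise ValueError(f"missing required field: {name}")
        else:
            resolved[name] = default
    # Fields that only the writer knows about are intentionally ignored.
    return resolved
```

With this shape, the reader never branches on the format version: an entry written by an older writer simply comes back with the defaults filled in.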

This PR adds an implementation note to Appendix F to document the recommended approach and guide future implementations. It explains how Avro schema resolution enables implementations to use a single read schema without version detection, how to determine the exact format version when needed (by examining the writer schema), and why format-version in Avro metadata cannot be reliably used for manifest lists.
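
As a rough illustration of determining the version from the writer schema: an Avro object container file stores the writer schema as JSON under the `avro.schema` key of its header metadata. The sketch below assumes the header metadata has already been read into a dict, and assumes that the presence of `sequence_number` marks a v2 writer schema and `first_row_id` a v3 one. `infer_manifest_list_version` is a hypothetical helper, not the wording or algorithm of the note itself.

```python
import json


def infer_manifest_list_version(avro_metadata: dict) -> int:
    """Guess the manifest list format version from the Avro writer schema.

    `avro_metadata` is the file header metadata map (str -> bytes or str).
    Returns the lowest version consistent with the fields that are present,
    so a writer that omits an optional newer field is reported as older.
    """
    schema = json.loads(avro_metadata["avro.schema"])
    names = {field["name"] for field in schema.get("fields", [])}
    if "first_row_id" in names:     # assumed to appear only in v3 schemas
        return 3
    if "sequence_number" in names:  # assumed to appear from v2 onward
        return 2
    return 1
```

The function deliberately ignores any `format-version` key in the metadata, since that field is not required for manifest lists.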

If manifest lists are removed in v4, this note would only apply to v1 through v3. I'm not sure how to frame this, so any guidance from reviewers would be appreciated.

github-actions bot added the Specification label (Issues that may introduce spec changes.) on Dec 31, 2025

Development

Successfully merging this pull request may close these issues.

Spec: Add implementation note for determining manifest list format version
