Read in the Iceberg metadata

Every query in Iceberg starts with the table metadata. This is the JSON file that is written on each commit to an Iceberg table.

There are two versions of the table format (version three is underway):

1. Describes [Iceberg tables](https://iceberg.apache.org/spec/#version-1-analytic-data-tables)
2. Everything from version 1, with support for merge-on-read deletes.

I would suggest reading both V1 and V2 metadata into a single common structure in memory. This requires normalizing some fields:

- `schemas` is optional in V1, and `schema` is removed in V2. V1 kept only the current schema, while V2 preserves all historical schemas as well. When reading a V1 table, the schema from `schema` is added to `schemas`, and `current-schema-id` is set to the id of that newly added schema.
- The same applies to `partition-specs`: the single V1 `partition-spec` is added to `partition-specs`, and `default-spec-id` is set to point at it.
- When we read a V1 table, we'll add a `main` ref to the `refs` dict, pointing to the current snapshot.
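
To make the merging above concrete, here is a minimal sketch in Python (chosen only for brevity, the same shape should carry over to Rust). The function names `load_metadata` and `normalize_metadata`, the fallback id `0` for a V1 schema or spec that carries no id, and the treatment of `-1` as "no current snapshot" are my assumptions for illustration, not something taken from an existing implementation.

```python
import copy
import json

# Assumption: id assigned to a V1 schema/spec that has no id of its own.
INITIAL_ID = 0


def normalize_metadata(raw: dict) -> dict:
    """Return a copy of V1 or V2 table metadata in one common (V2-like) shape."""
    meta = copy.deepcopy(raw)  # never mutate the caller's dict
    if meta.get("format-version") not in (1, 2):
        raise ValueError(f"unsupported format-version: {meta.get('format-version')!r}")

    # `schema` (V1) -> `schemas` + `current-schema-id`
    legacy_schema = meta.pop("schema", None)
    if not meta.get("schemas"):
        if legacy_schema is None:
            raise ValueError("metadata has neither `schema` nor `schemas`")
        legacy_schema.setdefault("schema-id", INITIAL_ID)
        meta["schemas"] = [legacy_schema]
        meta["current-schema-id"] = legacy_schema["schema-id"]

    # `partition-spec` (V1, a bare list of fields) -> `partition-specs` + `default-spec-id`
    legacy_spec = meta.pop("partition-spec", None)
    if not meta.get("partition-specs"):
        if legacy_spec is None:
            raise ValueError("metadata has neither `partition-spec` nor `partition-specs`")
        meta["partition-specs"] = [{"spec-id": INITIAL_ID, "fields": legacy_spec}]
        meta["default-spec-id"] = INITIAL_ID

    # Add a `main` ref pointing at the current snapshot, if there is one.
    refs = meta.setdefault("refs", {})
    snapshot_id = meta.get("current-snapshot-id")
    # Assumption: a missing value or -1 means the table has no snapshot yet.
    if "main" not in refs and snapshot_id is not None and snapshot_id != -1:
        refs["main"] = {"snapshot-id": snapshot_id, "type": "branch"}

    return meta


def load_metadata(path: str) -> dict:
    """Read a metadata JSON file from a local path and normalize it."""
    with open(path, encoding="utf-8") as f:
        return normalize_metadata(json.load(f))
```

After this step the rest of the reader only has to deal with `schemas`, `partition-specs` and `refs`, regardless of which version was on disk.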

There are also example manifests available from the Java repository: https://github.com/apache/iceberg/tree/master/core/src/test/resources

PS: On a related tangent, I'm also thinking of [creating a JSON Schema](https://github.com/apache/iceberg/issues/8266). Would that be helpful for Rust?
