Closed
Description
To avoid reading manifests to get stats in #675, we need to collect additional metadata when writing manifests. The minimum required information (the number of added and deleted records) is easy to gather. The main question is how to store it.
Option 1
We can store the new metadata the same way we already store the number of added/deleted/existing files, i.e. persisted on disk.
Benefits:
- We can always retrieve the additional information by reading manifests.
Drawbacks:
- Affects what we actually store on disk.
- Potentially increases the metadata size.
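To make Option 1 concrete, here is a minimal sketch of accumulating record counts next to the file counts while entries are written. All class and field names below (ManifestEntrySketch, ManifestStatsSketch, addedRecords, deletedRecords) are hypothetical and are not Iceberg's actual API; the sketch only assumes that each entry has a status and a record count.

```java
// Hypothetical sketch for Option 1; these are NOT Iceberg's real classes.
final class ManifestEntrySketch {
  enum Status { EXISTING, ADDED, DELETED }

  final Status status;
  final long recordCount;

  ManifestEntrySketch(Status status, long recordCount) {
    this.status = status;
    this.recordCount = recordCount;
  }
}

final class ManifestStatsSketch {
  // File counts, which the issue says are already stored.
  int addedFiles;
  int existingFiles;
  int deletedFiles;

  // Proposed additional counts (assumed names), stored the same way.
  long addedRecords;
  long deletedRecords;

  // Called once per entry while the manifest is being written.
  void add(ManifestEntrySketch entry) {
    switch (entry.status) {
      case ADDED:
        addedFiles++;
        addedRecords += entry.recordCount;
        break;
      case DELETED:
        deletedFiles++;
        deletedRecords += entry.recordCount;
        break;
      case EXISTING:
        existingFiles++;
        break;
    }
  }
}
```

Since the counts would be persisted together with the file counts, any reader could get them without the writer instance, at the cost of two extra values per manifest.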
Option 2
We can introduce a serializable field of type Map<String, String> on ManifestFile without storing it on disk. Only ManifestFile instances created via ManifestWriter are guaranteed to contain all needed properties.
Benefits:
- The metadata on disk doesn't change.
- The metadata size stays the same.
Drawbacks:
- We cannot retrieve the additional information by reading manifests.
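A rough sketch of what Option 2 could look like, to illustrate the trade-off. ManifestFileSketch, toDiskRecord, fromDiskRecord, and the "added-records" key are all made up for illustration and do not reflect Iceberg's real ManifestFile or its on-disk format. The point is only that the properties map survives Java serialization but is never written to the persisted record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch for Option 2; NOT Iceberg's real ManifestFile.
final class ManifestFileSketch implements Serializable {
  private static final long serialVersionUID = 1L;

  private final String path;
  private final int addedFilesCount;
  // In-memory only: included in Java serialization, skipped by toDiskRecord().
  private final HashMap<String, String> properties;

  ManifestFileSketch(String path, int addedFilesCount, Map<String, String> properties) {
    this.path = path;
    this.addedFilesCount = addedFilesCount;
    this.properties = new HashMap<>(properties);
  }

  Map<String, String> properties() {
    return Collections.unmodifiableMap(properties);
  }

  // The fields that would be persisted; the properties map is left out on purpose.
  Map<String, Object> toDiskRecord() {
    Map<String, Object> record = new LinkedHashMap<>();
    record.put("path", path);
    record.put("added_files_count", addedFilesCount);
    return record;
  }

  // Instances rebuilt from disk have no way to recover the properties.
  static ManifestFileSketch fromDiskRecord(Map<String, Object> record) {
    return new ManifestFileSketch(
        (String) record.get("path"),
        (Integer) record.get("added_files_count"),
        Collections.emptyMap());
  }

  // Helper that mimics shipping the object between processes via Java serialization.
  static ManifestFileSketch javaRoundTrip(ManifestFileSketch file) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(file);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (ManifestFileSketch) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

In this sketch, an instance produced by the writer keeps its properties when serialized and sent elsewhere, while an instance rebuilt from the on-disk record comes back with an empty map. That is exactly the drawback listed above.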