Docs: Document all metadata tables. #8709
@nk1506: I think this is hard to review. Could you please limit the scope of this PR to adding the missing metadata tables? Rearranging can be done in a follow-up PR.
docs/spark-queries.md
Outdated
| status | snapshot_id | sequence_number | file_sequence_number | data_file | readable_metrics |
|--------| -- |-----------------|----------------------| -- | -- |
This style is not the same as the existing tables.
The same comment applies to all the newly added tables.
### Entries

To show all the table's current manifest entries for both data and delete files.
I think this statement is confusing: "all the table's" together with "current"?
docs/spark-queries.md
Outdated
### Positional Delete Files

To show all positional delete files from a table:
Only from the current snapshot? If a file is not referenced by the current snapshot, it won't show up, right? I think we need to clarify this.
docs/spark-queries.md
Outdated
#### All Delete Files

To show all the table's delete files and each file's metadata:
Same comment as above for "all the table's".
Do we need "each file's metadata"?
I followed the same convention as the previous tables, such as files.
OK. I still find the existing convention confusing to read, but we can improve it in a follow-up. I'm fine with keeping it similar for now.
docs/spark-queries.md
Outdated
#### All Entries

To show all the table's manifest entries from any reachable snapshot for both data and delete files:
Suggested wording: "from any reachable snapshot" -> "from all the snapshots".
Also, "all the table's" is confusing here too.
@szehon-ho, please review and share your feedback.
ajantha-bhat left a comment
LGTM
Added missing metadata tables:
all_delete_files, all_entries, entries, position_deletes

Fixes #757
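
For context, the metadata tables named in this PR are queried in Spark by appending the metadata table name to the table identifier. The following is a minimal sketch, not taken from the PR: the table name `prod.db.table` and the helper `metadata_query` are hypothetical, and the code only builds the SQL strings. Each string could then be passed to `spark.sql(...)` on a session that has an Iceberg catalog configured.

```python
# Metadata tables that this PR adds to the documentation.
NEW_METADATA_TABLES = ["entries", "all_entries", "all_delete_files", "position_deletes"]


def metadata_query(table: str, metadata_table: str) -> str:
    """Build a Spark SQL query against an Iceberg metadata table.

    `table` is a fully qualified table identifier (hypothetical here);
    `metadata_table` is the metadata table name appended after it.
    """
    return f"SELECT * FROM {table}.{metadata_table}"


# Hypothetical table identifier, used for illustration only.
queries = [metadata_query("prod.db.table", name) for name in NEW_METADATA_TABLES]

for query in queries:
    print(query)
```

As the review comments above discuss, `entries` reflects the current state of the table, while the `all_`-prefixed tables span more than the current snapshot.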