Member

@neuyilan neuyilan commented Dec 16, 2024

Following the previous discussion, I have written proof-of-concept code. The main changes are as follows:

  1. Add the data-file.external-path option, which specifies where newly written data is stored. If the option is empty, data is still written to the warehouse path, as before.
  2. Add a dataRootLocation attribute to DataFileMeta that records where the data is located. If data-file.external-path is set, the attribute holds that value; otherwise it holds the warehouse path.
  3. Add TablePathProvider, which builds the table's storage path from either data-file.external-path or the warehouse path.
  4. Add HybridFileIO, which creates the matching FileIO according to the scheme of the path.
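
The fallback and scheme-based dispatch described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class ExternalPathSketch, its method names, the database.db/table directory layout, and the registry of FileIO names are all assumptions made for the example.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; names do not mirror the actual classes in the PR. */
public class ExternalPathSketch {

    /** Returns the root under which new data files would be written. */
    public static String resolveDataRoot(String externalPath, String warehousePath) {
        // An unset or blank data-file.external-path falls back to the warehouse path.
        if (externalPath == null || externalPath.trim().isEmpty()) {
            return warehousePath;
        }
        return externalPath;
    }

    /** Builds a table directory under the resolved root (the layout is an assumption). */
    public static String tablePath(String dataRoot, String database, String table) {
        String root = dataRoot.endsWith("/")
                ? dataRoot.substring(0, dataRoot.length() - 1)
                : dataRoot;
        return root + "/" + database + ".db/" + table;
    }

    /** Extracts the URI scheme of a path; paths without a scheme are treated as local. */
    public static String schemeOf(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? "file" : scheme;
    }

    /** Picks a FileIO implementation name from a scheme-keyed registry. */
    public static String selectFileIO(Map<String, String> registry, String path) {
        String scheme = schemeOf(path);
        String fileIO = registry.get(scheme);
        if (fileIO == null) {
            throw new IllegalArgumentException("No FileIO registered for scheme: " + scheme);
        }
        return fileIO;
    }

    public static void main(String[] args) {
        // Registry entries are placeholders, not real class names.
        Map<String, String> registry = new HashMap<>();
        registry.put("hdfs", "HypotheticalHadoopFileIO");
        registry.put("oss", "HypotheticalOssFileIO");

        String root = resolveDataRoot("oss://bucket/external", "hdfs://nn:8020/warehouse");
        String path = tablePath(root, "default", "orders");
        System.out.println(path + " -> " + selectFileIO(registry, path));
    }
}
```

Keeping the resolved root separate from the scheme lookup means a table whose files are split between the warehouse and an external location could, under these assumptions, resolve a FileIO per file rather than per table.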

For more details, please see:
https://docs.google.com/document/d/1NhmOyxM16QmY_rVb3KJtCKRrU_nogIJv532U59qW7EI/edit?tab=t.0#heading=h.ebt67e56b0hw

@neuyilan neuyilan marked this pull request as draft December 16, 2024 11:35
Contributor

JingsongLi commented Dec 20, 2024

Hi @neuyilan, can you create a PR that only adds the externalPath field to DataFileMeta? I think it is better to have this field first.

@neuyilan
Member Author

The relevant PRs are listed below:
#4751
#4761
#4766

@neuyilan neuyilan closed this Dec 25, 2024