@bobbai00 bobbai00 commented Oct 25, 2024

This PR changes how the scan source operator resolves a file.

Specifically, the scan source operator no longer depends on the userSystemEnabled flag to decide whether it is scanning a local file or a file from a dataset.

Instead, the resolution logic is:
Input: fileName (the user-friendly name the user provides when configuring a scan source operator)

  • Check whether the file that fileName points to exists locally.
    • If it exists, resolve it as a local file.
    • If not, check whether the file exists in the dataset.
      • If it exists, resolve it as a DatasetFileDocument (the file handle of the dataset file).
      • If not, throw a SourceFileNotFound error.
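
The resolution order above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from this PR: `resolve`, `LocalFile`, and the in-memory `dataset_files` mapping are hypothetical stand-ins, and the dataset lookup is assumed to be a simple name-to-content map. Only the names `DatasetFileDocument` and `SourceFileNotFound` come from the description.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Union


class SourceFileNotFound(Exception):
    """Raised when fileName matches neither a local file nor a dataset file."""


@dataclass(frozen=True)
class LocalFile:
    # Hypothetical wrapper for a file found on the local file system.
    path: Path


@dataclass(frozen=True)
class DatasetFileDocument:
    # Hypothetical stand-in for the dataset file handle named in the PR.
    name: str
    content: bytes


def resolve(
    file_name: str, dataset_files: Mapping[str, bytes]
) -> Union[LocalFile, DatasetFileDocument]:
    """Resolve file_name without consulting any userSystemEnabled-style flag."""
    # Step 1: a file that exists locally is resolved as a local file.
    local = Path(file_name)
    if local.is_file():
        return LocalFile(local)
    # Step 2: otherwise fall back to the (assumed) dataset lookup.
    if file_name in dataset_files:
        return DatasetFileDocument(file_name, dataset_files[file_name])
    # Step 3: neither lookup succeeded.
    raise SourceFileNotFound(file_name)
```

Because the local check runs first, a name that exists in both places resolves to the local file in this sketch.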

@bobbai00 bobbai00 added the "refactor" label (Refactor the code) Oct 25, 2024
@bobbai00 bobbai00 requested a review from Yicong-Huang October 25, 2024 21:57
@bobbai00 bobbai00 self-assigned this Oct 25, 2024
@bobbai00 bobbai00 force-pushed the jiadong-cut-source-user-sys-dependency branch from 976c549 to 6c5d4d7 Compare October 26, 2024 15:17
@bobbai00 bobbai00 requested a review from Yicong-Huang October 26, 2024 15:57
@bobbai00 bobbai00 changed the title from "Add file:// prefix to file that is in the dataset to distinguish with local file" to "Add FileResolver to resolve fileName given by users consistently without depending on userSystemEnabled flag" Oct 26, 2024
@bobbai00 bobbai00 merged commit 2feceee into master Oct 26, 2024
@bobbai00 bobbai00 deleted the jiadong-cut-source-user-sys-dependency branch October 26, 2024 23:46
PurelyBlank pushed a commit that referenced this pull request Dec 4, 2024
…thout depending on userSystemEnabled flag (#2969)