@bobbai00
This PR introduces the GUI for environments and fixes several issues in the existing dataset features. For the environment backend, see #2434.

With environments in place, the new way of uploading data and scanning it in a workflow is described in this blog. A demo video gives more detail.

Features

  • View environment information in the workspace
    [screenshot]

  • Add a dataset to the current environment
    [screenshot]

  • Preview a data file in a dataset of the environment
    [screenshot]

  • Scan files that are in the datasets
    [screenshot]

Implementation Details

Changes to ScanSourceOpDesc

Previously, the source file was located by its absolute path and scanned into the workflow. Now that all files live inside a dataset managed by JGit, the physical file may not be directly available. The way a source operator scans a file therefore changes in two places.

  1. In the source operator descriptor, ScanSourceOpDesc, a new member variable is added:

    @JsonIgnore
    var filePath: Option[String] = None

    // new
    @JsonIgnore
    var datasetFileDesc: Option[DatasetFileDesc] = None

The DatasetFileDesc class holds a soft link to the file in the dataset and provides utilities for reading the file as a stream or as a temporary file.
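To illustrate the shape of such a descriptor, here is a minimal sketch. It is not the class from this PR: `SketchFileDesc`, `logicalPath`, the `open` parameter, and `tempFilePath` are hypothetical names, and `open` stands in for whatever actually fetches the bytes (JGit in this PR). Only the method name `fileInputStream` comes from the description.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical stand-in for DatasetFileDesc. `open` abstracts over how the
// bytes are obtained; here it is just a function that returns a fresh stream.
final class SketchFileDesc(val logicalPath: String, open: () => InputStream) {

  // Read the file as a stream; the caller is responsible for closing it.
  def fileInputStream(): InputStream = open()

  // Copy the file into a temporary file for readers that need a real path.
  def tempFilePath(): Path = {
    val tmp = Files.createTempFile("dataset-file-", ".tmp")
    val in = open()
    try Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()
    tmp
  }
}

// Usage with in-memory bytes instead of a real dataset.
val sketchDesc = new SketchFileDesc(
  "/some-dataset/v1/data.csv",
  () => new ByteArrayInputStream("a,b\n1,2\n".getBytes("UTF-8"))
)
val sketchTmp: Path = sketchDesc.tempFilePath()
```

Keeping the byte source behind a function means an executor never needs to know whether the file exists on disk, which matches the motivation given above.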

datasetFileDesc will be initialized when setContext is called:

    if (getContext.userId.isDefined) {
      val environmentEid = WorkflowResource.getEnvironmentEidOfWorkflow(
        UInteger.valueOf(workflowContext.workflowId.id)
      )
      // if user system is defined, a datasetFileDesc will be initialized, which is the handle of reading file from the dataset
      datasetFileDesc = Some(
        getEnvironmentDatasetFilePathAndVersion(getContext.userId.get, environmentEid, fileName.get)
      )
    }
  2. For each source operator executor, e.g. CSVScanSourceOpExec, a new parameter is added to the constructor:

class CSVScanSourceOpExec private[csv] (
    filePath: String,
    datasetFileDesc: DatasetFileDesc,

If datasetFileDesc is non-null (i.e. the user system is enabled), the input stream for the reader is created with datasetFileDesc.fileInputStream:

  // this function create the input stream accordingly:
  // - if filePath is set, create the stream from the file
  // - if fileDesc is set, create the stream via JGit call
  def createInputStream(filePath: String, fileDesc: DatasetFileDesc): InputStream = {
    if (filePath != null && fileDesc != null) {
      throw new RuntimeException(
        "File Path and Dataset File Descriptor cannot present at the same time."
      )
    }
    if (filePath != null) {
      new FileInputStream(filePath)
    } else {
      // create stream from dataset file desc
      fileDesc.fileInputStream()
    }
  }
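As written, the function above throws when both arguments are set, and it would fail with a NullPointerException when neither is set, since `fileDesc.fileInputStream()` is then called on null. One possible alternative, shown only as a sketch and not as what this PR implements, is to model the two sources as a sealed trait so that neither invalid combination can be expressed. The names `ScanSource`, `LocalPath`, and `DatasetFile` are hypothetical.

```scala
import java.io.{ByteArrayInputStream, FileInputStream, InputStream}

// Hypothetical: exactly one kind of source can exist at a time.
sealed trait ScanSource
final case class LocalPath(path: String) extends ScanSource
// `open` stands in for the dataset descriptor's stream factory.
final case class DatasetFile(open: () => InputStream) extends ScanSource

object ScanSource {
  // Pattern matching on a sealed trait is checked for exhaustiveness,
  // so every kind of source must be handled here.
  def createInputStream(source: ScanSource): InputStream = source match {
    case LocalPath(path)   => new FileInputStream(path)
    case DatasetFile(open) => open()
  }
}

// Usage with in-memory bytes standing in for a dataset file.
val sketchStream: InputStream = ScanSource.createInputStream(
  DatasetFile(() => new ByteArrayInputStream("x,y\n".getBytes("UTF-8")))
)
```

The trade-off is that callers must wrap their input in one of the two cases, which is a larger change than adding a nullable constructor parameter.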

@bobbai00 bobbai00 force-pushed the jiadong-introduce-environment-feat branch from 1a7c01b to 629af18 Compare March 27, 2024 23:10
@aglinxinyuan aglinxinyuan deleted the jiadong-introduce-environment-feat branch March 28, 2024 00:23
@aglinxinyuan aglinxinyuan restored the jiadong-introduce-environment-feat branch March 28, 2024 00:23
@aglinxinyuan aglinxinyuan reopened this Mar 28, 2024
@aglinxinyuan aglinxinyuan left a comment

The environment is not visible on my local test.

@aglinxinyuan aglinxinyuan left a comment

LGTM!

@bobbai00 bobbai00 merged commit acc831f into master Mar 28, 2024
@bobbai00 bobbai00 deleted the jiadong-introduce-environment-feat branch March 28, 2024 04:36
@Yicong-Huang Yicong-Huang added the frontend label and removed the gui label Jul 6, 2025
Labels

frontend, webserver
