1 | 1.1.1 | Minimum Viable Product (MVP) for registering metadata in the repository and connecting the metadata to the data in the research computing remote storage (NESE), including Globus endpoints | 15 #13

@sync-by-unito

Description

References:

Problem Statement

Prior to this work, Dataverse was capable of storing files of up to about 1 TB in S3.

Proposed Solution

The first part was to integrate Globus into Dataverse as a large-file transfer mechanism. This was done by building on the work already done in Borealis, the Canadian Dataverse Repository (formerly the Scholar’s Portal fork of Dataverse), to allow Dataverse to integrate with Globus. This integration allows files bigger than a terabyte to be transferred from within Dataverse to an S3 store via Globus.

The second part addresses very large files: items upward of a petabyte or so are not realistic for Dataverse to store. This solution enables Dataverse to manage datasets in which one or more of the files are referenced rather than stored directly within a Dataverse repository. In this solution, the large file remains in its original location and is referenced from Dataverse.
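As a rough illustration of the two-part approach above, the following Python sketch picks a handling strategy from a file's size. The thresholds (about 1 TB and about 1 PB) come from the description; the names `Strategy` and `choose_strategy` and the exact cut-off behavior are assumptions made for illustration, not Dataverse code.

```python
from enum import Enum

TB = 10**12  # roughly the direct-to-S3 ceiling mentioned in the problem statement
PB = 10**15  # roughly where storing the file in Dataverse stops being realistic


class Strategy(Enum):
    """Hypothetical labels for the three ways a file could be handled."""
    DIRECT_S3 = "store directly in S3"
    GLOBUS_TO_S3 = "transfer to an S3 store via Globus"
    REFERENCE = "leave in place and reference from Dataverse"


def choose_strategy(size_bytes, direct_limit=TB, store_limit=PB):
    """Pick a handling strategy from the file size (assumed cut-offs)."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= direct_limit:
        return Strategy.DIRECT_S3
    if size_bytes < store_limit:
        return Strategy.GLOBUS_TO_S3
    return Strategy.REFERENCE


if __name__ == "__main__":
    for size in (5 * 10**9, 3 * TB, 2 * PB):
        print(size, "->", choose_strategy(size).value)
```

In practice the limits would depend on the storage configuration of a given installation, which is why they are parameters here rather than fixed constants.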

Acceptance Criteria

  • Discuss with Dataverse community members who have related work
  • Set up Globus environment at NESE
  • Design and implement code to call the API to interact with Globus endpoints
  • Test integration
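The "call API to interact with Globus endpoints" criterion could start from something like the sketch below, which only builds the JSON body of a transfer request and sends nothing over the network. The field names are modeled on the Globus Transfer API's transfer document as I understand it and should be checked against the Globus documentation; the function name, endpoint IDs, and paths are hypothetical placeholders, and this is not the integration code itself.

```python
import json


def build_transfer_document(submission_id, source_endpoint,
                            destination_endpoint, items, label=None):
    """Build a transfer request body (assumed Globus Transfer API shape).

    `items` is a list of (source_path, destination_path) pairs.
    `submission_id` is expected to be obtained from the service beforehand.
    """
    if not items:
        raise ValueError("at least one item is required")
    doc = {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": src,
                "destination_path": dst,
                "recursive": False,  # single files only in this sketch
            }
            for src, dst in items
        ],
    }
    if label is not None:
        doc["label"] = label
    return doc


if __name__ == "__main__":
    body = build_transfer_document(
        submission_id="SUBMISSION-ID-PLACEHOLDER",
        source_endpoint="SOURCE-ENDPOINT-ID-PLACEHOLDER",
        destination_endpoint="NESE-ENDPOINT-ID-PLACEHOLDER",
        items=[("/home/user/big.h5", "/dataverse/upload/big.h5")],
        label="Dataverse upload (example)",
    )
    print(json.dumps(body, indent=2))
```

Keeping payload construction separate from the HTTP call makes the integration easier to test without a live Globus endpoint.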

Associated Issues:

See comments below for latest update.


┆Issue is synchronized with this Smartsheet row by Unito

Labels

pm.GREI
https://docs.google.com/document/d/1RdifpHJDFqx8Y8-Dsv_VnnTgezjNHKpSyRei4cw3C-k/edit?usp=sharing
pm.GREI-d-1.1.1
NIH, yr1, aim1, task1: MVP for registering metadata in the repository
