Globus patch - no need to lock the dataset while an upload is in progress, when tasks are monitored asynchronously.#11971
…ts and/or further uploads (when the async. task management mode is enabled).
src/main/webapp/dataset.xhtml
</c:if>
<!--c:if test="#{(showSubmitForReviewLink or showReturnToAuthorLink) and showPublishLink and DatasetPage.lockedFromPublishing}" -->
<!-- f:passThroughAttribute name="class" value="btn btn-default btn-access btn-publish dropdown-toggle disabled"/ -->
<!-- /c:if -->
Is there a reason for adding this commented out code?
I'll approve it.
stevenwinship approved these changes on Nov 12, 2025
Author: Added "how to test" info and size 10.
be in use by other tasks are not removed when upload transfers are processed.
Author: I noticed that this PR was still sitting in "ready for QA" (unfortunately). But I've used the opportunity to add another fix/improvement to it. Similarly to everything else in this PR, it is already deployed in production at HDV.
📦 Pushed preview images. 🚢 See on GHCR. Use by referencing with the full name as printed above; mind the registry name.
…rocessing 2 completed Globus transfers results in adding files to the same dataset back to back. This is a real, practical condition: when the queue processing gets stuck, a number of completed tasks accumulated over some time get processed all at once, sequentially. Note that I had to abandon the lambda .forEach() notation, since I needed a non-final lastDataset variable in the loop.
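As a hypothetical illustration of the `.forEach()` point above (the `Task` record, `datasetId`, and `countBackToBack` are my own names, not the actual Dataverse code): a lambda can only capture effectively-final locals, so tracking the previously processed dataset across iterations requires a plain loop.

```java
import java.util.List;

public class TaskQueueSketch {
    // Illustrative stand-in for a queued Globus task; not the real entity.
    record Task(String datasetId) {}

    /** Counts tasks that target the same dataset as the task processed just before them. */
    static int countBackToBack(List<Task> tasks) {
        // tasks.forEach(t -> { ... lastDataset = t.datasetId(); }) would not
        // compile: lastDataset must be effectively final inside the lambda.
        String lastDataset = null;
        int backToBack = 0;
        for (Task t : tasks) {
            if (t.datasetId().equals(lastDataset)) {
                backToBack++;
            }
            lastDataset = t.datasetId(); // mutation is allowed in a plain loop
        }
        return backToBack;
    }
}
```

With tasks for datasets A, A, B in that order, `countBackToBack` returns 1: the second task hits the same dataset as the first, which is the back-to-back condition the fix has to handle.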
Author: As I mentioned earlier, putting this on hold, pending dataverse-internal coming back to life.
What this PR does / why we need it:
Globus uploads have been handled the same way as ingests: a dedicated lock is placed on the dataset for the duration, preventing any further transfers or edits. This does not appear to be necessary when the asynchronous, database queue-based task monitoring mode is enabled. Since the whole point of Globus support, at HDV at least, is handling larger (TB-scale) data, these transfers can take a long time (potentially days), and keeping the dataset locked further complicates an already cumbersome workflow.
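A minimal sketch of the conditional-locking decision described above, under my own naming (`GlobusUploadGuard`, `asyncTaskMonitoring`, `shouldLockDataset` are illustrative, not the actual Dataverse API):

```java
// Hypothetical sketch: only lock the dataset for a Globus upload when
// transfers are monitored synchronously; with the async, queue-based
// monitoring mode the lock is skipped and edits stay possible.
public class GlobusUploadGuard {
    private final boolean asyncTaskMonitoring;

    public GlobusUploadGuard(boolean asyncTaskMonitoring) {
        this.asyncTaskMonitoring = asyncTaskMonitoring;
    }

    /** True when a dataset lock is still required for an in-progress upload. */
    public boolean shouldLockDataset() {
        return !asyncTaskMonitoring;
    }
}
```

In this sketch the publish action would still be gated separately while any transfer is active; only the blanket edit/upload lock is dropped.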
Which issue(s) this PR closes:
There is no corresponding issue as of now; this started as a production patch.
Special notes for your reviewer:
I still have no idea how to go about creating meaningful tests for any Globus-related functionality. Any feedback is welcome.
Suggestions on how to test this:
This can be tested on one of the instances where Globus storage is configured: demo and dataverse-internal.
In a collection with a Globus storage volume assigned, start a long-ish running Globus upload (it will need to be at least tens of MBs; this is also definitely a PR that will be easier to test from home, since transfers are obscenely fast between NESE and the Harvard local networks). With this build, it should be possible to start another Globus transfer; the "add files" button and all the other buttons except for "Publish Dataset" should stay enabled for the duration.
A finer test is to stack multiple simultaneous transfers and confirm that a) the above is still true, and b) the Publish button stays disabled for as long as at least one transfer is still active, but becomes enabled again once the last one finishes. Similarly, the message about publishing being disabled should disappear at the end.
I suggest not actually trying to publish the dataset, simply because then you will still be able to delete the draft and have the files stored at NESE permanently erased in the process. Even though it is a tape volume dedicated to testing that demo and internal are configured to use, it is still prudent not to leave junk on it unnecessarily.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: