Skip to content

Comments

feat(upload): make upload file storage path configurable #6656#8983

Merged
kcondon merged 6 commits intoIQSS:developfrom
poikilotherm:6656-upload-dir
Dec 5, 2022
Merged

feat(upload): make upload file storage path configurable #6656#8983
kcondon merged 6 commits intoIQSS:developfrom
poikilotherm:6656-upload-dir

Conversation

@poikilotherm
Copy link
Contributor

@poikilotherm poikilotherm commented Sep 20, 2022

What this PR does / why we need it:

As outlined in #6656, files will be stored in domaindir/generated/jsp/dataverse during upload before being moved to our temporary ingest file space at $dataverse.files.directory/temp.

With this commit, we enable to configure a different place for these kind of generated temporary files by using MPCONFIG variable substitution inside of glassfish-web.xml. This is kinda hacky, but at least it allows to store the data in distinguished places.

Also sorts the content of glassfish-web.xml into order as specified by the XSD. Documentation of the setting is provided.

Which issue(s) this PR closes:

Closes #6656

Special notes for your reviewer:
Please see screenshots below. You can also try with docker-aio.
I added a few notes about temporary upload storage to the configuration guide with the gist of knowledge from #6656 so admins can learn about potential problems with large file uploads.

The container base image in #8933 certainly will include something using this if this PR would be merged before. Otherwise I'd extend this PR to include something for the container scenario.

Suggestions on how to test this:
Configure your deployment with a system property pointing dataverse.files.uploads to a different place and watch data being stored there.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Nope.

Is there a release notes update needed for this change?:
Do we need one? This might be a scenario for admins as outlined by a bunch of folks. Please leave a comment.

Additional documentation:
🔋 included.

As outlined in IQSS#6656, files will be stored in
`domaindir/generated/jsp/dataverse` during upload before being moved to
our temporary ingest file space at `$dataverse.files.directory/temp`.

With this commit, we enable to configure a different place for these
kind of generated temporary files by using MPCONFIG variable
substitution inside of glassfish-web.xml.

Also sorts the content of glassfish-web.xml into order as specified by
the XSD. Documentation of the setting is provided.
@poikilotherm poikilotherm self-assigned this Sep 20, 2022
@mreekie mreekie added the bk2211 label Nov 1, 2022
poikilotherm added a commit to poikilotherm/dataverse that referenced this pull request Nov 4, 2022
- Instead of a /docroot, add a more generic /dv which is owned by payara:payara
  and can be used to either store data in a single volume using subfolders or
  use subfolders with different backing volumes. Anyway, data is not written
  to overlay FS this way. (As long as an app image points to this location)
- Also define /secrets and /dumps as volumes, so data flowing into these
  locations is again not added to the overlay FS (which might cause severe
  damage in case of heap dumps!)
- Document the different locations in the base image guide.
- Remove the /docroot workaround for uploaded files. This will be solved at application
  level (either by moving the workaround there) or IQSS#8983
@poikilotherm
Copy link
Contributor Author

poikilotherm commented Nov 11, 2022

I tested this today in docker-aio. Here's how it went.

UI Upload - 1 GiB random data :: Dataset 10.5072/FK2/QACLPF :: Default location (./uploads)

The ./uploads directory was auto-created by Payara just like the other generated data structures.

  1. Upload in progress:
    image
  2. Upload finished, data moved to <dataverse.file.directory>/temp:
    image
  3. After hitting "Done", data is moved to final destination:
    image

UI Upload - 1 GiB random data :: Dataset 10.5072/FK2/QACLPF :: New location /uploads

Created the /uploads folder in Dockerfile at build. Added DATAVERSE_FILES_UPLOADS=/uploads as environment variable there, too.

  1. Upload in progress:
    image
  2. All other steps are the same as above, skipping these images.

Looks like this works smoothly!

@poikilotherm poikilotherm marked this pull request as ready for review November 11, 2022 10:18
@poikilotherm poikilotherm removed their assignment Nov 11, 2022
@landreev
Copy link
Contributor

This looks great, we should be able to move it along and merge it in the next sprint.
Thank you! (I do feel BAD about not having gotten around to doing this ourselves ☹️)

@pdurbin pdurbin added the Size: 30 A percentage of a sprint. 21 hours. (formerly size:33) label Nov 29, 2022
@pdurbin
Copy link
Member

pdurbin commented Nov 29, 2022

Small code change. Good docs. Gave it a size of 33.
anyone can do this.
Phil is already working this
Leonid too.

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks, @poikilotherm!! Approved (I made small doc changes). @landreev also made a positive comment: #8983 (comment)

The only testing I did was to confirm that the new unconfigured JSF upload default directory is /usr/local/payara5/glassfish/domains/domain1/uploads. I assuming configuration works fine.

Moving to QA.

@kcondon kcondon self-assigned this Dec 5, 2022
@kcondon kcondon merged commit ac2b31d into IQSS:develop Dec 5, 2022
@pdurbin pdurbin added this to the 5.13 milestone Dec 7, 2022
@mreekie mreekie mentioned this pull request Jan 4, 2023
@poikilotherm poikilotherm deleted the 6656-upload-dir branch May 21, 2023 19:15
@cmbz cmbz added the FY26 Sprint 4 FY26 Sprint 4 (2025-08-13 - 2025-08-27) label Aug 16, 2025
@cmbz cmbz added the FY26 Sprint 14 FY26 Sprint 14 (2025-12-31 - 2026-01-14) label Jan 14, 2026
@cmbz cmbz removed the FY26 Sprint 14 FY26 Sprint 14 (2025-12-31 - 2026-01-14) label Jan 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

FY26 Sprint 4 FY26 Sprint 4 (2025-08-13 - 2025-08-27) Size: 30 A percentage of a sprint. 21 hours. (formerly size:33)

Projects

Status: No status
Status: Valuation

Development

Successfully merging this pull request may close these issues.

Refactor file upload from web UI and temporary storage

6 participants