Support partially unzipped archival bags #12144
Open
qqmyers wants to merge 50 commits into IQSS:develop from GlobalDataverseCommunityConsortium:DANS-2157_holey_bags3
+1,474
−397
c9f728b add checksum URI values and methods (qqmyers)
a25e47b update version and use checksum URIs (qqmyers)
6c0cb49 handle multiline descriptions and org names (qqmyers)
7a34db8 drop blank lines in multiline values (qqmyers)
b0daad7 remove title as a folder (qqmyers)
e5457a8 handle null deaccession reason (qqmyers)
10b0556 use static to simplify testing (qqmyers)
d6cf1e2 Merge remote-tracking branch 'IQSS/develop' into OREBag1.0.2 (qqmyers)
6d24185 Sanitize/split multiline catalog entry, add Dataverse-Bag-Version (qqmyers)
c4daf28 Added unit tests for multilineWrap (janvanmansum)
e76bc91 Removed unnecessary repeat helper method (janvanmansum)
108c912 Alined test names with actual test being done (janvanmansum)
62ea9d9 Merge pull request #48 from janvanmansum/OREBag1.0.2-amend (qqmyers)
884b81b DD-2098 - allow archivalstatus calls on deaccessioned versions (qqmyers)
5e4e90a Merge remote-tracking branch 'IQSS/develop' into OREBag1.0.2 (qqmyers)
3076d69 set array properly (qqmyers)
cbdc15f Merge remote-tracking branch 'IQSS/develop' into OREBag1.0.2 (qqmyers)
1a7dafa DD-2212 - use configured checksum when no files are present (qqmyers)
7eea57c Revert "DD-2098 - allow archivalstatus calls on deaccessioned versions" (qqmyers)
2477cf9 add Source-Org as a potential multiline case, remove change to Int Id (qqmyers)
3f3908f release note (qqmyers)
aa44c08 use constants, pass labelLength to wrapping, start custom lineWrap (qqmyers)
8227edf update to handle overall 79 char length (qqmyers)
d0749fc wrap any other potentially long values (qqmyers)
24a625f cleanup deprecated code, auto-gen comments (qqmyers)
bf036f3 update comment (qqmyers)
be65611 add tests (qqmyers)
2516cf4 Merge remote-tracking branch 'IQSS/develop' into OREBag1.0.2 (qqmyers)
24d098a QDR updates to apache 5, better fault tolerance for file retrieval (qqmyers)
b4a3799 release note update (qqmyers)
85a5239 Merge branch 'develop' into OREBag1.0.2 (qqmyers)
e461415 Merge remote-tracking branch 'IQSS/develop' into OREBag1.0.2 (qqmyers)
1b42978 suppress counting file retrieval to bag as a download in gb table (qqmyers)
56de8cb Merge branch 'OREBag1.0.2' of https://github.com/GlobalDataverseCommu… (qqmyers)
3083179 Merge remote-tracking branch 'IQSS/develop' into OREBag1.0.2 (qqmyers)
49f4818 basic fetch (qqmyers)
7f5179f order by file size (qqmyers)
bc63285 only add subcollection folders (if they exist) (qqmyers)
59f3a2a replace deprecated constructs (qqmyers)
69c9a0d restore name collision check (qqmyers)
422435a add null check to quiet log/avoid exception (qqmyers)
d9cfe1d cleanup - checksum change (qqmyers)
4895f80 cleanup, suppress downloads with gbrec for fetch file (qqmyers)
62a03b2 add setting, refactor, for non-holey option (qqmyers)
637b2e3 Update to track non-zipped files, add method (qqmyers)
a6b0505 reuse stream supplier, update archivers to send oversized files (qqmyers)
5739e35 docs, release note update (qqmyers)
5c82ab8 style fix (qqmyers)
b0be6a1 Merge remote-tracking branch 'IQSS/develop' into DANS-2157_holey_bags3 (qqmyers)
6f7e0ec use constants (qqmyers)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| This release contains multiple updates to the OAI-ORE metadata export and archival Bag output: | ||
|
|
||
| OAI-ORE | ||
| - now uses URIs for checksum algorithms | ||
| - a bug causing failures with deaccessioned versions when the deaccession note ("Deaccession Reason" in the UI) was null (which has been allowed via the API) has been fixed | ||
| - the "https://schema.org/additionalType" is updated to "Dataverse OREMap Format v1.0.2" to indicate that the output has changed | ||
|
|
||
| Archival Bag | ||
| - for dataset versions with no files, the (empty) manifest-<alg>.txt file created will now use the default algorithm defined by the "FileFixityChecksumAlgorithm" setting rather than always defaulting to "md5" | ||
| - a bug causing the bag-info.txt to not have information on contacts when the dataset version has more than one contact has been fixed | ||
| - values used in the bag-info.txt file that may be multi-line (with embedded CR or LF characters) are now properly indented/formatted per the BagIt specification (i.e. Internal-Sender-Identifier, External-Description, Source-Organization, Organization-Address). | ||
| - the name of the dataset is no longer used as a subdirectory under the data directory (dataset names can be long enough to cause failures when unzipping) | ||
| - a new key, "Dataverse-Bag-Version", has been added to bag-info.txt with a value "1.0", allowing tracking of changes to Dataverse's archival bag generation | ||
| - improvements to file retrieval w.r.t. retries on errors or throttling | ||
| - retrieval of files for inclusion in the bag is no longer counted as a download by Dataverse |
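The multi-line handling described in the Archival Bag notes above can be illustrated with a minimal Python sketch. It is not the code in this PR (Dataverse is written in Java); it only assumes the BagIt convention that a long or multi-line value is continued on following lines that start with whitespace, and the 79-character line length mentioned in the commit list. The function name `format_bag_info_entry` and the single-space indent are hypothetical choices made for this example.

```python
import textwrap

MAX_LINE = 79  # overall line length targeted in this sketch (assumption)
INDENT = " "   # continuation lines start with whitespace (hypothetical choice: one space)


def format_bag_info_entry(label: str, value: str, width: int = MAX_LINE) -> str:
    """Return 'Label: value' folded into indented continuation lines."""
    # Split on CR, LF, or CRLF and drop blank lines, since an empty
    # continuation line cannot carry the required indentation.
    parts = [p.strip() for p in value.splitlines() if p.strip()]
    lines = []
    for i, part in enumerate(parts):
        # Only the first physical line carries the label.
        prefix = f"{label}: " if i == 0 else INDENT
        lines.extend(
            textwrap.wrap(
                part,
                width=width,
                initial_indent=prefix,
                subsequent_indent=INDENT,
                break_long_words=False,  # a single very long word may exceed width
                break_on_hyphens=False,
            )
        )
    return "\n".join(lines) if lines else f"{label}:"


entry = format_bag_info_entry(
    "External-Description", "First line.\r\n\r\nSecond line."
)
print(entry)
```

Every line after the first begins with a space, so a BagIt parser should read the whole entry as one value for `External-Description`.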
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| This release contains multiple updates to the OAI-ORE metadata export and archival Bag output: | ||
|
|
||
| OAI-ORE | ||
| - now uses URIs for checksum algorithms | ||
| - a bug causing failures with deaccessioned versions when the deaccession note ("Deaccession Reason" in the UI) was null (which has been allowed via the API) has been fixed | ||
| - the "https://schema.org/additionalType" is updated to "Dataverse OREMap Format v1.0.2" to indicate that the output has changed | ||
|
|
||
| Archival Bag | ||
| - for dataset versions with no files, the (empty) manifest-<alg>.txt file created will now use the default algorithm defined by the "FileFixityChecksumAlgorithm" setting rather than always defaulting to "md5" | ||
| - a bug causing the bag-info.txt to not have information on contacts when the dataset version has more than one contact has been fixed | ||
| - values used in the bag-info.txt file that may be multi-line (with embedded CR or LF characters) are now properly indented/formatted per the BagIt specification (i.e. Internal-Sender-Identifier, External-Description, Source-Organization, Organization-Address). | ||
| - the name of the dataset is no longer used as a subdirectory under the data directory (dataset names can be long enough to cause failures when unzipping) | ||
| - a new key, "Dataverse-Bag-Version", has been added to bag-info.txt with a value "1.0", allowing tracking of changes to Dataverse's archival bag generation | ||
| - improvements to file retrieval w.r.t. retries on errors or throttling | ||
| - retrieval of files for inclusion in the bag is no longer counted as a download by Dataverse | ||
| - the size of individual data files and the total dataset size that will be included in an archival bag can now be limited. Admins can choose whether files above these limits are transferred along with the zipped bag (creating a complete archival copy) or are just referenced (using the concept of a "holey" bag and just listing the oversized files and the Dataverse URLs from which they can be retrieved). In the holey bag case, an active service on the archiving platform must retrieve the oversized files (using appropriate credentials as needed) to make a complete copy. | ||
|
|
||
| ### New JVM Options (MicroProfile Config Settings) | ||
| dataverse.bagit.zip.holey | ||
| dataverse.bagit.zip.max-data-size | ||
| dataverse.bagit.zip.max-file-size | ||
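To make the size limits and the "holey" bag idea concrete, here is a minimal Python sketch of how files might be split between the zipped bag and a BagIt `fetch.txt` listing. It is an illustration under stated assumptions, not the logic in this PR: the limits are assumed to be byte counts, the ascending sort is only a guess based on the "order by file size" commit, and `DataFile`, `plan_bag`, `fetch_txt`, and the `example.org` URLs are hypothetical. The only part taken from the BagIt specification is the `fetch.txt` line layout (`url length filepath`).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class DataFile:
    path: str  # path inside the bag, e.g. "data/big.bin"
    size: int  # size in bytes
    url: str   # hypothetical URL from which the file can be retrieved


def plan_bag(
    files: Iterable[DataFile],
    max_file_size: Optional[int],
    max_data_size: Optional[int],
) -> Tuple[List[DataFile], List[DataFile]]:
    """Split files into (zipped, oversized) under the two assumed limits."""
    zipped: List[DataFile] = []
    oversized: List[DataFile] = []
    total = 0
    # Assumption: smallest files first, so as many files as possible fit.
    for f in sorted(files, key=lambda f: f.size):
        too_big = max_file_size is not None and f.size > max_file_size
        over_total = max_data_size is not None and total + f.size > max_data_size
        if too_big or over_total:
            oversized.append(f)
        else:
            zipped.append(f)
            total += f.size
    return zipped, oversized


def fetch_txt(oversized: Iterable[DataFile]) -> str:
    """Render fetch.txt lines as 'url length filepath'."""
    lines = []
    for f in oversized:
        # Percent-encode '%', CR, and LF in the file path.
        path = f.path.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")
        lines.append(f"{f.url} {f.size} {path}\n")
    return "".join(lines)


files = [
    DataFile("data/a.csv", 10, "https://example.org/a"),
    DataFile("data/b.bin", 500, "https://example.org/b"),
    DataFile("data/c.txt", 40, "https://example.org/c"),
]
zipped, oversized = plan_bag(files, max_file_size=100, max_data_size=45)
print(fetch_txt(oversized), end="")
```

In the non-holey case described in the release note, the `oversized` list would instead be transferred alongside the zip rather than written to `fetch.txt`.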