Only Archive in Sequence #12129

Open
qqmyers wants to merge 9 commits into IQSS:develop from GlobalDataverseCommunityConsortium:DANS-2097
@qqmyers qqmyers commented Jan 29, 2026

What this PR does / why we need it: For back-end stores that use something like the Oxford Common File Layout (OCFL) to deduplicate files across versions, the archival information for all prior dataset versions has to be in place before the next/latest version can be added. This PR adds a feature flag that turns on this behavior, plus code that checks for earlier versions that haven't been archived successfully and fails if it finds any. It also avoids showing a submit button for a version in the dataset page version table when there's no chance of success (because earlier versions aren't archived).
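The in-sequence check described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: the `ArchiveSequenceCheck` class, the `Version` type, and the `canArchive` method are hypothetical names, and the sketch assumes versions are held oldest first with a simple success flag standing in for the real archival status.

```java
import java.util.List;

// Hypothetical sketch; none of these names come from the Dataverse code base.
final class ArchiveSequenceCheck {

    /** Minimal stand-in for a dataset version and its archival state. */
    static final class Version {
        final String label;
        final boolean archivedSuccessfully;

        Version(String label, boolean archivedSuccessfully) {
            this.label = label;
            this.archivedSuccessfully = archivedSuccessfully;
        }
    }

    /**
     * Returns true only if every version before index {@code target}
     * (list ordered oldest first) has been archived successfully.
     * The state of the target version itself is not examined.
     */
    static boolean canArchive(List<Version> versionsOldestFirst, int target) {
        if (target < 0 || target >= versionsOldestFirst.size()) {
            throw new IllegalArgumentException("target index out of range: " + target);
        }
        for (int i = 0; i < target; i++) {
            if (!versionsOldestFirst.get(i).archivedSuccessfully) {
                return false; // an earlier version is missing from the archive
            }
        }
        return true;
    }
}
```

Under these assumptions, the same predicate could serve both purposes mentioned in the description: failing an archiving attempt early, and deciding whether a submit button is worth showing for a given row of the version table.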

Which issue(s) this PR closes:

  • Closes #
    DANS issue 2097

Special notes for your reviewer:

Suggestions on how to test this: Set the flag, then verify that the dataset page version table doesn't show a submit button for v2+ if you haven't archived v1. Verify that an archiving post-publish workflow fails for new versions if v1 hasn't succeeded. (It is probably easiest to delete the archival status of v1 via the API, but you could also just configure the workflow after publishing v1.)

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

Preview docs at https://dataverse-guide--12129.org.readthedocs.build/en/12129/installation/config.html

@qqmyers qqmyers changed the title from "Dans 2097" to "Only Archive in Sequence" Jan 29, 2026
@qqmyers qqmyers added this to the 6.10 milestone Feb 3, 2026
@qqmyers qqmyers added the "Size: 10" (A percentage of a sprint. 7 hours.) label Feb 3, 2026
@qqmyers qqmyers marked this pull request as ready for review February 3, 2026 11:08
@qqmyers qqmyers added the "GDCC: DANS" (related to GDCC work for DANS), "TDL" (of interest to the Texas Digital Library), and "GDCC: QDR" (of interest to QDR) labels Feb 3, 2026
@cmbz cmbz moved this to Ready for Review ⏩ in IQSS Dataverse Project Feb 11, 2026
@cmbz cmbz added the "FY26 Sprint 17" (2026-02-11 - 2026-02-25) label Feb 11, 2026
@pdurbin pdurbin self-assigned this Feb 24, 2026
@pdurbin pdurbin moved this from Ready for Review ⏩ to In Review 🔎 in IQSS Dataverse Project Feb 24, 2026
@pdurbin pdurbin left a comment

The code looks ok to me. I did leave one comment.

API tests are failing due to this:

  private boolean usetemp = false;

- private int numConnections = 8;
+ private static int numConnections = 2;
In the https://dataverse-guide--12129.org.readthedocs.build/en/12129/installation/config.html#bagit-export section, I note that higher values may cause more issues with throttling. I've added some exponential back-off and retries that seem sufficient to get through 10K files with 2 threads. More threads would probably need more aggressive or configurable back-off, but since the main reason to add threads is to go faster, it is probably a bad idea to add threads and then have them wait longer and retry more often. So the default is lower to handle throttling, and people should be fine to raise it if their servers aren't throttled.
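For readers unfamiliar with the pattern, retrying with exponential back-off can be sketched as below. This is a generic illustration, not the code added in this PR: the `BackoffRetry` class, its method names, and the delay parameters are all assumptions made for the example.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of retry with exponential back-off; not the PR's implementation.
final class BackoffRetry {

    /** Delay before the retry that follows attempt number {@code attempt} (0-based): base * 2^attempt, capped. */
    static long delayMs(int attempt, long baseDelayMs, long maxDelayMs) {
        // Cap the shift so the multiplication cannot overflow for reasonable base delays.
        long delay = baseDelayMs << Math.min(attempt, 20);
        return Math.min(delay, maxDelayMs);
    }

    /** Runs the task up to {@code maxAttempts} times, sleeping longer after each failure. */
    static <T> T callWithRetry(Callable<T> task, int maxAttempts,
                               long baseDelayMs, long maxDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // e.g. a throttling error from the remote store
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(delayMs(attempt, baseDelayMs, maxDelayMs));
                }
            }
        }
        throw last; // every attempt failed; surface the most recent error
    }
}
```

With a scheme like this, each extra thread adds to the load that triggers throttling, and every throttled call costs a growing wait, which is the trade-off behind preferring a low default thread count.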

@qqmyers qqmyers removed their assignment Feb 24, 2026