@qqmyers
Member

@qqmyers qqmyers commented Sep 19, 2025

What this PR does / why we need it: This PR improves scaling of requests to DataCite in two ways:

  • adds retries with a delay if/when DataCite responds with error codes that indicate requests are being throttled (429) or their server is temporarily not responding (503, 504)
  • optionally checks with DataCite to see if updates are needed before sending updates

The former is fairly straightforward: rather than failing immediately, Dataverse waits and temporarily slows its requests to see whether DataCite recovers or Dataverse can drop below the rate limit. If things recover, Dataverse's operations will succeed. If not, there could be a delay of ~1 minute before a final error occurs and the operation fails.
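
As an illustration of the retry behavior described above, here is a minimal sketch and not the code in this PR. The class name `DataCiteRetrySketch`, the method `sendWithRetry`, its parameters, and the doubling of the delay between attempts are all assumptions made for the example; only the retryable status codes (429, 503, 504) come from the description. The request is modeled as an `IntSupplier` returning an HTTP status so the sketch runs without a network.

```java
import java.util.Set;
import java.util.function.IntSupplier;

// Hypothetical sketch: retry a request while the server signals throttling or a
// temporary outage. Names, delays and the backoff policy are illustrative assumptions.
final class DataCiteRetrySketch {

    // Status codes the PR description names as worth retrying.
    static final Set<Integer> RETRYABLE = Set.of(429, 503, 504);

    /**
     * Calls the request up to maxAttempts times, sleeping between attempts,
     * and returns the last status code seen.
     */
    static int sendWithRetry(IntSupplier request, int maxAttempts, long initialDelayMs) {
        long delayMs = initialDelayMs;
        int status = request.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && RETRYABLE.contains(status); attempt++) {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up early with the last status.
                Thread.currentThread().interrupt();
                return status;
            }
            delayMs *= 2; // assumption: exponential backoff between attempts
            status = request.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) {
        // Simulated server: throttled, then unavailable, then success.
        int[] simulated = {429, 503, 201};
        int[] calls = {0};
        int status = sendWithRetry(() -> simulated[calls[0]++], 5, 10L);
        System.out.println("final status " + status + " after " + calls[0] + " calls");
    }
}
```

With a cap on attempts, the caller still sees the final error status if the server never recovers, which matches the "delay, then fail" behavior described above.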

The latter is perhaps more controversial (there was discussion several years ago about whether this is useful): instead of always sending an update, which causes DataCite to write info, this optional change has Dataverse first query DataCite (a read) and send an update only if the local info differs from what DataCite has. In cases such as file DOIs, where changes are infrequent, this results in many reads and few writes instead of many writes at DataCite (whose records grow as they track every write of new metadata), which appears to be faster. It may be generally useful, but installations not using file DOIs may not want to try it.
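
The read-before-write idea can be sketched as follows. This is an illustrative example under stated assumptions, not the PR's implementation: the `MetadataRegistry` interface, the `InMemoryRegistry` stand-in for DataCite, the `updateIfChanged` method, the `checkFirst` flag name, and the example DOI are all hypothetical, and metadata is compared as plain strings for simplicity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: query the registry first and write only when the
// local metadata differs from what the registry already holds.
final class CheckBeforeUpdateSketch {

    // Assumed minimal view of a DOI registry: one read and one write operation.
    interface MetadataRegistry {
        String read(String doi);               // null if nothing is registered
        void write(String doi, String metadata);
    }

    // In-memory stand-in that counts reads and writes for demonstration.
    static final class InMemoryRegistry implements MetadataRegistry {
        final Map<String, String> records = new HashMap<>();
        int reads = 0;
        int writes = 0;

        @Override
        public String read(String doi) {
            reads++;
            return records.get(doi);
        }

        @Override
        public void write(String doi, String metadata) {
            writes++;
            records.put(doi, metadata);
        }
    }

    /** Returns true if a write was sent, false if it was skipped as unnecessary. */
    static boolean updateIfChanged(MetadataRegistry registry, String doi,
                                   String localMetadata, boolean checkFirst) {
        if (checkFirst && Objects.equals(registry.read(doi), localMetadata)) {
            return false; // remote copy already matches; skip the write
        }
        registry.write(doi, localMetadata);
        return true;
    }

    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        String doi = "10.5072/FK2/EXAMPLE"; // made-up DOI for illustration
        updateIfChanged(registry, doi, "<title>v1</title>", true); // first publish: write
        updateIfChanged(registry, doi, "<title>v1</title>", true); // unchanged: read only
        System.out.println("reads=" + registry.reads + " writes=" + registry.writes);
    }
}
```

With `checkFirst` set to false the sketch always writes, which corresponds to the behavior without the optional flag.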

Which issue(s) this PR closes:

  • Closes #

Special notes for your reviewer: QDR had trouble publishing a dataset with >10K files before this change and succeeded after.

Suggestions on how to test this: At minimum, run regression tests with and without the flag.

Could also attempt to create/publish a dataset with file DOIs and many files using the DataCite test server and see whether the changes increase the success rate or the largest size that succeeds, and/or improve performance (i.e., with the flag on). I'm not sure this is worth it given the testing/deployment at QDR.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

@qqmyers qqmyers added the Size: 3 A percentage of a sprint. 2.1 hours. label Sep 19, 2025
@qqmyers qqmyers added this to the 6.9 milestone Sep 19, 2025
@coveralls

coveralls commented Sep 25, 2025

Coverage Status

coverage: 23.769% (-0.009%) from 23.778%
when pulling 5f51d92 on QualitativeDataRepository:QDR-DCiteScaling
into 53bff4c on IQSS:develop.

Contributor

@landreev landreev left a comment

Yes. I want this deployed at HDV ahead of 6.9 for sure.

@github-project-automation github-project-automation bot moved this from Ready for Review ⏩ to Ready for QA ⏩ in IQSS Dataverse Project Oct 29, 2025
@ofahimIQSS ofahimIQSS self-assigned this Oct 30, 2025
@ofahimIQSS ofahimIQSS moved this from Ready for QA ⏩ to QA ✅ in IQSS Dataverse Project Oct 30, 2025
@cmbz cmbz added the FY26 Sprint 9 FY26 Sprint 9 (2025-10-22 - 2025-11-05) label Nov 5, 2025
@cmbz cmbz assigned landreev and unassigned ofahimIQSS Nov 5, 2025
@cmbz cmbz added the FY26 Sprint 10 FY26 Sprint 10 (2025-11-05 - 2025-11-19) label Nov 5, 2025
@landreev
Contributor

Looks great, merging.
I did not dive into actually triggering DataCite rate limits/429s. But the code looks bulletproof.

@landreev landreev merged commit f3395cc into IQSS:develop Nov 17, 2025
15 checks passed
@github-project-automation github-project-automation bot moved this from QA ✅ to Merged 🚀 in IQSS Dataverse Project Nov 17, 2025
@scolapasta scolapasta moved this from Merged 🚀 to Done 🧹 in IQSS Dataverse Project Nov 18, 2025
Labels

FY26 Sprint 9 FY26 Sprint 9 (2025-10-22 - 2025-11-05) FY26 Sprint 10 FY26 Sprint 10 (2025-11-05 - 2025-11-19) Size: 3 A percentage of a sprint. 2.1 hours.

Projects

Status: Done 🧹

5 participants