Changes from all commits (32 commits)
709b4da
try async command for archiving
qqmyers Nov 24, 2025
6487c14
save status
qqmyers Nov 24, 2025
9d32051
refactor, use persistArchivalCopyLocation everywhere
qqmyers Jan 8, 2026
ec5046c
catch OLE when persisting archivalcopylocation
qqmyers Jan 12, 2026
c1055b8
Add obsolete state, update display, add supportsDelete
qqmyers Nov 25, 2025
f912fd0
doc that api doesn't handls supportsDelete yet
qqmyers Nov 25, 2025
00f115e
support reflective and instance calls re: delete capability
qqmyers Nov 25, 2025
bc40370
use query to update status, async everywhere
qqmyers Dec 10, 2025
df9b5ce
fixes for dataset page re: archiving
qqmyers Dec 12, 2025
a64e1f7
merge issues
qqmyers Jan 16, 2026
c55230e
merge fix of persistArchivalCopy method refactors
qqmyers Jan 21, 2026
905570a
add flag, docs
qqmyers Jan 22, 2026
521fbf6
add delete to local and S3
qqmyers Jan 22, 2026
ba04ba2
fix doc ref
qqmyers Jan 27, 2026
7a18669
remove errant : char
qqmyers Jan 27, 2026
ae91b78
no transaction time limit during bagging from command (not workflow)
qqmyers Jan 23, 2026
d2a25c3
use new transaction to start
qqmyers Jan 24, 2026
a45b76b
typo
qqmyers Jan 24, 2026
a4c583e
Use pending, use JSON
qqmyers Jan 24, 2026
305f7e3
merge fix of persistArchivalCopy method refactors
qqmyers Jan 21, 2026
d2282d9
combined release note
qqmyers Jan 28, 2026
236fca4
missed change to static
qqmyers Jan 28, 2026
216fda1
Merge remote-tracking branch 'IQSS/develop' into
qqmyers Feb 19, 2026
f3fb3db
switch to jvm setting
qqmyers Feb 19, 2026
e642ef2
fix param order per review
qqmyers Feb 19, 2026
9f5bb3f
443 fix per review
qqmyers Feb 19, 2026
47e199f
refactor per review
qqmyers Feb 19, 2026
a21f6a3
fix indent per review
qqmyers Feb 19, 2026
e0dad2c
fix javadoc per review
qqmyers Feb 19, 2026
f0282f3
remove param in doc per review
qqmyers Feb 19, 2026
0d0afe7
add spacename to datacite file
qqmyers Feb 19, 2026
12944a5
update test to match new name
qqmyers Feb 19, 2026
8 changes: 8 additions & 0 deletions doc/release-notes/12122-archiving updates.md
@@ -0,0 +1,8 @@
## Archiving Updates

This release includes multiple updates to the process of creating archival bags, including:
- performance/scaling improvements for large datasets (multiple changes)
- bug fixes related to when superusers see the "Submit" button to launch archiving from the dataset page version table
- new functionality to optionally suppress an archiving workflow when using the Update Current Version functionality and mark the current archive as out of date
- new functionality to support recreating an archival bag when Update Current Version has been used, available for archivers that can delete existing files
14 changes: 11 additions & 3 deletions doc/sphinx-guides/source/installation/config.rst
@@ -2263,6 +2263,9 @@ At present, archiving classes include the DuraCloudSubmitToArchiveCommand, Local

All current options support the :ref:`Archival Status API` calls and the same status is available in the dataset page version table (for contributors/those who could view the unpublished dataset, with more detail available to superusers).

Archival Bags are created per dataset version. By default, if a version is republished (via the superuser-only 'Update Current Version' publication option in the UI/API), a new archival bag is not created for the version.
If the archiver used is capable of deleting existing bags (the Google, S3, and File archivers), superusers can trigger a manual update of the archival bag, and, if the :ref:`dataverse.bagit.archive-on-version-update` flag is set to true, this will be done automatically when 'Update Current Version' is used.

.. _Duracloud Configuration:

Duracloud Configuration
@@ -3715,6 +3718,14 @@ The email for your institution that you'd like to appear in bag-info.txt. See :r

Can also be set via *MicroProfile Config API* sources, e.g. the environment variable ``DATAVERSE_BAGIT_SOURCEORG_EMAIL``.

.. _dataverse.bagit.archive-on-version-update:

dataverse.bagit.archive-on-version-update
+++++++++++++++++++++++++++++++++++++++++

Indicates whether archival bag creation should be triggered (if configured) when a version that was already successfully archived is updated,
i.e. via the Update-Current-Version publication option. Setting the flag to true only works if the archiver being used supports deleting existing archival bags.
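For illustration, the option could be set as a JVM option or via a MicroProfile Config source like the other ``dataverse.bagit.*`` settings; the commands below are assumptions based on that convention (the environment-variable name is derived by the standard dot-to-underscore mapping, not quoted from this PR):

```shell
# Assumed: as a Payara JVM option, following the pattern of other dataverse.bagit.* settings
./asadmin create-jvm-options '-Ddataverse.bagit.archive-on-version-update=true'

# Assumed: as a MicroProfile Config environment variable (name derived by convention)
export DATAVERSE_BAGIT_ARCHIVE_ON_VERSION_UPDATE=true
```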

.. _dataverse.files.globus-monitoring-server:

dataverse.files.globus-monitoring-server
@@ -4031,9 +4042,6 @@ dataverse.feature.only-update-datacite-when-needed

Only contact DataCite to update a DOI after checking to see if DataCite has outdated information (for efficiency, lighter load on DataCite, especially when using file DOIs).




.. _ApplicationServerSettings:

Application Server Settings
101 changes: 62 additions & 39 deletions src/main/java/edu/harvard/iq/dataverse/DatasetPage.java
@@ -39,6 +39,7 @@
import edu.harvard.iq.dataverse.engine.command.impl.UpdateDatasetVersionCommand;
import edu.harvard.iq.dataverse.export.ExportService;
import edu.harvard.iq.dataverse.util.cache.CacheFactoryBean;
import edu.harvard.iq.dataverse.util.json.JsonUtil;
import io.gdcc.spi.export.ExportException;
import io.gdcc.spi.export.Exporter;
import edu.harvard.iq.dataverse.ingest.IngestRequest;
@@ -102,6 +103,9 @@
import jakarta.faces.view.ViewScoped;
import jakarta.inject.Inject;
import jakarta.inject.Named;
import jakarta.json.Json;
import jakarta.json.JsonObject;
import jakarta.json.JsonObjectBuilder;
import jakarta.persistence.OptimisticLockException;

import org.apache.commons.lang3.StringUtils;
@@ -2990,27 +2994,38 @@ public String updateCurrentVersion() {
String className = settingsService.get(SettingsServiceBean.Key.ArchiverClassName.toString());
AbstractSubmitToArchiveCommand archiveCommand = ArchiverUtil.createSubmitToArchiveCommand(className, dvRequestService.getDataverseRequest(), updateVersion);
if (archiveCommand != null) {
// Delete the record of any existing copy since it is now out of date/incorrect
updateVersion.setArchivalCopyLocation(null);
/*
* Then try to generate and submit an archival copy. Note that running this
* command within the CuratePublishedDatasetVersionCommand was causing an error:
* "The attribute [id] of class
* [edu.harvard.iq.dataverse.DatasetFieldCompoundValue] is mapped to a primary
* key column in the database. Updates are not allowed." To avoid that, and to
* simplify reporting back to the GUI whether this optional step succeeded, I've
* pulled this out as a separate submit().
*/
try {
updateVersion = commandEngine.submit(archiveCommand);
if (!updateVersion.getArchivalCopyLocationStatus().equals(DatasetVersion.ARCHIVAL_STATUS_FAILURE)) {
successMsg = BundleUtil.getStringFromBundle("datasetversion.update.archive.success");
} else {
errorMsg = BundleUtil.getStringFromBundle("datasetversion.update.archive.failure");
// There is an archiver configured, so now decide what to do:
// If a successful copy exists, don't automatically update, just note the old copy is obsolete (and enable the superadmin button in the display to allow a manual update if desired)
// If pending or an obsolete copy exists, do nothing (nominally, if a pending run succeeds while we're updating the current version here, it should be marked as obsolete - ignoring for now since updates within the time an archiving run is pending should be rare)
// If a failure or null, rerun archiving now. If a failure is due to an existing copy in the repo, we'll fail again
String status = updateVersion.getArchivalCopyLocationStatus();
if ((status == null) || status.equals(DatasetVersion.ARCHIVAL_STATUS_FAILURE) || (JvmSettings.BAGIT_ARCHIVE_ON_VERSION_UPDATE.lookupOptional(Boolean.class).orElse(false) && archiveCommand.canDelete())) {
// Delete the record of any existing copy since it is now out of date/incorrect
JsonObjectBuilder job = Json.createObjectBuilder();
job.add(DatasetVersion.ARCHIVAL_STATUS, DatasetVersion.ARCHIVAL_STATUS_PENDING);
updateVersion.setArchivalCopyLocation(JsonUtil.prettyPrint(job.build()));
//Persist to db now
datasetVersionService.persistArchivalCopyLocation(updateVersion);
/*
* Then try to generate and submit an archival copy. Note that running this
* command within the CuratePublishedDatasetVersionCommand was causing an error:
* "The attribute [id] of class
* [edu.harvard.iq.dataverse.DatasetFieldCompoundValue] is mapped to a primary
* key column in the database. Updates are not allowed." To avoid that, and to
* simplify reporting back to the GUI whether this optional step succeeded, I've
* pulled this out as a separate submit().
*/
try {
commandEngine.submitAsync(archiveCommand);
JsfHelper.addSuccessMessage(BundleUtil.getStringFromBundle("datasetversion.archive.inprogress"));
} catch (CommandException ex) {
errorMsg = BundleUtil.getStringFromBundle("datasetversion.update.archive.failure") + " - " + ex.toString();
logger.severe(ex.getMessage());
}
} catch (CommandException ex) {
errorMsg = BundleUtil.getStringFromBundle("datasetversion.update.archive.failure") + " - " + ex.toString();
logger.severe(ex.getMessage());
} else if(status.equals(DatasetVersion.ARCHIVAL_STATUS_SUCCESS)) {
//Not automatically replacing the old archival copy as creating it is expensive
updateVersion.setArchivalStatusOnly(DatasetVersion.ARCHIVAL_STATUS_OBSOLETE);
datasetVersionService.persistArchivalCopyLocation(updateVersion);
}
}
}
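The branching above can be sketched in isolation. This is a minimal stand-alone model of the decision, not the actual `DatasetPage` code: the status strings mirror the `ARCHIVAL_STATUS_*` constants, and the class/enum names are illustrative.

```java
public class ArchivalUpdateDecision {
    // Status strings mirroring DatasetVersion's ARCHIVAL_STATUS_* constants.
    static final String SUCCESS = "success";
    static final String FAILURE = "failure";
    static final String PENDING = "pending";

    enum Action { REARCHIVE, MARK_OBSOLETE, SKIP }

    /**
     * Decide what to do with the archival copy when 'Update Current Version' runs:
     * rerun archiving on a null or failed status (or whenever the auto-update flag
     * is set and the archiver can delete), mark a successful copy obsolete, and
     * leave pending/obsolete copies alone.
     */
    static Action decide(String status, boolean archiveOnUpdate, boolean canDelete) {
        if (status == null || FAILURE.equals(status) || (archiveOnUpdate && canDelete)) {
            return Action.REARCHIVE;
        }
        if (SUCCESS.equals(status)) {
            return Action.MARK_OBSOLETE;
        }
        return Action.SKIP; // pending or obsolete
    }

    public static void main(String[] args) {
        System.out.println(decide(null, false, false));   // rerun archiving
        System.out.println(decide(SUCCESS, false, true)); // mark obsolete
        System.out.println(decide(PENDING, false, true)); // leave alone
    }
}
```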
@@ -6094,33 +6109,33 @@ public void refreshPaginator() {

/**
* This method can be called from *.xhtml files to allow archiving of a dataset
* version from the user interface. It is not currently (11/18) used in the IQSS/develop
* branch, but is used by QDR and is kept here in anticipation of including a
* GUI option to archive (already published) versions after other dataset page
* changes have been completed.
* version from the user interface.
*
* @param id - the id of the datasetversion to archive.
*/
public void archiveVersion(Long id) {
public void archiveVersion(Long id, boolean force) {
if (session.getUser() instanceof AuthenticatedUser) {
DatasetVersion dv = datasetVersionService.retrieveDatasetVersionByVersionId(id).getDatasetVersion();
String className = settingsWrapper.getValueForKey(SettingsServiceBean.Key.ArchiverClassName, null);
AbstractSubmitToArchiveCommand cmd = ArchiverUtil.createSubmitToArchiveCommand(className, dvRequestService.getDataverseRequest(), dv);
if (cmd != null) {
try {
DatasetVersion version = commandEngine.submit(cmd);
if (!version.getArchivalCopyLocationStatus().equals(DatasetVersion.ARCHIVAL_STATUS_FAILURE)) {
String status = dv.getArchivalCopyLocationStatus();
if (status == null || (force && cmd.canDelete())) {

// Set initial pending status
JsonObjectBuilder job = Json.createObjectBuilder();
job.add(DatasetVersion.ARCHIVAL_STATUS, DatasetVersion.ARCHIVAL_STATUS_PENDING);
dv.setArchivalCopyLocation(JsonUtil.prettyPrint(job.build()));
//Persist now
datasetVersionService.persistArchivalCopyLocation(dv);
commandEngine.submitAsync(cmd);

logger.info(
"DatasetVersion id=" + version.getId() + " submitted to Archive, status: " + dv.getArchivalCopyLocationStatus());
} else {
logger.severe("Error submitting version " + version.getId() + " due to conflict/error at Archive");
}
if (version.getArchivalCopyLocation() != null) {
"DatasetVersion id=" + dv.getId() + " submitted to Archive, status: " + dv.getArchivalCopyLocationStatus());
setVersionTabList(resetVersionTabList());
this.setVersionTabListForPostLoad(getVersionTabList());
JsfHelper.addSuccessMessage(BundleUtil.getStringFromBundle("datasetversion.archive.success"));
} else {
JsfHelper.addErrorMessage(BundleUtil.getStringFromBundle("datasetversion.archive.failure"));
JsfHelper.addSuccessMessage(BundleUtil.getStringFromBundle("datasetversion.archive.inprogress"));
}
} catch (CommandException ex) {
logger.log(Level.SEVERE, "Unexpected Exception calling submit archive command", ex);
@@ -6154,31 +6169,39 @@ public boolean isArchivable() {
return archivable;
}

/** Method to decide if a 'Submit' button should be enabled for archiving a dataset version. */
public boolean isVersionArchivable() {
if (versionArchivable == null) {
// If this dataset isn't in an archivable collection return false
versionArchivable = false;
if (isArchivable()) {
boolean checkForArchivalCopy = false;

// Otherwise, we need to know if the archiver is single-version-only
// If it is, we have to check for an existing archived version to answer the
// question
String className = settingsWrapper.getValueForKey(SettingsServiceBean.Key.ArchiverClassName, null);
if (className != null) {
try {
boolean checkForArchivalCopy = false;
Class<?> clazz = Class.forName(className);
Method m = clazz.getMethod("isSingleVersion", SettingsWrapper.class);
Method m2 = clazz.getMethod("supportsDelete");

Object[] params = { settingsWrapper };
boolean supportsDelete = (Boolean) m2.invoke(null);
checkForArchivalCopy = (Boolean) m.invoke(null, params);

if (checkForArchivalCopy) {
// If we have to check (single version archiving), we can't allow archiving if
// one version is already archived (or attempted - any non-null status)
versionArchivable = !isSomeVersionArchived();
} else {
// If we allow multiple versions or didn't find one that has had archiving run
// on it, we can archive, so return true
versionArchivable = true;
// Archiving per version is supported: we can archive if no run has been
// recorded (status is null), or if the archiver can delete prior runs and
// the status is neither success nor pending, so return true
String status = workingVersion.getArchivalCopyLocationStatus();
versionArchivable = (status == null) || ((!status.equals(DatasetVersion.ARCHIVAL_STATUS_SUCCESS) && (!status.equals(DatasetVersion.ARCHIVAL_STATUS_PENDING)) && supportsDelete));
}
} catch (ClassNotFoundException | IllegalAccessException | IllegalArgumentException
| InvocationTargetException | NoSuchMethodException | SecurityException e) {
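The reflective capability check above can be sketched on its own. In the real code the static methods are looked up on the configured archiver class and `isSingleVersion` takes a `SettingsWrapper`; here a hypothetical `DemoArchiver` and a plain `String` parameter stand in so the sketch is self-contained.

```java
import java.lang.reflect.Method;

public class ReflectiveCapabilityCheck {
    // Hypothetical archiver exposing the two static capability methods probed above.
    public static class DemoArchiver {
        public static boolean isSingleVersion(String settings) { return false; }
        public static boolean supportsDelete() { return true; }
    }

    /** Probe an archiver class by name, as DatasetPage does with Class.forName. */
    public static boolean[] probe(String className) throws Exception {
        Class<?> clazz = Class.forName(className);
        Method m = clazz.getMethod("isSingleVersion", String.class);
        Method m2 = clazz.getMethod("supportsDelete");
        // Static methods are invoked with a null receiver.
        boolean singleVersion = (Boolean) m.invoke(null, "dummy-settings");
        boolean supportsDelete = (Boolean) m2.invoke(null);
        return new boolean[] { singleVersion, supportsDelete };
    }

    public static void main(String[] args) throws Exception {
        boolean[] caps = probe(DemoArchiver.class.getName());
        System.out.println("singleVersion=" + caps[0] + " supportsDelete=" + caps[1]);
    }
}
```

Using `getName()` rather than a hard-coded string keeps the lookup correct regardless of package; the production code instead reads the class name from the `ArchiverClassName` setting.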
27 changes: 19 additions & 8 deletions src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java
@@ -132,6 +132,7 @@ public enum VersionState {
public static final String ARCHIVAL_STATUS_PENDING = "pending";
public static final String ARCHIVAL_STATUS_SUCCESS = "success";
public static final String ARCHIVAL_STATUS_FAILURE = "failure";
public static final String ARCHIVAL_STATUS_OBSOLETE = "obsolete";

@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@@ -231,8 +232,9 @@ public enum VersionState {
@Transient
private DatasetVersionDifference dvd;

//The Json version of the archivalCopyLocation string
@Transient
private JsonObject archivalStatus;
private JsonObject archivalCopyLocationJson;

public Long getId() {
return this.id;
@@ -383,25 +385,25 @@ public String getArchivalCopyLocation() {
public String getArchivalCopyLocationStatus() {
populateArchivalStatus(false);

if(archivalStatus!=null) {
return archivalStatus.getString(ARCHIVAL_STATUS);
if(archivalCopyLocationJson!=null) {
return archivalCopyLocationJson.getString(ARCHIVAL_STATUS);
}
return null;
}
public String getArchivalCopyLocationMessage() {
populateArchivalStatus(false);
if(archivalStatus!=null) {
return archivalStatus.getString(ARCHIVAL_STATUS_MESSAGE);
if(archivalCopyLocationJson!=null && archivalCopyLocationJson.containsKey(ARCHIVAL_STATUS_MESSAGE)) {
return archivalCopyLocationJson.getString(ARCHIVAL_STATUS_MESSAGE);
}
return null;
}

private void populateArchivalStatus(boolean force) {
if(archivalStatus ==null || force) {
if(archivalCopyLocationJson ==null || force) {
if(archivalCopyLocation!=null) {
try {
archivalStatus = JsonUtil.getJsonObject(archivalCopyLocation);
} catch(Exception e) {
archivalCopyLocationJson = JsonUtil.getJsonObject(archivalCopyLocation);
} catch (Exception e) {
logger.warning("DatasetVersion id: " + id + " has a non-JsonObject value, parsing error: " + e.getMessage());
logger.fine(archivalCopyLocation);
}
@@ -414,6 +416,15 @@ public void setArchivalCopyLocation(String location) {
populateArchivalStatus(true);
}

// Convenience method to just change the status without changing the location
public void setArchivalStatusOnly(String status) {
populateArchivalStatus(false);
JsonObjectBuilder job = Json.createObjectBuilder(archivalCopyLocationJson);
job.add(DatasetVersion.ARCHIVAL_STATUS, status);
archivalCopyLocationJson = job.build();
archivalCopyLocation = JsonUtil.prettyPrint(archivalCopyLocationJson);
}

public String getDeaccessionLink() {
return deaccessionLink;
}
@@ -28,11 +28,14 @@
import jakarta.ejb.EJB;
import jakarta.ejb.EJBException;
import jakarta.ejb.Stateless;
import jakarta.ejb.TransactionAttribute;
import jakarta.ejb.TransactionAttributeType;
import jakarta.inject.Named;
import jakarta.json.Json;
import jakarta.json.JsonObjectBuilder;
import jakarta.persistence.EntityManager;
import jakarta.persistence.NoResultException;
import jakarta.persistence.OptimisticLockException;
import jakarta.persistence.PersistenceContext;
import jakarta.persistence.Query;
import jakarta.persistence.TypedQuery;
@@ -1333,4 +1336,24 @@ public Long getDatasetVersionCount(Long datasetId, boolean canViewUnpublishedVer

return em.createQuery(cq).getSingleResult();
}


/**
* Update the archival copy location for a specific version of a dataset.
* Archiving can be long-running, and other parallel updates to the DatasetVersion have likely occurred,
* so this method re-finds the version rather than risking an
* OptimisticLockException and then having to retry in yet another transaction (since the OLE rolls this one back).
*
* @param dv
* The dataset version whose archival copy location we want to update. Must not be {@code null}.
*/
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public void persistArchivalCopyLocation(DatasetVersion dv) {
DatasetVersion currentVersion = find(dv.getId());
if (currentVersion != null) {
currentVersion.setArchivalCopyLocation(dv.getArchivalCopyLocation());
} else {
logger.log(Level.SEVERE, "Could not find DatasetVersion with id={0} to retry persisting archival copy location after OptimisticLockException.", dv.getId());
}
}
}
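The re-find pattern used by `persistArchivalCopyLocation` can be illustrated without JPA: rather than merging a possibly stale detached object (and risking an optimistic-lock conflict), re-read the current row and copy over only the one field. The map-backed "database" and class names below are illustrative stand-ins for the entity manager.

```java
import java.util.HashMap;
import java.util.Map;

public class ReFindUpdate {
    // Minimal stand-in for a versioned row: only the field we care about.
    static class VersionRow {
        long id;
        String archivalCopyLocation;
        VersionRow(long id, String loc) { this.id = id; this.archivalCopyLocation = loc; }
    }

    // Toy "database" keyed by id, playing the role of EntityManager.find().
    static final Map<Long, VersionRow> DB = new HashMap<>();

    /** Re-find the current row and copy only the archival location from the (possibly stale) input. */
    static void persistArchivalCopyLocation(VersionRow stale) {
        VersionRow current = DB.get(stale.id);
        if (current != null) {
            // Other fields of the freshly-loaded row are left untouched, so
            // concurrent edits made while archiving ran are not overwritten.
            current.archivalCopyLocation = stale.archivalCopyLocation;
        }
    }

    public static void main(String[] args) {
        DB.put(1L, new VersionRow(1L, null));
        VersionRow detached = new VersionRow(1L, "{\"status\":\"pending\"}");
        persistArchivalCopyLocation(detached);
        System.out.println(DB.get(1L).archivalCopyLocation);
    }
}
```

In the real bean, `REQUIRES_NEW` additionally ensures the write commits in its own short transaction even if the surrounding long-running archiving transaction later rolls back.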