From 2197db926848154285f761d4926ea2651dc622fd Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Mon, 6 Dec 2021 08:46:00 -0500 Subject: [PATCH 01/18] initial doc creation --- doc/release-notes/5.9-release-notes.md | 112 +++++++++++++++++++++++++ 1 file changed, 112 insertions(+) create mode 100644 doc/release-notes/5.9-release-notes.md diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md new file mode 100644 index 00000000000..b452739e161 --- /dev/null +++ b/doc/release-notes/5.9-release-notes.md @@ -0,0 +1,112 @@ +# Dataverse Software 5.8 + +This release brings new features, enhancements, and bug fixes to the Dataverse Software. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project. + +## Release Highlights + +### Support for Data Embargoes + +The Dataverse Software now supports file-level embargoes. The ability to set embargoes, up to a maximum duration (in months), can be configured by a Dataverse installation administrator. For more information, see the [Embargoes section](https://guides.dataverse.org/en/5.8/user/dataset-management.html#embargoes) of the Dataverse Software Guides. + +- Users can configure a specific embargo, defined by an end date and a short reason, on a set of selected files or an individual file, by selecting the 'Embargo' menu item and entering information in a popup dialog. Embargoes can only be set, changed, or removed before a file has been published. After publication, only Dataverse installation administrators can make changes, using an API. + +- While embargoed, files cannot be previewed or downloaded (as if restricted, with no option to allow access requests). After the embargo expires, files become accessible. If the files were also restricted, they remain inaccessible and functionality is the same as for any restricted file. 
+ +- By default, the citation date reported for the dataset and the datafiles in version 1.0 reflects the longest embargo period on any file in version 1.0, which is consistent with recommended practice from DataCite. Administrators can still specify an alternate date field to be used in the citation date via the [Set Citation Date Field Type for a Dataset API Call](https://guides.dataverse.org/en/5.8/api/native-api.html#set-citation-date-field-type-for-a-dataset). + +The work to add this functionality was initiated by Data Archiving and Networked Services (DANS-KNAW), the Netherlands. It was further developed by the Global Dataverse Community Consortium (GDCC) in cooperation with and with funding from DANS. + +## Major Use Cases and Infrastructure Enhancements + +Newly-supported major use cases in this release include: + +- Users can set file-level embargoes. (Issue #7743, #4052, #343, PR #8020) +- Improved accessibility of form labels on the advanced search page (Issue #8169, PR #8170) + +## Notes for Dataverse Installation Administrators + +### Mitigate Solr Schema Management Problems + +With [Release 5.5](https://github.com/IQSS/dataverse/releases/tag/v5.5), the `` definitions had been reincluded into `schema.xml` to fix searching for datasets. + +This release includes a final update to `schema.xml` and a new script `update-fields.sh` to manage your custom metadata fields, and to provide opportunities for other future improvements. The broken script `updateSchemaMDB.sh` has been removed. + +You will need to replace your schema.xml with the one provided in order to make sure that the new script can function. If you do not use any custom metadata blocks in your installation, this is the only change to be made. If you do use custom metadata blocks you will need to take a few extra steps, enumerated in the step-by-step instructions below. 
+ +## New JVM Options and DB Settings + +- :MaxEmbargoDurationInMonths controls whether embargoes are allowed in a Dataverse instance and can limit the maximum duration users are allowed to specify. A value of 0 months or non-existent setting indicates embargoes are not supported. A value of -1 allows embargoes of any length. + +## Complete List of Changes + +For the complete list of code changes in this release, see the [5.8 Milestone](https://github.com/IQSS/dataverse/milestone/99?closed=1) in Github. + +For help with upgrading, installing, or general questions please post to the [Dataverse Community Google Group](https://groups.google.com/forum/#!forum/dataverse-community) or email support@dataverse.org. + +## Installation + +If this is a new installation, please see our [Installation Guide](https://guides.dataverse.org/en/5.8/installation/). Please also contact us to get added to the [Dataverse Project Map](https://guides.dataverse.org/en/5.8/installation/config.html#putting-your-dataverse-installation-on-the-map-at-dataverse-org) if you have not done so already. + +## Upgrade Instructions + +0\. These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the [Dataverse Software 5 Release Notes](https://github.com/IQSS/dataverse/releases/tag/v5.0). After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.8. + +If you are running Payara as a non-root user (and you should be!), **remember not to execute the commands below as root**. Use `sudo` to change to that user first. For example, `sudo -i -u dataverse` if `dataverse` is your dedicated application user. + +In the following commands we assume that Payara 5 is installed in `/usr/local/payara5`. If not, adjust as needed. + +`export PAYARA=/usr/local/payara5` + +(or `setenv PAYARA /usr/local/payara5` if you are using a `csh`-like shell) + +1\. 
Undeploy the previous version. + +- `$PAYARA/bin/asadmin list-applications` +- `$PAYARA/bin/asadmin undeploy dataverse<-version>` + +2\. Stop Payara and remove the generated directory + +- `service payara stop` +- `rm -rf $PAYARA/glassfish/domains/domain1/generated` + +3\. Start Payara + +- `service payara start` + +4\. Deploy this version. + +- `$PAYARA/bin/asadmin deploy dataverse-5.8.war` + +5\. Restart payara + +- `service payara stop` +- `service payara start` + +6\. Update Solr schema.xml. + +`/usr/local/solr/solr-8.8.1/server/solr/collection1/conf` is used in the examples below as the location of your Solr schema. Please adapt it to the correct location, if different in your installation. Use `find / -name schema.xml` if in doubt. + +6a\. Replace `schema.xml` with the base version included in this release. + +``` + wget https://github.com/IQSS/dataverse/releases/download/v5.8/schema.xml + cp schema.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf +``` + +For installations that are not using any Custom Metadata Blocks, **you can skip the next step**. + +6b\. For installations with Custom Metadata Blocks + +Use the script provided in the release to add the custom fields to the base `schema.xml` installed in the previous step. + +``` + wget https://github.com/IQSS/dataverse/releases/download/v5.8/update-fields.sh + chmod +x update-fields.sh + curl "http://localhost:8080/api/admin/index/solr/schema" | ./update-fields.sh /usr/local/solr/solr-8.8.1/server/solr/collection1/conf/schema.xml +``` + +(Note that the curl command above calls the admin api on `localhost` to obtain the list of the custom fields. In the unlikely case that you are running the main Dataverse Application and Solr on different servers, generate the `schema.xml` on the application node, then copy it onto the Solr server) + +7\. Restart Solr + +Usually `service solr stop; service solr start`, but may be different on your system. 
See the [Installation Guide](https://guides.dataverse.org/en/5.8/installation/prerequisites.html#solr-init-script) for more details. From 69c2e3a608023e533cbcaecaf2000d9aec7cc165 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Mon, 6 Dec 2021 08:53:44 -0500 Subject: [PATCH 02/18] in with the new out with the old --- doc/release-notes/5.9-release-notes.md | 39 ++++++++------------------ 1 file changed, 12 insertions(+), 27 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index b452739e161..0549877f516 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -1,55 +1,40 @@ -# Dataverse Software 5.8 +# Dataverse Software 5.9 This release brings new features, enhancements, and bug fixes to the Dataverse Software. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project. ## Release Highlights -### Support for Data Embargoes - -The Dataverse Software now supports file-level embargoes. The ability to set embargoes, up to a maximum duration (in months), can be configured by a Dataverse installation administrator. For more information, see the [Embargoes section](https://guides.dataverse.org/en/5.8/user/dataset-management.html#embargoes) of the Dataverse Software Guides. - -- Users can configure a specific embargo, defined by an end date and a short reason, on a set of selected files or an individual file, by selecting the 'Embargo' menu item and entering information in a popup dialog. Embargoes can only be set, changed, or removed before a file has been published. After publication, only Dataverse installation administrators can make changes, using an API. - -- While embargoed, files cannot be previewed or downloaded (as if restricted, with no option to allow access requests). After the embargo expires, files become accessible. 
If the files were also restricted, they remain inaccessible and functionality is the same as for any restricted file. - -- By default, the citation date reported for the dataset and the datafiles in version 1.0 reflects the longest embargo period on any file in version 1.0, which is consistent with recommended practice from DataCite. Administrators can still specify an alternate date field to be used in the citation date via the [Set Citation Date Field Type for a Dataset API Call](https://guides.dataverse.org/en/5.8/api/native-api.html#set-citation-date-field-type-for-a-dataset). - -The work to add this functionality was initiated by Data Archiving and Networked Services (DANS-KNAW), the Netherlands. It was further developed by the Global Dataverse Community Consortium (GDCC) in cooperation with and with funding from DANS. +### XXX ## Major Use Cases and Infrastructure Enhancements Newly-supported major use cases in this release include: -- Users can set file-level embargoes. (Issue #7743, #4052, #343, PR #8020) -- Improved accessibility of form labels on the advanced search page (Issue #8169, PR #8170) +- XXX (Issue #XXX, PR #XXX) ## Notes for Dataverse Installation Administrators ### Mitigate Solr Schema Management Problems -With [Release 5.5](https://github.com/IQSS/dataverse/releases/tag/v5.5), the `` definitions had been reincluded into `schema.xml` to fix searching for datasets. - -This release includes a final update to `schema.xml` and a new script `update-fields.sh` to manage your custom metadata fields, and to provide opportunities for other future improvements. The broken script `updateSchemaMDB.sh` has been removed. - -You will need to replace your schema.xml with the one provided in order to make sure that the new script can function. If you do not use any custom metadata blocks in your installation, this is the only change to be made. 
If you do use custom metadata blocks you will need to take a few extra steps, enumerated in the step-by-step instructions below. +XXX ## New JVM Options and DB Settings -- :MaxEmbargoDurationInMonths controls whether embargoes are allowed in a Dataverse instance and can limit the maximum duration users are allowed to specify. A value of 0 months or non-existent setting indicates embargoes are not supported. A value of -1 allows embargoes of any length. +- :Settingname XXX ## Complete List of Changes -For the complete list of code changes in this release, see the [5.8 Milestone](https://github.com/IQSS/dataverse/milestone/99?closed=1) in Github. +For the complete list of code changes in this release, see the [5.9 Milestone](https://github.com/IQSS/dataverse/milestone/100?closed=1) in Github. For help with upgrading, installing, or general questions please post to the [Dataverse Community Google Group](https://groups.google.com/forum/#!forum/dataverse-community) or email support@dataverse.org. ## Installation -If this is a new installation, please see our [Installation Guide](https://guides.dataverse.org/en/5.8/installation/). Please also contact us to get added to the [Dataverse Project Map](https://guides.dataverse.org/en/5.8/installation/config.html#putting-your-dataverse-installation-on-the-map-at-dataverse-org) if you have not done so already. +If this is a new installation, please see our [Installation Guide](https://guides.dataverse.org/en/5.9/installation/). Please also contact us to get added to the [Dataverse Project Map](https://guides.dataverse.org/en/5.9/installation/config.html#putting-your-dataverse-installation-on-the-map-at-dataverse-org) if you have not done so already. ## Upgrade Instructions -0\. These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the [Dataverse Software 5 Release Notes](https://github.com/IQSS/dataverse/releases/tag/v5.0). 
After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.8. +0\. These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the [Dataverse Software 5 Release Notes](https://github.com/IQSS/dataverse/releases/tag/v5.0). After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.9. If you are running Payara as a non-root user (and you should be!), **remember not to execute the commands below as root**. Use `sudo` to change to that user first. For example, `sudo -i -u dataverse` if `dataverse` is your dedicated application user. @@ -75,7 +60,7 @@ In the following commands we assume that Payara 5 is installed in `/usr/local/pa 4\. Deploy this version. -- `$PAYARA/bin/asadmin deploy dataverse-5.8.war` +- `$PAYARA/bin/asadmin deploy dataverse-5.9.war` 5\. Restart payara @@ -89,7 +74,7 @@ In the following commands we assume that Payara 5 is installed in `/usr/local/pa 6a\. Replace `schema.xml` with the base version included in this release. ``` - wget https://github.com/IQSS/dataverse/releases/download/v5.8/schema.xml + wget https://github.com/IQSS/dataverse/releases/download/v5.9/schema.xml cp schema.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf ``` @@ -100,7 +85,7 @@ For installations that are not using any Custom Metadata Blocks, **you can skip Use the script provided in the release to add the custom fields to the base `schema.xml` installed in the previous step. 
``` - wget https://github.com/IQSS/dataverse/releases/download/v5.8/update-fields.sh + wget https://github.com/IQSS/dataverse/releases/download/v5.9/update-fields.sh chmod +x update-fields.sh curl "http://localhost:8080/api/admin/index/solr/schema" | ./update-fields.sh /usr/local/solr/solr-8.8.1/server/solr/collection1/conf/schema.xml ``` @@ -109,4 +94,4 @@ Use the script provided in the release to add the custom fields to the base `sch 7\. Restart Solr -Usually `service solr stop; service solr start`, but may be different on your system. See the [Installation Guide](https://guides.dataverse.org/en/5.8/installation/prerequisites.html#solr-init-script) for more details. +Usually `service solr stop; service solr start`, but may be different on your system. See the [Installation Guide](https://guides.dataverse.org/en/5.9/installation/prerequisites.html#solr-init-script) for more details. From eb19f05d256c480b3905b050a0817e09d0545aac Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Mon, 6 Dec 2021 09:01:40 -0500 Subject: [PATCH 03/18] adding in content from individual files --- doc/release-notes/5.9-release-notes.md | 62 ++++++++++++++++++- doc/release-notes/6937-range.md | 10 --- doc/release-notes/7804-dv-speedup.md | 7 --- ...linking-ORCID-profile-from-metadata-tab.md | 3 - doc/release-notes/8018-invalid-characters.md | 1 - .../8097-indexall-performance.md | 6 -- .../8155-external-metadata-validation.md | 7 --- doc/release-notes/8160-thumbnail-limit.md | 1 - .../8174-new-managefilepermissions.md | 3 - .../8235-auxiliaryfileAPIenhancements.md | 14 ----- doc/release-notes/8261-geojson.md | 3 - 11 files changed, 59 insertions(+), 58 deletions(-) delete mode 100644 doc/release-notes/6937-range.md delete mode 100644 doc/release-notes/7804-dv-speedup.md delete mode 100644 doc/release-notes/7978-linking-ORCID-profile-from-metadata-tab.md delete mode 100644 doc/release-notes/8018-invalid-characters.md delete mode 100644 doc/release-notes/8097-indexall-performance.md delete 
mode 100644 doc/release-notes/8155-external-metadata-validation.md delete mode 100644 doc/release-notes/8160-thumbnail-limit.md delete mode 100644 doc/release-notes/8174-new-managefilepermissions.md delete mode 100644 doc/release-notes/8235-auxiliaryfileAPIenhancements.md delete mode 100644 doc/release-notes/8261-geojson.md diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index 0549877f516..bc78f409f5f 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -4,24 +4,76 @@ This release brings new features, enhancements, and bug fixes to the Dataverse S ## Release Highlights -### XXX +### Dataverse Collection Backend Optimizations + +Optimizations to one of the most used pages in the application. + +### Support for HTTP "Range" Header for Partial File Downloads + +Dataverse now supports the HTTP "Range" header, which allows users to download parts of a file. Here are some examples: + +- `bytes=0-9` gets the first 10 bytes. +- `bytes=10-19` gets 10 bytes from the middle. +- `bytes=-10` gets the last 10 bytes. +- `bytes=9-` gets all bytes except the first 10. + +Only a single range is supported. For more information, see the [Data Access API](https://guides.dataverse.org/en/5.9/api/dataaccess.html) section of the API Guide. + +### Support for optional external metadata validation scripts + +This enables an installation administrator to provide custom scripts for additional metadata validation when datasets are being published and/or when Dataverse collections are being published or modified. Harvard Dataverse Repository has been using this mechanism to combat content that violates our Terms of Use. All the validation or verification logic is defined in these external scripts, thus making it possible for an installation to add checks custom-tailored to their needs. + +Please note that only the metadata are subject to these validation checks (not the content of any uploaded files!). 
+ +For more information, see the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guide. + +### Displaying author's identifier as link + +In the dataset page's metadata tab the author's identifier is displayed as a clickable link, which points to the profile page in the external service (ORCID, VIAF etc.), given that the identifier scheme provides a resolvable landing page. If the identifier does not match the expected scheme, a link is not shown. + +### Auxiliary File API Enhancements + +This release includes updates to the Auxiliary File API: +- Auxiliary files can now also be associated with non-tabular files +- Improved error reporting +- The API will block attempts to create a duplicate auxiliary file +- Delete and list-by-original calls have been added +- Bug fix: correct checksum recorded for aux file + +Please note that the auxiliary files feature is experimental and is designed to support integration with tools from the [OpenDP Project](https://opendp.org). If the API endpoints are not needed they can be blocked. ## Major Use Cases and Infrastructure Enhancements Newly-supported major use cases in this release include: - XXX (Issue #XXX, PR #XXX) +- DP Cases +- .geojson files are now correctly identified as GeoJSON files rather than "unknown". ## Notes for Dataverse Installation Administrators -### Mitigate Solr Schema Management Problems +### Indexing performance on datasets with large numbers of files + +We discovered that whenever a full reindexing needs to be performed, datasets with large numbers of files take exceptionally long time to index (for example, in the IQSS repository it takes several hours for a dataset that has 25,000 files). In situations where the Solr index needs to be erased and rebuilt from scratch (such as a Solr version upgrade, or a corrupt index, etc.) this can significantly delay the repopulation of the search catalog. 
+ +We are still investigating the reasons behind this performance issue. For now, even though some improvements have been made, a dataset with thousands of files is still going to take a long time to index. But we've made a simple change to the reindexing process, to index any such datasets at the very end of the batch, after all the datasets with fewer files have been reindexed. This does not improve the overall reindexing time, but will repopulate the bulk of the search index much faster for the users of the installation. + +### New ManageFilePermissions Permission + +Dataverse can now support a use case in which an Admin or Curator would like to delegate the ability to grant access to restricted files to other users. This can be implemented by creating a custom role (e.g. DownloadApprover) that has the new ManageFilePermissions permission. This release introduces the new permission (and adjusts the existing standard Admin and Curator roles so they continue to have the ability to grant file download requests). -XXX +### Thumbnail Defaults + +New defaults have been added for when to create thumbnails for images and PDFs. The default is 3 MB for images, 1 MB for PDFs. Previously, there was no default. ## New JVM Options and DB Settings - :Settingname XXX +## Notes for Developers and Integrators + +Mention new section of the Dev Guide that covers writing more efficient front-end code + ## Complete List of Changes For the complete list of code changes in this release, see the [5.9 Milestone](https://github.com/IQSS/dataverse/milestone/100?closed=1) in Github. @@ -34,6 +86,10 @@ If this is a new installation, please see our [Installation Guide](https://guide ## Upgrade Instructions +- Reindex Solr and reexport all exports after deployment because invalid characters are removed in the database by a SQL upgrade script. +- A full re-index is needed to update the facets, as well as optionally running the "redetect file type" API on existing GeoJSON files. + + 0\. 
These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the [Dataverse Software 5 Release Notes](https://github.com/IQSS/dataverse/releases/tag/v5.0). After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.9. If you are running Payara as a non-root user (and you should be!), **remember not to execute the commands below as root**. Use `sudo` to change to that user first. For example, `sudo -i -u dataverse` if `dataverse` is your dedicated application user. diff --git a/doc/release-notes/6937-range.md b/doc/release-notes/6937-range.md deleted file mode 100644 index 94a12ae704c..00000000000 --- a/doc/release-notes/6937-range.md +++ /dev/null @@ -1,10 +0,0 @@ -### Support for HTTP "Range" Header for Partial File Downloads - -Dataverse now supports the HTTP "Range" header, which allows users to download parts of a file. Here are some examples: - -- `bytes=0-9` gets the first 10 bytes. -- `bytes=10-19` gets 10 bytes from the middle. -- `bytes=-10` gets the last 10 bytes. -- `bytes=9-` gets all bytes except the first 10. - -Only a single range is supported. For more information, see the [Data Access API](https://guides.dataverse.org/en/5.9/api/dataaccess.html) section of the API Guide. diff --git a/doc/release-notes/7804-dv-speedup.md b/doc/release-notes/7804-dv-speedup.md deleted file mode 100644 index 3d096138ad8..00000000000 --- a/doc/release-notes/7804-dv-speedup.md +++ /dev/null @@ -1,7 +0,0 @@ -### Dataverse Collection Backend Optimizations - -Optimizations to one of the most used pages in the application. 
- -## Notes for Tool Developers and Integrators - -Mention new section of the Dev Guide that covers writing more efficient front-end code \ No newline at end of file diff --git a/doc/release-notes/7978-linking-ORCID-profile-from-metadata-tab.md b/doc/release-notes/7978-linking-ORCID-profile-from-metadata-tab.md deleted file mode 100644 index 62c8b308073..00000000000 --- a/doc/release-notes/7978-linking-ORCID-profile-from-metadata-tab.md +++ /dev/null @@ -1,3 +0,0 @@ -### Displaying author's identifier as link - -In the dataset page's metadata tab the author's identifier is displayed as a clickable link, which points to the profile page in the external service (ORCID, VIAF etc.), given that the identifier scheme provides a resolvable landing page. If the identifier does not match the expected scheme, a link is not shown. diff --git a/doc/release-notes/8018-invalid-characters.md b/doc/release-notes/8018-invalid-characters.md deleted file mode 100644 index 4b1d011eb02..00000000000 --- a/doc/release-notes/8018-invalid-characters.md +++ /dev/null @@ -1 +0,0 @@ -Reindex Solr and reexport all exports after deployment because invalid characters are removed in the database by a SQL upgrade script. diff --git a/doc/release-notes/8097-indexall-performance.md b/doc/release-notes/8097-indexall-performance.md deleted file mode 100644 index b027c21cbce..00000000000 --- a/doc/release-notes/8097-indexall-performance.md +++ /dev/null @@ -1,6 +0,0 @@ -### Indexing performance on datasets with large numbers of files - -We discovered that whenever a full reindexing needs to be performed, datasets with large numbers of files take exceptionally long time to index (for example, in the IQSS repository it takes several hours for a dataset that has 25,000 files). In situations where the Solr index needs to be erased and rebuilt from scratch (such as a Solr version upgrade, or a corrupt index, etc.) this can significantly delay the repopulation of the search catalog. 
- -We are still investigating the reasons behind this performance issue. For now, even though some improvements have been made, a dataset with thousands of files is still going to take a long time to index. But we've made a simple change to the reindexing process, to index any such datasets at the very end of the batch, after all the datasets with fewer files have been reindexed. This does not improve the overall reindexing time, but will repopulate the bulk of the search index much faster for the users of the installation. - diff --git a/doc/release-notes/8155-external-metadata-validation.md b/doc/release-notes/8155-external-metadata-validation.md deleted file mode 100644 index 0be23d5330c..00000000000 --- a/doc/release-notes/8155-external-metadata-validation.md +++ /dev/null @@ -1,7 +0,0 @@ -### Support for optional external metadata validation scripts - -This enables an installation administrator to provide custom scripts for additional metadata validation when datasets are being published and/or when Dataverse collections are being published or modified. Harvard Dataverse Repository has been using this mechanism to combat content that violates our Terms of Use. All the validation or verification logic is defined in these external scripts, thus making it possible for an installation to add checks custom-tailored to their needs. - -Please note that only the metadata are subject to these validation checks (not the content of any uploaded files!). - -For more information, see the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guide. diff --git a/doc/release-notes/8160-thumbnail-limit.md b/doc/release-notes/8160-thumbnail-limit.md deleted file mode 100644 index 3e1df665e78..00000000000 --- a/doc/release-notes/8160-thumbnail-limit.md +++ /dev/null @@ -1 +0,0 @@ -New defaults have been added for when to create thumbnails for images and PDFs. The default is 3 MB for images, 1 MB for PDFs. 
Previously, there was no default. diff --git a/doc/release-notes/8174-new-managefilepermissions.md b/doc/release-notes/8174-new-managefilepermissions.md deleted file mode 100644 index 81ef0821d69..00000000000 --- a/doc/release-notes/8174-new-managefilepermissions.md +++ /dev/null @@ -1,3 +0,0 @@ -## New ManageFilePermissions Permission - -Dataverse can now support a use case in which a Admin or Curator would like to delegate the ability to grant access to restricted files to other users. This can be implemented by creating a custom role (e.g. DownloadApprover) that has the new ManageFilePermissions permission. This release introduces the new permission ( and adjusts the existing standard Admin and Curator roles so they continue to have the ability to grant file download requrests). \ No newline at end of file diff --git a/doc/release-notes/8235-auxiliaryfileAPIenhancements.md b/doc/release-notes/8235-auxiliaryfileAPIenhancements.md deleted file mode 100644 index b83f10be948..00000000000 --- a/doc/release-notes/8235-auxiliaryfileAPIenhancements.md +++ /dev/null @@ -1,14 +0,0 @@ -### Auxiliary File API Enhancements - -This release includes updates to the Auxiliary File API: -- Auxiliary files can now also be associated with non-tabular files -- Improved error reporting -- The API will block attempts to create a duplicate auxiliary file -- Delete and list-by-original calls have been added -- Bug fix: correct checksum recorded for aux file - -Please note that the auxiliary files feature is experimental and is designed to support integration with tools from the [OpenDP Project](https://opendp.org). If the API endpoints are not needed they can be blocked. 
- -### Major Use Cases - -(note for release time - expand on the items above, as use cases) \ No newline at end of file diff --git a/doc/release-notes/8261-geojson.md b/doc/release-notes/8261-geojson.md deleted file mode 100644 index ad3b2380ed9..00000000000 --- a/doc/release-notes/8261-geojson.md +++ /dev/null @@ -1,3 +0,0 @@ -.geojson files are now correctly identified as GeoJSON files rather than "unknown". - -A full re-index is needed to update the facets, as well as optionally running the "redetect file type" API on existing GeoJSON files. From 005c1872ecb1d672b3d6861d6fbce5dee6831403 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Mon, 6 Dec 2021 17:29:07 -0500 Subject: [PATCH 04/18] resolving warnings --- doc/release-notes/5.9-release-notes.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index bc78f409f5f..1c7934f0338 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -23,7 +23,7 @@ Only a single range is supported. For more information, see the [Data Access API This enables an installation administrator to provide custom scripts for additional metadata validation when datasets are being published and/or when Dataverse collections are being published or modified. Harvard Dataverse Repository has been using this mechanism to combat content that violates our Terms of Use. All the validation or verification logic is defined in these external scripts, thus making it possible for an installation to add checks custom-tailored to their needs. -Please note that only the metadata are subject to these validation checks (not the content of any uploaded files!). +Please note that only the metadata are subject to these validation checks (not the content of any uploaded files!). For more information, see the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guide. 
@@ -34,6 +34,7 @@ In the dataset page's metadata tab the author's identifier is displayed as a cli ### Auxiliary File API Enhancements This release includes updates to the Auxiliary File API: + - Auxiliary files can now also be associated with non-tabular files - Improved error reporting - The API will block attempts to create a duplicate auxiliary file @@ -54,7 +55,7 @@ Newly-supported major use cases in this release include: ### Indexing performance on datasets with large numbers of files -We discovered that whenever a full reindexing needs to be performed, datasets with large numbers of files take exceptionally long time to index (for example, in the IQSS repository it takes several hours for a dataset that has 25,000 files). In situations where the Solr index needs to be erased and rebuilt from scratch (such as a Solr version upgrade, or a corrupt index, etc.) this can significantly delay the repopulation of the search catalog. +We discovered that whenever a full reindexing needs to be performed, datasets with large numbers of files take exceptionally long time to index (for example, in the IQSS repository it takes several hours for a dataset that has 25,000 files). In situations where the Solr index needs to be erased and rebuilt from scratch (such as a Solr version upgrade, or a corrupt index, etc.) this can significantly delay the repopulation of the search catalog. We are still investigating the reasons behind this performance issue. For now, even though some improvements have been made, a dataset with thousands of files is still going to take a long time to index. But we've made a simple change to the reindexing process, to index any such datasets at the very end of the batch, after all the datasets with fewer files have been reindexed. This does not improve the overall reindexing time, but will repopulate the bulk of the search index much faster for the users of the installation. 
@@ -89,7 +90,6 @@ If this is a new installation, please see our [Installation Guide](https://guide - Reindex Solr and reexport all exports after deployment because invalid characters are removed in the database by a SQL upgrade script. - A full re-index is needed to update the facets, as well as optionally running the "redetect file type" API on existing GeoJSON files. - 0\. These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the [Dataverse Software 5 Release Notes](https://github.com/IQSS/dataverse/releases/tag/v5.0). After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.9. If you are running Payara as a non-root user (and you should be!), **remember not to execute the commands below as root**. Use `sudo` to change to that user first. For example, `sudo -i -u dataverse` if `dataverse` is your dedicated application user. From eacaaa6270285ee355d7aca36b86f7b9a4aee775 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Mon, 6 Dec 2021 19:30:44 -0500 Subject: [PATCH 05/18] gathering of the use cases --- doc/release-notes/5.9-release-notes.md | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index 1c7934f0338..51c69020d30 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -47,9 +47,21 @@ Please note that the auxiliary files feature is experimental and is designed to Newly-supported major use cases in this release include: -- XXX (Issue #XXX, PR #XXX) -- DP Cases -- .geojson files are now correctly identified as GeoJSON files rather than "unknown". 
+- Dataverse Page Speedup (Issue #7804, PR #8143) +- Improved accessibility of buttons on the Dataset and File pages (Issue #8247, PR #8257) +- Spam Checker (Issue #8155, PR #8245) +- Remove invalid characters (Issue #8018, PR #8242) +- Auxiliary files can now also be associated with non-tabular files (Issue #8235, PR #8237) +- The API will block attempts to create a duplicate auxiliary file (Issue #8235, PR #8237) +- Delete Aux ((Issue #8235, PR #8237) +- List-by-origin calls have been added (Issue #8235, PR #8237) +- .geojson files are now correctly identified as GeoJSON files rather than "unknown" (Issue #8261, PR #8262) +- Additional info around role deletion in ActionLogRecord (Issue #2912, PR #8211) +- New ManageFilePermissions Permission (Issue #8109, PR #8174) +- Improve Index Speed (Issue #8097, PR #8152) +- HTTP Range (Issue #6397, PR #8087) +- Account Migration (PR #7916) + ## Notes for Dataverse Installation Administrators From 95fa1800c99b6062fd16cb9d8391bb2c3971645b Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Tue, 7 Dec 2021 09:26:54 -0500 Subject: [PATCH 06/18] updates --- doc/release-notes/5.9-release-notes.md | 49 +++++++++++++------------- 1 file changed, 25 insertions(+), 24 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index 51c69020d30..ed92726a9f3 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -4,9 +4,9 @@ This release brings new features, enhancements, and bug fixes to the Dataverse S ## Release Highlights -### Dataverse Collection Backend Optimizations +### Dataverse Collection Optimizations -Optimizations to one of the most used pages in the application. +The Dataverse Collection page, which also serves as the search page and the homepage in most Dataverse installations, has been optimized, with a specific focus on number of queries for each page load. 
The optimizations will be more noticable on Dataverse installations with higher traffic. ### Support for HTTP "Range" Header for Partial File Downloads @@ -19,27 +19,28 @@ Dataverse now supports the HTTP "Range" header, which allows users to download p Only a single range is supported. For more information, see the [Data Access API](https://guides.dataverse.org/en/5.9/api/dataaccess.html) section of the API Guide. -### Support for optional external metadata validation scripts +### Support for Optional External Metadata Validation Scripts -This enables an installation administrator to provide custom scripts for additional metadata validation when datasets are being published and/or when Dataverse collections are being published or modified. Harvard Dataverse Repository has been using this mechanism to combat content that violates our Terms of Use. All the validation or verification logic is defined in these external scripts, thus making it possible for an installation to add checks custom-tailored to their needs. +The Dataverse software now allows an installation administrator to provide custom scripts for additional metadata validation when datasets are being published and/or when Dataverse collections are being published or modified. The Harvard Dataverse Repository has been using this mechanism to combat content that violates our Terms of Use, specifically spam content. All the validation or verification logic is defined in these external scripts, thus making it possible for an installation to add checks custom-tailored to their needs. -Please note that only the metadata are subject to these validation checks (not the content of any uploaded files!). +Please note that only the metadata are subject to these validation checks. This does not check the content of any uploaded files. For more information, see the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guide. 
-### Displaying author's identifier as link +### Displaying Author's Identifier as Link -In the dataset page's metadata tab the author's identifier is displayed as a clickable link, which points to the profile page in the external service (ORCID, VIAF etc.), given that the identifier scheme provides a resolvable landing page. If the identifier does not match the expected scheme, a link is not shown. +In the dataset page's metadata tab the author's identifier is now displayed as a clickable link, which points to the profile page in the external service (ORCID, VIAF etc.) in cases where the identifier scheme provides a resolvable landing page. If the identifier does not match the expected scheme, a link is not shown. ### Auxiliary File API Enhancements -This release includes updates to the Auxiliary File API: +This release includes updates to the Auxiliary File API. These updates include: - Auxiliary files can now also be associated with non-tabular files +- Auxiliary files can now be deleted +- Duplicate Auxiliary files can no longer be created +- A new API has been added to list Auxiliary files by their origin - Improved error reporting -- The API will block attempts to create a duplicate auxiliary file -- Delete and list-by-original calls have been added -- Bug fix: correct checksum recorded for aux file +- A bugfix involving checksums for Auxiliary files Please note that the auxiliary files feature is experimental and is designed to support integration with tools from the [OpenDP Project](https://opendp.org). If the API endpoints are not needed they can be blocked. 
@@ -47,21 +48,21 @@ Please note that the auxiliary files feature is experimental and is designed to Newly-supported major use cases in this release include: -- Dataverse Page Speedup (Issue #7804, PR #8143) -- Improved accessibility of buttons on the Dataset and File pages (Issue #8247, PR #8257) -- Spam Checker (Issue #8155, PR #8245) -- Remove invalid characters (Issue #8018, PR #8242) -- Auxiliary files can now also be associated with non-tabular files (Issue #8235, PR #8237) -- The API will block attempts to create a duplicate auxiliary file (Issue #8235, PR #8237) -- Delete Aux ((Issue #8235, PR #8237) -- List-by-origin calls have been added (Issue #8235, PR #8237) -- .geojson files are now correctly identified as GeoJSON files rather than "unknown" (Issue #8261, PR #8262) -- Additional info around role deletion in ActionLogRecord (Issue #2912, PR #8211) -- New ManageFilePermissions Permission (Issue #8109, PR #8174) -- Improve Index Speed (Issue #8097, PR #8152) +- A Dataverse installation administrator can now set up metadata validation for datasets and Dataverse collections, allowing for publish-time and create-time checks for all content. (Issue #8155, PR #8245) +- The Dataverse collection page has been optimized, resulting in quicker load times on one of the most common pages in the application (Issue #7804, PR #8143) +- The indexing process has been updated so that datasets with fewer files and indexed first, resulting in fewer failures and making it easier to identify problematically-large datasets. 
(Issue #8097, PR #8152) +- Users will no longer be able to create metadata records with problematic special characters, which would later require Dataverse installation administrator intervention and a database change (Issue #8018, PR #8242) +- Users will now be able to associate Auxiliary files with non-tabular files (Issue #8235, PR #8237) +- Users will no longer be able to create duplicate Auxiliary files (Issue #8235, PR #8237) +- Users will be able to delete Auxiliary files ((Issue #8235, PR #8237) +- Users can retrieve a list of Auxiliary files based on their origin (Issue #8235, PR #8237) +- The Dataverse software will now appropriately recognize files with the .geojson extension as GeoJSON files rather than "unknown" (Issue #8261, PR #8262) +- A Dataverse installation administrator can now retrieve more information about role deletion from the ActionLogRecord (Issue #2912, PR #8211) +- Users will be able to use a new role to allow a user to respond to file download requests without also giving them the power to manage the dataset (Issue #8109, PR #8174) + - HTTP Range (Issue #6397, PR #8087) - Account Migration (PR #7916) - +- Improved accessibility of buttons on the Dataset and File pages (Issue #8247, PR #8257) ## Notes for Dataverse Installation Administrators From 2202de92e95761e68834bab2261f5cec77615987 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Tue, 7 Dec 2021 09:54:12 -0500 Subject: [PATCH 07/18] more updates --- doc/release-notes/5.9-release-notes.md | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index ed92726a9f3..fe59a272076 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -59,26 +59,25 @@ Newly-supported major use cases in this release include: - The Dataverse software will now appropriately recognize files with the .geojson extension as GeoJSON files rather than "unknown" 
(Issue #8261, PR #8262) - A Dataverse installation administrator can now retrieve more information about role deletion from the ActionLogRecord (Issue #2912, PR #8211) - Users will be able to use a new role to allow a user to respond to file download requests without also giving them the power to manage the dataset (Issue #8109, PR #8174) - -- HTTP Range (Issue #6397, PR #8087) -- Account Migration (PR #7916) +- Users will now be able to specify a certain byte range in their downloads via API, allowing for downloads of file parts. (Issue #6397, PR #8087) +- Users will no longer be forced to update their passwords when moving from Dataverse 3.x to Dataverse 4.x (PR #7916) - Improved accessibility of buttons on the Dataset and File pages (Issue #8247, PR #8257) ## Notes for Dataverse Installation Administrators -### Indexing performance on datasets with large numbers of files +### Indexing Performance on Datasets with Large Numbers of Files -We discovered that whenever a full reindexing needs to be performed, datasets with large numbers of files take exceptionally long time to index (for example, in the IQSS repository it takes several hours for a dataset that has 25,000 files). In situations where the Solr index needs to be erased and rebuilt from scratch (such as a Solr version upgrade, or a corrupt index, etc.) this can significantly delay the repopulation of the search catalog. +We discovered that whenever a full reindexing needs to be performed, datasets with large numbers of files take an exceptionally long time to index. For example, in the Harvard Dataverse Repository, it takes several hours for a dataset that has 25,000 files. In situations where the Solr index needs to be erased and rebuilt from scratch (such as a Solr version upgrade, or a corrupt index, etc.) this can significantly delay the repopulation of the search catalog. -We are still investigating the reasons behind this performance issue. 
For now, even though some improvements have been made, a dataset with thousands of files is still going to take a long time to index. But we've made a simple change to the reindexing process, to index any such datasets at the very end of the batch, after all the datasets with fewer files have been reindexed. This does not improve the overall reindexing time, but will repopulate the bulk of the search index much faster for the users of the installation. +We are still investigating the reasons behind this performance issue. For now, even though some improvements have been made, a dataset with thousands of files is still going to take a long time to index. In this release, we've made a simple change to the reindexing process, to index any such datasets at the very end of the batch, after all the datasets with fewer files have been reindexed. This does not improve the overall reindexing time, but will repopulate the bulk of the search index much faster for the users of the installation. ### New ManageFilePermissions Permission -Dataverse can now support a use case in which a Admin or Curator would like to delegate the ability to grant access to restricted files to other users. This can be implemented by creating a custom role (e.g. DownloadApprover) that has the new ManageFilePermissions permission. This release introduces the new permission ( and adjusts the existing standard Admin and Curator roles so they continue to have the ability to grant file download requrests). +Dataverse can now support a use case in which a Admin or Curator would like to delegate the ability to grant access to restricted files to other users. This can be implemented by creating a custom role (e.g. DownloadApprover) that has the new ManageFilePermissions permission. This release introduces the new permission, and a Flyway script adjusts the existing Admin and Curator roles so they continue to have the ability to grant file download requrests. 
### Thumbnail Defaults -New defaults have been added for when to create thumbnails for images and PDFs. The default is 3 MB for images, 1 MB for PDFs. Previously, there was no default. +New defaults have been added for when to create thumbnails for images and PDFs. The default is 3 MB for images, 1 MB for PDFs. Previously, there was no default and the thumbnail creation process for large files had the potential to cause system stability issues. ## New JVM Options and DB Settings From e68951b1ed0693aa8fdcb31be86e648b068fdf24 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Tue, 7 Dec 2021 10:14:38 -0500 Subject: [PATCH 08/18] settings and dev guide updates. --- doc/release-notes/5.9-release-notes.md | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index fe59a272076..d4e0ff65b06 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -81,11 +81,23 @@ New defaults have been added for when to create thumbnails for images and PDFs. ## New JVM Options and DB Settings -- :Settingname XXX +The following DB settings allow configuration of the external metadata validator: + +- :DataverseMetadataValidatorScript +- :DataverseMetadataPublishValidationFailureMsg +- :DataverseMetadataUpdateValidationFailureMsg +- :DatasetMetadataValidatorScript +- :DatasetMetadataValidationFailureMsg +- :ExternalValidationAdminOverride + +See the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guides for more information. 
## Notes for Developers and Integrators -Mention new section of the Dev Guide that covers writing more efficient front-end code +Two sections of the Developer Guide have been updated: + +- Instructions on how to sync a PR in progress with develop have been added in the version control section +- Guidance on avoiding ineffeciencies in JSF render logic has been added to the "Tips" section ## Complete List of Changes From d110dcb20b272506de345774c39abf247f6bd518 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Tue, 7 Dec 2021 10:26:08 -0500 Subject: [PATCH 09/18] updates to step by step instructions before code review --- doc/release-notes/5.9-release-notes.md | 32 +++++--------------------- 1 file changed, 6 insertions(+), 26 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index d4e0ff65b06..1eb4ea71dd7 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -111,9 +111,6 @@ If this is a new installation, please see our [Installation Guide](https://guide ## Upgrade Instructions -- Reindex Solr and reexport all exports after deployment because invalid characters are removed in the database by a SQL upgrade script. -- A full re-index is needed to update the facets, as well as optionally running the "redetect file type" API on existing GeoJSON files. - 0\. These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the [Dataverse Software 5 Release Notes](https://github.com/IQSS/dataverse/releases/tag/v5.0). After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.9. If you are running Payara as a non-root user (and you should be!), **remember not to execute the commands below as root**. Use `sudo` to change to that user first. 
For example, `sudo -i -u dataverse` if `dataverse` is your dedicated application user. @@ -147,31 +144,14 @@ In the following commands we assume that Payara 5 is installed in `/usr/local/pa - `service payara stop` - `service payara start` -6\. Update Solr schema.xml. - -`/usr/local/solr/solr-8.8.1/server/solr/collection1/conf` is used in the examples below as the location of your Solr schema. Please adapt it to the correct location, if different in your installation. Use `find / -name schema.xml` if in doubt. - -6a\. Replace `schema.xml` with the base version included in this release. - -``` - wget https://github.com/IQSS/dataverse/releases/download/v5.9/schema.xml - cp schema.xml /usr/local/solr/solr-8.8.1/server/solr/collection1/conf -``` - -For installations that are not using any Custom Metadata Blocks, **you can skip the next step**. - -6b\. For installations with Custom Metadata Blocks +6\. Kick off full reindex -Use the script provided in the release to add the custom fields to the base `schema.xml` installed in the previous step. +Following the directions in the [Guides](http://guides.dataverse.org/en/5.9/admin/solr-search-index.html) -``` - wget https://github.com/IQSS/dataverse/releases/download/v5.9/update-fields.sh - chmod +x update-fields.sh - curl "http://localhost:8080/api/admin/index/solr/schema" | ./update-fields.sh /usr/local/solr/solr-8.8.1/server/solr/collection1/conf/schema.xml -``` +7\. Run ReExportall to update JSON Exports -(Note that the curl command above calls the admin api on `localhost` to obtain the list of the custom fields. In the unlikely case that you are running the main Dataverse Application and Solr on different servers, generate the `schema.xml` on the application node, then copy it onto the Solr server) +Following the directions in the [Guides] (http://guides.dataverse.org/en/5.9/admin/metadataexport.html?highlight=export#batch-exports-through-the-api) -7\. 
Restart Solr +## Additional Release Steps -Usually `service solr stop; service solr start`, but may be different on your system. See the [Installation Guide](https://guides.dataverse.org/en/5.9/installation/prerequisites.html#solr-init-script) for more details. +1\. Redetect GeoJSON files to update the type from "Unknown" to GeoJSON, following the directions in the [Guides] (https://guides.dataverse.org/en/5.9/api/native-api.html#redetect-file-type) \ No newline at end of file From b3eb322bee7e10d2852f558b5ea0a5ea9619a439 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Tue, 7 Dec 2021 11:04:41 -0500 Subject: [PATCH 10/18] feedback from @donsizemore --- doc/release-notes/5.9-release-notes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index 1eb4ea71dd7..fc86e586610 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -6,7 +6,7 @@ This release brings new features, enhancements, and bug fixes to the Dataverse S ### Dataverse Collection Optimizations -The Dataverse Collection page, which also serves as the search page and the homepage in most Dataverse installations, has been optimized, with a specific focus on number of queries for each page load. The optimizations will be more noticable on Dataverse installations with higher traffic. +The Dataverse Collection page, which also serves as the search page and the homepage in most Dataverse installations, has been optimized, with a specific focus on reducing the number of queries for each page load. These optimizations will be more noticable on Dataverse installations with higher traffic. 
### Support for HTTP "Range" Header for Partial File Downloads From dbb7423c889bd74ee4a172a188237238f9978bcb Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Tue, 7 Dec 2021 14:44:48 -0500 Subject: [PATCH 11/18] updates from code review thanks @pdurbin --- doc/release-notes/5.9-release-notes.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index fc86e586610..4c97b9e3763 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -25,7 +25,7 @@ The Dataverse software now allows an installation administrator to provide custo Please note that only the metadata are subject to these validation checks. This does not check the content of any uploaded files. -For more information, see the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guide. +For more information, see the [Database Settings](https://guides.dataverse.org/en/5.9/installation/config.html) section of the Guide. The new settings are listed below, in the "New JVM Options and DB Settings" section of these release notes. ### Displaying Author's Identifier as Link @@ -48,18 +48,19 @@ Please note that the auxiliary files feature is experimental and is designed to Newly-supported major use cases in this release include: -- A Dataverse installation administrator can now set up metadata validation for datasets and Dataverse collections, allowing for publish-time and create-time checks for all content. (Issue #8155, PR #8245) - The Dataverse collection page has been optimized, resulting in quicker load times on one of the most common pages in the application (Issue #7804, PR #8143) -- The indexing process has been updated so that datasets with fewer files and indexed first, resulting in fewer failures and making it easier to identify problematically-large datasets. 
(Issue #8097, PR #8152) -- Users will no longer be able to create metadata records with problematic special characters, which would later require Dataverse installation administrator intervention and a database change (Issue #8018, PR #8242) +- Users will now be able to specify a certain byte range in their downloads via API, allowing for downloads of file parts. (Issue #6397, PR #8087) +- A Dataverse installation administrator can now set up metadata validation for datasets and Dataverse collections, allowing for publish-time and create-time checks for all content. (Issue #8155, PR #8245) +- Users will be provided with clickable links to authors' ORCIDs and other IDs in the dataset metadata (Issue #7978, PR #7979) - Users will now be able to associate Auxiliary files with non-tabular files (Issue #8235, PR #8237) - Users will no longer be able to create duplicate Auxiliary files (Issue #8235, PR #8237) - Users will be able to delete Auxiliary files ((Issue #8235, PR #8237) - Users can retrieve a list of Auxiliary files based on their origin (Issue #8235, PR #8237) +- The indexing process has been updated so that datasets with fewer files and indexed first, resulting in fewer failures and making it easier to identify problematically-large datasets. 
(Issue #8097, PR #8152) +- Users will no longer be able to create metadata records with problematic special characters, which would later require Dataverse installation administrator intervention and a database change (Issue #8018, PR #8242) - The Dataverse software will now appropriately recognize files with the .geojson extension as GeoJSON files rather than "unknown" (Issue #8261, PR #8262) - A Dataverse installation administrator can now retrieve more information about role deletion from the ActionLogRecord (Issue #2912, PR #8211) - Users will be able to use a new role to allow a user to respond to file download requests without also giving them the power to manage the dataset (Issue #8109, PR #8174) -- Users will now be able to specify a certain byte range in their downloads via API, allowing for downloads of file parts. (Issue #6397, PR #8087) - Users will no longer be forced to update their passwords when moving from Dataverse 3.x to Dataverse 4.x (PR #7916) - Improved accessibility of buttons on the Dataset and File pages (Issue #8247, PR #8257) @@ -150,8 +151,8 @@ Following the directions in the [Guides](http://guides.dataverse.org/en/5.9/admi 7\. Run ReExportall to update JSON Exports -Following the directions in the [Guides] (http://guides.dataverse.org/en/5.9/admin/metadataexport.html?highlight=export#batch-exports-through-the-api) +Following the directions in the [Guides](http://guides.dataverse.org/en/5.9/admin/metadataexport.html?highlight=export#batch-exports-through-the-api) ## Additional Release Steps -1\. Redetect GeoJSON files to update the type from "Unknown" to GeoJSON, following the directions in the [Guides] (https://guides.dataverse.org/en/5.9/api/native-api.html#redetect-file-type) \ No newline at end of file +1\. 
Redetect GeoJSON files to update the type from "Unknown" to GeoJSON, following the directions in the [Guides](https://guides.dataverse.org/en/5.9/api/native-api.html#redetect-file-type) From 452b7e6591ed6f039a49bf05aa4b489506395c54 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Tue, 7 Dec 2021 15:17:03 -0500 Subject: [PATCH 12/18] adding analytics fix per @qqmyers --- doc/release-notes/5.9-release-notes.md | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index 4c97b9e3763..f841e56dad1 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -72,6 +72,12 @@ We discovered that whenever a full reindexing needs to be performed, datasets wi We are still investigating the reasons behind this performance issue. For now, even though some improvements have been made, a dataset with thousands of files is still going to take a long time to index. In this release, we've made a simple change to the reindexing process, to index any such datasets at the very end of the batch, after all the datasets with fewer files have been reindexed. This does not improve the overall reindexing time, but will repopulate the bulk of the search index much faster for the users of the installation. +### Custom Analytics Code Changes + +You should update your custom analytics code to capture a bug fix related to tracking within the dataset files table. This release restores that tracking. + +For more information, see the documentation and sample analytics code snippet provided in [Installation Guide](http://guides.dataverse.org/en/5.9/installation/config.html#web-analytics-code). This update can be used on any version 5.4+. + ### New ManageFilePermissions Permission Dataverse can now support a use case in which a Admin or Curator would like to delegate the ability to grant access to restricted files to other users. 
This can be implemented by creating a custom role (e.g. DownloadApprover) that has the new ManageFilePermissions permission. This release introduces the new permission, and a Flyway script adjusts the existing Admin and Curator roles so they continue to have the ability to grant file download requrests. @@ -147,12 +153,14 @@ In the following commands we assume that Payara 5 is installed in `/usr/local/pa 6\. Kick off full reindex -Following the directions in the [Guides](http://guides.dataverse.org/en/5.9/admin/solr-search-index.html) +Following the directions in the [Admin Guide](http://guides.dataverse.org/en/5.9/admin/solr-search-index.html) 7\. Run ReExportall to update JSON Exports -Following the directions in the [Guides](http://guides.dataverse.org/en/5.9/admin/metadataexport.html?highlight=export#batch-exports-through-the-api) +Following the directions in the [Admin Guide](http://guides.dataverse.org/en/5.9/admin/metadataexport.html?highlight=export#batch-exports-through-the-api) ## Additional Release Steps -1\. Redetect GeoJSON files to update the type from "Unknown" to GeoJSON, following the directions in the [Guides](https://guides.dataverse.org/en/5.9/api/native-api.html#redetect-file-type) +1\. Redetect GeoJSON files to update the type from "Unknown" to GeoJSON, following the directions in the [API Guide](https://guides.dataverse.org/en/5.9/api/native-api.html#redetect-file-type) + +2\. Update custom analytics code per the [Installation Guide](http://guides.dataverse.org/en/5.9/installation/config.html#web-analytics-code). 
From 5aede5e99d1525b203dbb1872b3817855b205849 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Tue, 7 Dec 2021 18:21:54 -0500 Subject: [PATCH 13/18] adding in content for last PR merged (8282) --- doc/release-notes/5.9-release-notes.md | 6 ++++-- doc/release-notes/8241-mime.md | 1 - 2 files changed, 4 insertions(+), 3 deletions(-) delete mode 100644 doc/release-notes/8241-mime.md diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index f841e56dad1..a3f3f1b045e 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -39,10 +39,11 @@ This release includes updates to the Auxiliary File API. These updates include: - Auxiliary files can now be deleted - Duplicate Auxiliary files can no longer be created - A new API has been added to list Auxiliary files by their origin +- Some Auxiliary files were being saved with the wrong content type (MIME type), but now the user can supply the content type on upload, overriding the type that would otherwise be assigned - Improved error reporting - A bugfix involving checksums for Auxiliary files -Please note that the auxiliary files feature is experimental and is designed to support integration with tools from the [OpenDP Project](https://opendp.org). If the API endpoints are not needed they can be blocked. +Please note that the Auxiliary files feature is experimental and is designed to support integration with tools from the [OpenDP Project](https://opendp.org). If the API endpoints are not needed they can be blocked. 
## Major Use Cases and Infrastructure Enhancements @@ -54,8 +55,9 @@ Newly-supported major use cases in this release include: - Users will be provided with clickable links to authors' ORCIDs and other IDs in the dataset metadata (Issue #7978, PR #7979) - Users will now be able to associate Auxiliary files with non-tabular files (Issue #8235, PR #8237) - Users will no longer be able to create duplicate Auxiliary files (Issue #8235, PR #8237) -- Users will be able to delete Auxiliary files ((Issue #8235, PR #8237) +- Users will be able to delete Auxiliary files (Issue #8235, PR #8237) - Users can retrieve a list of Auxiliary files based on their origin (Issue #8235, PR #8237) +- Users will be able to supply the content type of Auxiliary files on upload (Issue #8241, PR #8282) - The indexing process has been updated so that datasets with fewer files are indexed first, resulting in fewer failures and making it easier to identify problematically-large datasets. (Issue #8097, PR #8152) - Users will no longer be able to create metadata records with problematic special characters, which would later require Dataverse installation administrator intervention and a database change (Issue #8018, PR #8242) - The Dataverse software will now appropriately recognize files with the .geojson extension as GeoJSON files rather than "unknown" (Issue #8261, PR #8262) diff --git a/doc/release-notes/8241-mime.md b/doc/release-notes/8241-mime.md deleted file mode 100644 index 9fcd90a38f5..00000000000 --- a/doc/release-notes/8241-mime.md +++ /dev/null @@ -1 +0,0 @@ -Some auxiliary were being saved with the wrong content type (MIME type) but now the user can supply the content type on upload, overriding the type that would otherwise be assigned. 
From d65296c58b863ae2d1c66282700bba9e1e71f323 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Wed, 8 Dec 2021 17:02:42 -0500 Subject: [PATCH 14/18] review feedback --- doc/release-notes/5.9-release-notes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index a3f3f1b045e..930aedee482 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -86,7 +86,7 @@ Dataverse can now support a use case in which a Admin or Curator would like to d ### Thumbnail Defaults -New defaults have been added for when to create thumbnails for images and PDFs. The default is 3 MB for images, 1 MB for PDFs. Previously, there was no default and the thumbnail creation process for large files had the potential to cause system stability issues. +New defaults have been added for when to create thumbnails for images and PDFs. The default is 3 MB for images, 1 MB for PDFs. Previously, there was no default and the thumbnail creation process for large files had the potential to cause system stability issues. This change will not affect your installation if you already have your own limits specified (via JVM options in domain.xml). ## New JVM Options and DB Settings From b498b119f401076f140bb793a58b0b0090998443 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Wed, 8 Dec 2021 17:26:44 -0500 Subject: [PATCH 15/18] updates from review --- doc/release-notes/5.9-release-notes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index 930aedee482..ac9b37d6efa 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -86,7 +86,7 @@ Dataverse can now support a use case in which a Admin or Curator would like to d ### Thumbnail Defaults -New defaults have been added for when to create thumbnails for images and PDFs. 
The default is 3 MB for images, 1 MB for PDFs. Previously, there was no default and the thumbnail creation process for large files had the potential to cause system stability issues. This change will not affect your installation if you already have your own limits specified (via JVM options in domain.xml). +New *default* values have been added for the JVM settings `dataverse.dataAccess.thumbnail.image.limit` and `dataverse.dataAccess.pdf.image.limit`, of 3MB and 1MB respectively. This means that, *unless specified otherwise* by the JVM settings already in your domain configuration, the application will skip attempting to generate thumbnails for image files and PDFs that are above these size limits. ## New JVM Options and DB Settings From 4d0b0f689afec83371cde138bb806bb6114dc69c Mon Sep 17 00:00:00 2001 From: landreev Date: Wed, 8 Dec 2021 17:36:22 -0500 Subject: [PATCH 16/18] Update 5.9-release-notes.md --- doc/release-notes/5.9-release-notes.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index ac9b37d6efa..31978bf8e2c 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -86,7 +86,8 @@ Dataverse can now support a use case in which a Admin or Curator would like to d ### Thumbnail Defaults -New *default* values have been added for the JVM settings `dataverse.dataAccess.thumbnail.image.limit` and `dataverse.dataAccess.pdf.image.limit`, of 3MB and 1MB respectively. This means that, *unless specified otherwise* by the JVM settings already in your domain configuration, the application will skip attempting to generate thumbnails for image files and PDFs that are above these size limits. +New *default* values have been added for the JVM settings `dataverse.dataAccess.thumbnail.image.limit` and `dataverse.dataAccess.thumbnail.pdf.limit`, of 3MB and 1MB respectively. 
This means that, *unless specified otherwise* by the JVM settings already in your domain configuration, the application will skip attempting to generate thumbnails for image files and PDFs that are above these size limits. +In previous versions, if these limits were not explicitly set, the application would try to create thumbnails for files of unlimited size. Which would occasionally cause problems with very large images. ## New JVM Options and DB Settings From 121460e6abd80367410e33f4775725b8ca737573 Mon Sep 17 00:00:00 2001 From: landreev Date: Wed, 8 Dec 2021 17:36:53 -0500 Subject: [PATCH 17/18] Update 5.9-release-notes.md --- doc/release-notes/5.9-release-notes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index 31978bf8e2c..24971768e38 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -4,7 +4,7 @@ This release brings new features, enhancements, and bug fixes to the Dataverse S ## Release Highlights -### Dataverse Collection Optimizations +### Dataverse Collection Page Optimizations The Dataverse Collection page, which also serves as the search page and the homepage in most Dataverse installations, has been optimized, with a specific focus on reducing the number of queries for each page load. These optimizations will be more noticeable on Dataverse installations with higher traffic. 
From 1e480a69f566db9a14f52e7857fceadbb09d8533 Mon Sep 17 00:00:00 2001 From: Danny Brooke Date: Wed, 8 Dec 2021 18:48:49 -0500 Subject: [PATCH 18/18] review feedback, thanks @landreev --- doc/release-notes/5.9-release-notes.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/doc/release-notes/5.9-release-notes.md b/doc/release-notes/5.9-release-notes.md index 24971768e38..06005396908 100644 --- a/doc/release-notes/5.9-release-notes.md +++ b/doc/release-notes/5.9-release-notes.md @@ -87,7 +87,7 @@ Dataverse can now support a use case in which a Admin or Curator would like to d ### Thumbnail Defaults New *default* values have been added for the JVM settings `dataverse.dataAccess.thumbnail.image.limit` and `dataverse.dataAccess.thumbnail.pdf.limit`, of 3MB and 1MB respectively. This means that, *unless specified otherwise* by the JVM settings already in your domain configuration, the application will skip attempting to generate thumbnails for image files and PDFs that are above these size limits. -In previous versions, if these limits were not explicitly set, the application would try to create thumbnails for files of unlimited size. Which would occasionally cause problems with very large images. +In previous versions, if these limits were not explicitly set, the application would try to create thumbnails for files of unlimited size. Which would occasionally cause problems with very large images. ## New JVM Options and DB Settings @@ -154,16 +154,18 @@ In the following commands we assume that Payara 5 is installed in `/usr/local/pa - `service payara stop` - `service payara start` -6\. Kick off full reindex - -Following the directions in the [Admin Guide](http://guides.dataverse.org/en/5.9/admin/solr-search-index.html) - -7\. Run ReExportall to update JSON Exports +6\. 
Run ReExportall to update JSON Exports Following the directions in the [Admin Guide](http://guides.dataverse.org/en/5.9/admin/metadataexport.html?highlight=export#batch-exports-through-the-api) ## Additional Release Steps +(for installations collecting web analytics) + +1\. Update custom analytics code per the [Installation Guide](http://guides.dataverse.org/en/5.9/installation/config.html#web-analytics-code). + +(for installations with GeoJSON files) + 1\. Redetect GeoJSON files to update the type from "Unknown" to GeoJSON, following the directions in the [API Guide](https://guides.dataverse.org/en/5.9/api/native-api.html#redetect-file-type) -2\. Update custom analytics code per the [Installation Guide](http://guides.dataverse.org/en/5.9/installation/config.html#web-analytics-code). +2\. Kick off full reindex following the directions in the [Admin Guide](http://guides.dataverse.org/en/5.9/admin/solr-search-index.html)