diff --git a/doc/release-notes/6485-multiple-stores.md b/doc/release-notes/6485-multiple-stores.md new file mode 100644 index 00000000000..ea2d224d612 --- /dev/null +++ b/doc/release-notes/6485-multiple-stores.md @@ -0,0 +1,36 @@ +# Multiple Store Support +Dataverse can now be configured to store files in more than one place at the same time (multiple file, s3, and/or swift stores). + +General information about this capability can be found in the Configuration Guide - File Storage section. + +**Upgrade Information:** + +**Existing installations will need to make configuration changes to adopt this version, regardless of whether additional stores are to be added or not.** + +Multistore support requires that each store be assigned a label, id, and type - see the documentation for a more complete explanation. For an existing store, the recommended upgrade path is to assign the store id based on its type, e.g. a 'file' store would get id 'file' and an 's3' store would get id 's3'. + +With this choice, no manual changes to datafile 'storageidentifier' entries are needed in the database. (If you do not name your existing store using this convention, you will need to edit the database to maintain access to existing files!). 
+ +The following set of commands to change the Glassfish JVM options will adapt an existing file or s3 store for this upgrade: +For a file store: + + ./asadmin create-jvm-options "\-Ddataverse.files.file.type=file" + ./asadmin create-jvm-options "\-Ddataverse.files.file.label=file" + ./asadmin create-jvm-options "\-Ddataverse.files.file.directory=" + +For an s3 store: + + ./asadmin create-jvm-options "\-Ddataverse.files.s3.type=s3" + ./asadmin create-jvm-options "\-Ddataverse.files.s3.label=s3" + ./asadmin delete-jvm-options "-Ddataverse.files.s3-bucket-name=" + ./asadmin create-jvm-options "-Ddataverse.files.s3.bucket-name=" + +Any additional S3 options you have set will need to be replaced as well, following the pattern in the last two lines above - delete the option that has a '-' after 's3' and create the same option with the '-' replaced by a '.', using the same value you currently have configured. + +Once these options are set, restarting the glassfish service is all that is needed to complete the change. + +Note that the "\-Ddataverse.files.directory", if defined, continues to control where temporary files are stored (in the /temp subdir of that directory), independent of the location of any 'file' store defined above. 
diff --git a/doc/sphinx-guides/source/admin/dataverses-datasets.rst b/doc/sphinx-guides/source/admin/dataverses-datasets.rst index e542dee2d83..a4bea9f53e7 100644 --- a/doc/sphinx-guides/source/admin/dataverses-datasets.rst +++ b/doc/sphinx-guides/source/admin/dataverses-datasets.rst @@ -38,7 +38,27 @@ Add Dataverse RoleAssignments to Child Dataverses Recursively assigns the users and groups having a role(s),that are in the set configured to be inheritable via the :InheritParentRoleAssignments setting, on a specified dataverse to have the same role assignments on all of the dataverses that have been created within it. The response indicates success or failure and lists the individuals/groups and dataverses involved in the update. Only accessible to superusers. :: - curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/dataverse/$dataverse-alias//addRoleAssignmentsToChildren + curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/dataverse/$dataverse-alias/addRoleAssignmentsToChildren + +Configure a Dataverse to store all new files in a specific file store +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To direct new files (uploaded when datasets are created or edited) for all datasets in a given dataverse to a specific store, the store can be specified via the API as shown below, or by editing the 'General Information' for a Dataverse on the Dataverse page. Only accessible to superusers. 
:: + + curl -H "X-Dataverse-key: $API_TOKEN" -X PUT -d $storageDriverLabel http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver + +The current driver can be seen using:: + + curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver + +and can be reset to the default store with:: + + curl -H "X-Dataverse-key: $API_TOKEN" -X DELETE http://$SERVER/api/admin/dataverse/$dataverse-alias/storageDriver + +The available drivers can be listed with:: + + curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/admin/storageDrivers + Datasets -------- diff --git a/doc/sphinx-guides/source/developers/big-data-support.rst b/doc/sphinx-guides/source/developers/big-data-support.rst index 37a794e804e..bb16dd9133d 100644 --- a/doc/sphinx-guides/source/developers/big-data-support.rst +++ b/doc/sphinx-guides/source/developers/big-data-support.rst @@ -18,7 +18,7 @@ Install a DCM Installation instructions can be found at https://github.com/sbgrid/data-capture-module/blob/master/doc/installation.md. Note that shared storage (posix or AWS S3) between Dataverse and your DCM is required. You cannot use a DCM with Swift at this point in time. -.. FIXME: Explain what ``dataverse.files.dcm-s3-bucket-name`` is for and what it has to do with ``dataverse.files.s3-bucket-name``. +.. FIXME: Explain what ``dataverse.files.dcm-s3-bucket-name`` is for and what it has to do with ``dataverse.files.s3.bucket-name``. Once you have installed a DCM, you will need to configure two database settings on the Dataverse side. These settings are documented in the :doc:`/installation/config` section of the Installation Guide: @@ -100,6 +100,7 @@ Optional steps for setting up the S3 Docker DCM Variant ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - Before: the default bucket for DCM to hold files in S3 is named test-dcm. It is coded into `post_upload_s3.bash` (line 30). Change to a different bucket if needed. 
+- Also note: With the new support for multiple file stores in Dataverse, DCM requires a store with id="s3" and will only work with this store. - Add AWS bucket info to dcmsrv - Add AWS credentials to ``~/.aws/credentials`` @@ -115,6 +116,9 @@ Optional steps for setting up the S3 Docker DCM Variant - ``cd /opt/glassfish4/bin/`` - ``./asadmin delete-jvm-options "\-Ddataverse.files.storage-driver-id=file"`` - ``./asadmin create-jvm-options "\-Ddataverse.files.storage-driver-id=s3"`` + - ``./asadmin create-jvm-options "\-Ddataverse.files.s3.type=s3"`` + - ``./asadmin create-jvm-options "\-Ddataverse.files.s3.label=s3"`` + - Add AWS bucket info to Dataverse - Add AWS credentials to ``~/.aws/credentials`` @@ -132,7 +136,7 @@ Optional steps for setting up the S3 Docker DCM Variant - S3 bucket for Dataverse - - ``/usr/local/glassfish4/glassfish/bin/asadmin create-jvm-options "-Ddataverse.files.s3-bucket-name=iqsstestdcmbucket"`` + - ``/usr/local/glassfish4/glassfish/bin/asadmin create-jvm-options "-Ddataverse.files.s3.bucket-name=iqsstestdcmbucket"`` - S3 bucket for DCM (as Dataverse needs to do the copy over) diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index 6bbfb788943..d6358e20347 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -215,10 +215,46 @@ As for the "Remote only" authentication mode, it means that: - ``:DefaultAuthProvider`` has been set to use the desired authentication provider - The "builtin" authentication provider has been disabled (:ref:`api-toggle-auth-provider`). Note that disabling the "builtin" authentication provider means that the API endpoint for converting an account from a remote auth provider will not work. Converting directly from one remote authentication provider to another (i.e. from GitHub to Google) is not supported. Conversion from remote is always to "builtin". 
Then the user initiates a conversion from "builtin" to remote. Note that longer term, the plan is to permit multiple login options to the same Dataverse account per https://github.com/IQSS/dataverse/issues/3487 (so all this talk of conversion will be moot) but for now users can only use a single login option, as explained in the :doc:`/user/account` section of the User Guide. In short, "remote only" might work for you if you only plan to use a single remote authentication provider such that no conversion between remote authentication providers will be necessary. -File Storage: Local Filesystem vs. Swift vs. S3 ------------------------------------------------ +File Storage: Using a Local Filesystem and/or Swift and/or S3 object stores +--------------------------------------------------------------------------- -By default, a Dataverse installation stores data files (files uploaded by end users) on the filesystem at ``/usr/local/glassfish4/glassfish/domains/domain1/files`` but this path can vary based on answers you gave to the installer (see the :ref:`dataverse-installer` section of the Installation Guide) or afterward by reconfiguring the ``dataverse.files.directory`` JVM option described below. +By default, a Dataverse installation stores all data files (files uploaded by end users) on the filesystem at ``/usr/local/glassfish4/glassfish/domains/domain1/files``. This path can vary based on answers you gave to the installer (see the :ref:`dataverse-installer` section of the Installation Guide) or afterward by reconfiguring the ``dataverse.files.directory`` JVM option described below. + +Dataverse can alternately store files in a Swift or S3-compatible object store, and can now be configured to support multiple stores at once. With a multi-store configuration, the location for new files can be controlled on a per-dataverse basis. + +The following sections describe how to set up various types of stores and how to configure for multiple stores. 
+ +Multi-store Basics ++++++++++++++++++ + +To support multiple stores, Dataverse now requires an id, type, and label for each store (even for a single store configuration). These are configured by defining two required jvm options: + +.. code-block:: none + + ./asadmin $ASADMIN_OPTS create-jvm-options "\-Ddataverse.files..type=" + ./asadmin $ASADMIN_OPTS create-jvm-options "\-Ddataverse.files..label=