diff --git a/doc/sphinx-guides/source/admin/features.md b/doc/sphinx-guides/source/admin/features.md
new file mode 100644
index 00000000000..d485d5f58f5
--- /dev/null
+++ b/doc/sphinx-guides/source/admin/features.md
@@ -0,0 +1,339 @@
+# Features
+
+An overview of Dataverse features can be found at <https://dataverse.org/software-features>. This is a more comprehensive list.
+
+```{contents} Contents:
+:local:
+:depth: 3
+```
+
+
+## AI
+
+### AI Tools
+
+A number of AI tools integrate with Dataverse.
+{doc}`More information.`
+
+### Model Context Protocol (MCP)
+
+Model Context Protocol (MCP) is a standard for AI agents to communicate with tools and services.
+{ref}`More information.`
+
+## Access and download
+
+### Login via Shibboleth
+
+Single Sign-On (SSO) using your institution's credentials.
+{doc}`More information.`
+
+### Login via ORCID, Google, GitHub, or Microsoft
+
+Log in using popular OAuth2 providers.
+{doc}`More information.`
+
+### Login via OpenID Connect (OIDC)
+
+Log in using your institution's identity provider or a third party.
+{doc}`More information.`
+
+### Versioning
+
+A history of changes to datasets and files is preserved.
+{doc}`More information.`
+
+### File previews
+
+A preview is available for text, tabular, image, audio, video, and geospatial files.
+{ref}`More information.`
+
+### Preview and analysis of tabular files
+
+Data Explorer allows searching, charting, and cross-tabulation analysis.
+{ref}`More information.`
+
+### Guestbook
+
+Optionally collect data about who is downloading the files from your datasets.
+{ref}`More information.`
+
+### File download in R and TSV format
+
+Proprietary tabular formats are converted into RData and TSV.
+{doc}`More information.`
+
+### Faceted search
+
+Facets are data-driven and customizable per collection.
+{doc}`More information.`
+
+## Administration
+
+### Quotas
+
+Set limits on the number of files, etc.
+{doc}`More information.`
+
+### Usage statistics and metrics
+
+Download counters, support for Make Data Count.
+{doc}`More information.`
+
+### Private URL
+
+Create a URL for reviewers to view an unpublished (and optionally anonymized) dataset.
+{ref}`More information.`
+
+### Notifications
+
+In-app and email notifications for access requests, requests for review, etc.
+{ref}`More information.`
+
+### User management
+
+Dashboard for common user-related tasks.
+{doc}`More information.`
+
+### Curation status labels
+
+Let curators mark datasets with a status label customized to your needs.
+{ref}`More information.<:AllowedCurationLabels>`
+
+## Customization
+
+### Internationalization
+
+The Dataverse software has been translated into multiple languages.
+{ref}`More information.`
+
+### Customization of collections
+
+Each personal or organizational collection can be customized and branded.
+{ref}`More information.`
+
+### Widgets
+
+Embed listings of data in external websites.
+{ref}`More information.`
+
+### Branding
+
+Your installation can be branded with a custom homepage, header, footer, CSS, etc.
+{ref}`More information.`
+
+## FAIR data publication
+
+### TK Labels
+
+Integrate with the Local Contexts platform, enabling the use of Traditional Knowledge (TK) and Biocultural (BC) Labels and Notices.
+{doc}`More information.`
+
+### Support for FAIR Data Principles
+
+Findable, Accessible, Interoperable, Reusable.
+[More information.](https://web.archive.org/web/20191206043258/https://scholar.harvard.edu/mercecrosas/presentations/fair-guiding-principles-implementation-dataverse)
+
+### Prepublication Review Support
+
+Datasets start as drafts and can be submitted for review before publication.
+{ref}`More information.`
+
+## File management
+
+### Retention Periods
+
+Make files inaccessible once the configured retention period has passed.
+{ref}`More information.`
+
+### Restricted files
+
+Control who can download files and choose whether or not to enable a "Request Access" button.
+{ref}`More information.`
+
+### Embargo
+
+Make files inaccessible until an embargo end date.
+{ref}`More information.`
+
+### File hierarchy
+
+Users can control dataset file hierarchy and directory structure.
+{doc}`More information.`
+
+### Fixity checks for files
+
+MD5, SHA-1, SHA-256, SHA-512, UNF.
+{ref}`More information.<:FileFixityChecksumAlgorithm>`
+
+### Backend storage on S3 or Swift
+
+Choose between filesystem or object storage, configurable per collection and per dataset.
+{doc}`More information.`
+
+### Direct upload and download for S3
+
+After a permission check, files can pass freely and directly between a client computer and S3.
+{doc}`More information.`
+
+### Pull header metadata from Astronomy (FITS) files
+
+Dataset metadata prepopulated from FITS file metadata.
+{ref}`More information.`
+
+### Auxiliary files for data files
+
+Each data file can have any number of auxiliary files for documentation or other purposes (experimental).
+{doc}`More information.`
+
+## Geospatial
+
+### Metadata Extraction from Geospatial Files
+
+Populate the bounding box from NetCDF and HDF5 files.
+{ref}`More information.`
+
+### Geospatial Search API
+
+Pass `geo_point` and `geo_radius` to find datasets based on their bounding box.
+{doc}`More information.`
+
+### Geospatial File Preview
+
+GeoJSON, GeoTIFF, and Shapefiles can be previewed as a map.
+{ref}`More information.`
+
+### Geospatial Metadata Fields
+
+There is a dedicated geospatial metadata block.
+{ref}`More information.`
+
+## Integrations
+
+### Galaxy Integration
+
+Import files directly from Dataverse into Galaxy, as well as publish datasets containing artifacts (Histories, datasets, etc.) from Galaxy to Dataverse.
+{ref}`More information.`
+
+### Handles
+
+Handles are persistent identifiers (PIDs) that are an alternative to DOIs.
+{ref}`More information.`
+
+### Globus
+
+Upload to and download from Dataverse using Globus endpoints.
+{ref}`More information.`
+
+### iRODS
+
+Pull data from an iRODS instance to a Dataverse dataset.
+{ref}`More information.`
+
+### DMPTool Integration via RSpace
+
+A Data Management Plan (DMP) can be uploaded to RSpace and updated with the DOI of a Dataverse dataset.
+{ref}`More information.`
+
+### DataCite integration
+
+DOIs are reserved, and when datasets are published, their metadata is published to DataCite.
+{doc}`More information.`
+
+### External tools
+
+Enable additional features not built into the Dataverse software.
+{doc}`More information.`
+
+### Dropbox integration
+
+Upload files stored on Dropbox.
+{doc}`More information.`
+
+### GitHub integration
+
+A GitHub Action is available to upload files from GitHub to a dataset.
+{doc}`More information.`
+
+### Integration with Jupyter notebooks
+
+Datasets can be opened in Binder to run code in Jupyter notebooks, RStudio, and other computational environments. They can also be previewed in Dataverse itself.
+{ref}`More information.`
+
+## Interoperability
+
+### Signposting
+
+Enable easier machine access to datasets by advertising a linkset in HTTP headers.
+{ref}`More information.`
+
+### Harvest from DataCite
+
+Harvest metadata directly from DataCite to Dataverse using OAI-PMH.
+{ref}`More information.`
+
+### Croissant
+
+Export metadata as linked data following the Croissant ontology.
+{ref}`More information.`
+
+### RO-Crate
+
+Export dataset metadata as an ro-crate.json.
+{ref}`More information.`
+
+### OAI-PMH (Harvesting)
+
+Gather and expose metadata from and to other systems using standardized metadata formats: Dublin Core, Data Documentation Initiative (DDI), OpenAIRE, etc.
+{doc}`More information.`
+
+### APIs for interoperability and custom integrations
+
+Search API, Data Deposit (SWORD) API, Data Access API, Metrics API, Migration API, etc.
+{doc}`More information.`
+
+### API client libraries
+
+Interact with Dataverse APIs from Python, R, JavaScript, Java, and Ruby.
+{doc}`More information.`
+
+### Schema.org JSON-LD
+
+Used by Google Dataset Search and other services for discoverability.
+{ref}`More information.`
+
+### External vocabulary
+
+Let users pick from external vocabularies (provided via API/SKOSMOS) when filling in metadata.
+{ref}`More information.`
+
+### Export data in BagIt format
+
+For preservation, bags can be sent to the local filesystem, DuraCloud, and Google Cloud.
+{ref}`More information.`
+
+## Reusability
+
+### Data citation for datasets and files
+
+EndNote XML, RIS, BibTeX, or 1000+ CSL formats at the dataset or file level.
+{doc}`More information.`
+
+### Multiple licenses
+
+CC0 is the default, but you can add as many standard licenses as you like or create your own.
+{ref}`More information.`
+
+### Custom terms of use
+
+Custom terms of use can be used in place of a license or disabled by an administrator.
+{ref}`More information.`
+
+### Post-publication automation (workflows)
+
+Allow publication of a dataset to kick off external processes and integrations.
+{doc}`More information.`
+
+### Provenance
+
+Upload standard W3C provenance files or enter free text instead.
+{ref}`More information.`
+
diff --git a/doc/sphinx-guides/source/admin/index.rst b/doc/sphinx-guides/source/admin/index.rst
index 4d2d5c22fc2..0f7a87bdded 100755
--- a/doc/sphinx-guides/source/admin/index.rst
+++ b/doc/sphinx-guides/source/admin/index.rst
@@ -13,6 +13,7 @@ This guide documents the functionality only available to superusers (such as "da
 .. toctree::
    :maxdepth: 2
 
+   features
    dashboard
    external-tools
    discoverability
diff --git a/doc/sphinx-guides/source/admin/integrations.rst b/doc/sphinx-guides/source/admin/integrations.rst
index bb981c75ace..65afdce1e56 100644
--- a/doc/sphinx-guides/source/admin/integrations.rst
+++ b/doc/sphinx-guides/source/admin/integrations.rst
@@ -38,6 +38,8 @@ Researcher can configure OSF itself to deposit to your Dataverse installation by
 
 In addition to the method mentioned above, the :ref:`integrations-dashboard` also enables a pull of data from OSF to a dataset.
 
+.. _rspace:
+
 RSpace
 ++++++
 
@@ -45,6 +47,8 @@ RSpace is an affordable and secure enterprise grade electronic lab notebook (ELN
 
 For instructions on depositing data from RSpace to your Dataverse installation, your researchers can visit https://www.researchspace.com/help-and-support-resources/dataverse-integration/
 
+As shown in a `video `_, a Data Management Plan (DMP) can be added to RSpace and the research records and associated data can then be sent to Dataverse. Dataverse generates a Persistent Identifier (PID, often a DOI) for the dataset, and RSpace automatically puts the PID link under "Research Outputs" in the DMP.
+
 Open Journal Systems (OJS) and OPS
 ++++++++++++++++++++++++++++++++++
 
@@ -86,6 +90,8 @@ GitLab is an open source Git repository and platform that provides free open and
 
 The :ref:`integrations-dashboard` enables a pull of data from GitLab to a dataset in Dataverse.
 
+.. _irods:
+
 iRODS
 +++++
 
@@ -152,6 +158,13 @@ Open OnDemand
 
 `Open OnDemand `_ is a web frontend to High Performance Computing (HPC) resources. Through a system called `OnDemand Loop `_, developed at IQSS, researchers can create datasets in Dataverse and upload files to them from their Open OnDemand installation. They can also :ref:`download ` files from Dataverse.
 
+.. _galaxy-integration:
+
+Galaxy
+++++++
+
+Import files directly from Dataverse into `Galaxy `_ as well as publish datasets containing artifacts (Histories, datasets, etc.) from Galaxy to Dataverse. For details, see https://github.com/galaxyproject/galaxy/pull/19367
+
 Embedding Data on Websites
 --------------------------
 
diff --git a/doc/sphinx-guides/source/quickstart/what-is-dataverse.md b/doc/sphinx-guides/source/quickstart/what-is-dataverse.md
index 6f86473bada..ceb3da0a6ad 100644
--- a/doc/sphinx-guides/source/quickstart/what-is-dataverse.md
+++ b/doc/sphinx-guides/source/quickstart/what-is-dataverse.md
@@ -10,6 +10,7 @@ A Dataverse repository can host one or more Dataverse collections, which organiz
 - Data files
 - Documentation or code
 
+(core-capabilities)=
 ## Core Capabilities
 
 ### 📤 Upload, manage, publish and download data files.
@@ -37,4 +38,4 @@ A Dataverse repository can host one or more Dataverse collections, which organiz
 - Compare versions with the detailed version change overview on dataset-level.
 
 ### ✨More features
-The Dataverse project is continuously evolving. For an overview of capabilities, visit the [features list](https://dataverse.org/software-features).
+The Dataverse project is continuously evolving. For an overview of capabilities, see {doc}`/admin/features` in the Admin Guide.
diff --git a/doc/sphinx-guides/source/user/dataset-management.rst b/doc/sphinx-guides/source/user/dataset-management.rst
index 22e72a6a210..7b71a8ac66b 100755
--- a/doc/sphinx-guides/source/user/dataset-management.rst
+++ b/doc/sphinx-guides/source/user/dataset-management.rst
@@ -40,7 +40,7 @@ Once a dataset has been published, its metadata can be exported in a variety of
 Additional formats can be enabled. See :ref:`inventory-of-external-exporters` in the Installation Guide. To highlight a few:
 
 - Croissant
-- RO-Crate
+- RO-Crate: See also https://www.researchobject.org/ro-crate/dataverse
 
 Each of these metadata exports contains the metadata of the most recently published version of the dataset.
@@ -763,6 +763,8 @@ Once a dataset with embargoed files has been published, no further action is nee
 
 As the primary use case of embargoes is to make the existence of data known now, with a promise (to a journal, project team, etc.) that the data itself will become available at a given future date, users cannot change an embargo once a dataset version is published. Dataverse instance administrators do have the ability to correct mistakes and make changes if/when circumstances warrant.
 
+.. _retention-periods:
+
 Retention Periods
 =================
 
diff --git a/scripts/issues/11998/tsv2md.py b/scripts/issues/11998/tsv2md.py
new file mode 100755
index 00000000000..47c65e51f6c
--- /dev/null
+++ b/scripts/issues/11998/tsv2md.py
@@ -0,0 +1,64 @@
+#!/usr/bin/env python
+#
+# Download features.tsv like this:
+# curl -L "https://docs.google.com/spreadsheets/d/1EIFGAfDfZAboFa3_ShRfgoT6xSDpKohDH2_iCyO5MtA/export?gid=729532473&format=tsv" > features.tsv
+#
+# The gid above is a specific tab in this spreadsheet:
+# https://docs.google.com/spreadsheets/d/1EIFGAfDfZAboFa3_ShRfgoT6xSDpKohDH2_iCyO5MtA/edit?usp=sharing
+#
+# Here's the README for the spreadsheet:
+# https://docs.google.com/document/d/1wqLVoEpnD93Y_wQtA2cQEkAuC0QstC6XVs9XlA7yvbM/edit?usp=sharing
+import sys
+from optparse import OptionParser
+import csv
+from itertools import groupby
+
+parser = OptionParser()
+options, args = parser.parse_args()
+
+if args:
+    tsv_file = open(args[0])
+else:
+    tsv_file = sys.stdin
+
+print("""# Features
+
+An overview of Dataverse features can be found at <https://dataverse.org/software-features>. This is a more comprehensive list.
+
+```{contents} Contents:
+:local:
+:depth: 3
+```
+
+""")
+
+reader = csv.DictReader(tsv_file, delimiter="\t")
+rows = [row for row in reader]
+missing = []
+# Sort rows by category
+rows.sort(key=lambda x: x["Categories"])
+
+# Group by category
+for category, group in groupby(rows, key=lambda x: x["Categories"]):
+    # Each category becomes a level-two heading
+    print("## %s" % category)
+    print()
+    for row in group:
+        title = row["Title"]
+        description = row["Description"]
+        url = row["URL"]
+        dtype = row["DocLinkType"]
+        target = row["DocLinkTarget"]
+        print("### %s" % title)
+        print()
+        print("%s" % description)
+        if target == 'url':
+            print("[More information.](%s)" % (url))
+        elif target != '':
+            print("{%s}`More information.<%s>`" % (dtype, target))
+        else:
+            missing.append(url)
+        print()
+tsv_file.close()
+for item in missing:
+    print(item, file=sys.stderr)
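The `geo_point` and `geo_radius` parameters mentioned under "Geospatial Search API" in features.md can be exercised with a few lines of code. A minimal sketch, assuming the documented Search API query format (`geo_point` is `latitude,longitude`, `geo_radius` is a distance in kilometers); the demo.dataverse.org base URL and the Cambridge, MA coordinates are illustrative only:

```python
from urllib.parse import urlencode


def build_geo_search_url(base_url, query="*", lat=0.0, lon=0.0, radius_km=10):
    """Build a Dataverse Search API URL filtered by a geographic point
    and radius, as described under "Geospatial Search API" above."""
    params = {
        "q": query,
        "type": "dataset",
        # geo_point is "latitude,longitude"; geo_radius is in kilometers
        "geo_point": f"{lat},{lon}",
        "geo_radius": str(radius_km),
    }
    return f"{base_url}/api/search?{urlencode(params)}"


# Example: datasets within 50 km of Cambridge, MA (illustrative server)
url = build_geo_search_url(
    "https://demo.dataverse.org", lat=42.3736, lon=-71.1097, radius_km=50
)
print(url)
```

Sending a GET request to the resulting URL returns the usual Search API JSON envelope with matching datasets under `data.items`. Check the Search API documentation for your installation's Dataverse version, since the geospatial parameters are only available in recent releases.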