DOIs for Dataset versions #4499

@adam3smith

Following up on the thread in the Google Group:
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/dataverse-community/34E9foKnxQs

User story

As a researcher, I want to be able to cite, using a permanent identifier, a specific version of a dataset to avoid any ambiguity and to make the citation machine-actionable.

How this should probably look

Zenodo would be a good template here.

  1. A dataset has one "generic" DOI that always points to the latest version (this would be used when, e.g., you just cite the data generically).
  2. A dataset has one DOI per version, so that a citation can point to a specific version.

Example

https://doi.org/10.5281/zenodo.1041767 is the generic dataset DOI, always pointing to the most recent version; 10.5281/zenodo.1188752 is the DOI for the second (current) version, and 10.5281/zenodo.1041768 is the DOI for the first version.
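
To make the two-level scheme concrete, here is a minimal sketch of how resolution could behave. The `Dataset` class and its methods are hypothetical names made up for illustration; this is not Dataverse or Zenodo code, just the rule "generic DOI resolves to the latest version, version DOI resolves to exactly one version", using the Zenodo DOIs from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical model: one generic DOI plus one DOI per published version."""

    generic_doi: str
    version_dois: list[str] = field(default_factory=list)  # oldest first

    def publish_version(self, version_doi: str) -> None:
        # Each newly published version gets its own DOI.
        self.version_dois.append(version_doi)

    def resolve(self, doi: str) -> int:
        """Return the 1-based version number that a DOI points to."""
        if not self.version_dois:
            raise LookupError("dataset has no published versions")
        if doi == self.generic_doi:
            # The generic DOI always follows the most recent version.
            return len(self.version_dois)
        if doi in self.version_dois:
            # A version DOI is pinned to the version it was minted for.
            return self.version_dois.index(doi) + 1
        raise KeyError(f"unknown DOI: {doi}")


dataset = Dataset(generic_doi="10.5281/zenodo.1041767")
dataset.publish_version("10.5281/zenodo.1041768")  # first version
dataset.publish_version("10.5281/zenodo.1188752")  # second (current) version
```

With this model, publishing a third version would change what the generic DOI resolves to, while both existing version DOIs keep pointing where they did.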

Relationship to other features

File DOIs:

File DOIs are great (and necessary in their own right), but they are not a replacement for dataset version DOIs. Datasets are often made up of multiple files. An analysis script (which itself may be part of the data and thus versioned) may point to multiple files in a dataset. It's not feasible (or desirable) for a researcher to include the DOI for every file used in a citation. They want to point to one single DOI to reference the exact data (made up of multiple files) they've been using. This is equally true for quantitative and qualitative data, btw.

File DOIs of files that are not changed between versions should remain stable (otherwise there's a potential to massively inflate the number of DOIs issued for no good reason).
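
One way the "stable file DOI" rule could work is sketched below, assuming files are matched by name and compared by checksum. Everything here is hypothetical: `mint_doi`, `assign_file_dois`, and the `10.5072/FK2/FILE...` identifiers are placeholders for illustration, not an existing API or real DOIs.

```python
import hashlib
from itertools import count

_next_suffix = count(1)


def mint_doi() -> str:
    """Stand-in for a real DOI registration call (placeholder prefix)."""
    return f"10.5072/FK2/FILE{next(_next_suffix)}"


def assign_file_dois(previous, files):
    """Assign DOIs to the files of a new dataset version.

    previous: dict mapping file name -> (checksum, doi) for the prior version
    files:    dict mapping file name -> file content as bytes
    """
    assigned = {}
    for name, content in files.items():
        checksum = hashlib.sha256(content).hexdigest()
        old = previous.get(name)
        if old is not None and old[0] == checksum:
            # Unchanged file: keep the existing DOI.
            assigned[name] = old
        else:
            # New or modified file: mint a fresh DOI.
            assigned[name] = (checksum, mint_doi())
    return assigned


v1 = assign_file_dois({}, {"data.csv": b"a,b\n1,2\n", "analysis.R": b"print(1)\n"})
v2 = assign_file_dois(v1, {"data.csv": b"a,b\n1,2\n", "analysis.R": b"print(2)\n"})
```

Here `data.csv` keeps its DOI across both versions, while the edited `analysis.R` gets a new one. Matching by file name is a simplification; a renamed but otherwise identical file would get a new DOI under this sketch.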

UNFs

UNFs (at least for tabular data) help ensure that a cited dataset is the one being used, so they guard against the use of incorrect versions, but they do not function as identifiers, i.e. UNF:6:edx+kB6SY2N3Zt9OsUbp4A== tells me nothing about where to find that dataset.

Labels: Feature: DOI & Handle; HERMES (related to @hermes-hmc work on Dataverse code); Size: Queued (PM has called this issue out specifically for sizing); Type: Feature (a feature request); User Role: Depositor (creates datasets, uploads data, etc.)
