Spike: What work has already been done towards support for controlled vocabularies for metadata fields #8571

@mreekie

Description

This is in support of:

The first step is to find out what the Dataverse team and the community have already done towards this aim. The focus here is on controlled vocabularies in general, as opposed to specific biomedical vocabularies.

For example:

And then to figure out what the next steps are.

Definition of done

As completely as is reasonably possible within a two-week sprint:

  • Search out previous related work done by the Harvard Dataverse team.

  • Search out previous work done within the community.

  • Demonstrate what is found to be already implemented in Dataverse.

    • This is a configuration item in Dataverse.
  • Define what's next.

    • Do we have enough information to describe how to get from here to implementing this feature?
    • Or, what do we need to do next to get additional information or context?

Aim 2:

Increase support for biomedical and cross-domain metadata standards and controlled vocabularies

One of the useful characteristics of the Dataverse open-source software is its extensive support for metadata standards and additional custom metadata. The standards currently supported include the Data Documentation Initiative (DDI), Dublin Core, DataCite, and Schema.org.

In particular, DDI makes a Dataverse repository interoperable even at the variable/attribute level since it supports variable descriptive and statistical metadata. This allows data exploration and analysis tools to integrate easily with the repository and discovery engines to find variable information.

In this project, we propose to

  1. expand DDI support to include the recently released DDI-Cross-Domain Integration (DDI-CDI) schema,
  2. build on existing support for biomedical-related standards relevant to NIH-funded research cases, following the recommendations from https://fairsharing.org/,
  3. expand descriptive and citation metadata to support funding information and related fields, and
  4. integrate with external services to enable the support of controlled vocabularies for any metadata field, based on standardized, widely used data dictionaries. The HMS Research Data Management group will participate in the development of these standards and vocabularies for biomedical datasets, working directly with research laboratories.
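
As a rough illustration of item 4, the sketch below checks a metadata value against a controlled vocabulary and returns the matching term with its URI. This is only a sketch under stated assumptions, not how Dataverse implements it: the `Term` class, the `VOCABULARIES` table, the `validate` function, the field names, and the `example.org` URIs are all hypothetical, and the in-memory dictionary stands in for a call to an external vocabulary service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """One entry in a controlled vocabulary (hypothetical structure)."""
    label: str
    uri: str


# Hypothetical vocabulary table keyed by metadata field name.
# A real integration would query an external vocabulary service instead.
VOCABULARIES = {
    "keyword": {
        "genomics": Term("Genomics", "https://example.org/vocab/genomics"),
        "proteomics": Term("Proteomics", "https://example.org/vocab/proteomics"),
    },
}


def validate(field: str, value: str) -> Optional[Term]:
    """Return the vocabulary term for `value`, or None for free-text fields.

    Raises ValueError when the field has a controlled vocabulary and the
    value is not one of its terms.
    """
    vocabulary = VOCABULARIES.get(field)
    if vocabulary is None:
        # No vocabulary configured for this field: any value is accepted.
        return None
    term = vocabulary.get(value.strip().lower())
    if term is None:
        raise ValueError(f"{value!r} is not a valid term for field {field!r}")
    return term
```

With this table, `validate("keyword", "Genomics")` returns the Genomics term, `validate("title", "anything")` returns `None` because no vocabulary is attached to that field, and `validate("keyword", "astrology")` raises `ValueError`. Keying the table by field name mirrors the goal of attaching a vocabulary to any metadata field.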

Related documents


Labels

  • Feature: Controlled Vocabulary
    • Includes both Internal and external controlled vocabularies
  • NIH OTA DC
    • Grant: The Harvard Dataverse repository: A generalist repository integrated with a Data Commons
  • NIH OTA: 1.2.1
    • 2 | 1.2.1 | Design and implement integration with controlled vocabularies | 5 prdOwnThis is an it...
  • pm.GREI-d-1.2.1
    • NIH, yr1, aim2, task1: Design and implement integration with controlled voc
