Description
This is in support of:
- an NIH grant, "The Harvard Dataverse repository: A generalist repository integrated with a Data Commons"
- Aim 4: Improve harvesting and packaging standards to share metadata and data across repositories
The first step is to figure out what the Dataverse team and the community have already done towards this aim, and what remains to be done.
For example:
The next step is to prioritize which issues should be fixed.
Definition of done
As completely as is reasonably possible within a two-week sprint:
- Search for previously reported issues that describe problems with the current implementation, and take an inventory.
- Search for previous work done within the Dataverse community as well.
- Prioritize which of the issues/PRs should be moved forward.
We need to keep in mind that harvesting from a particular source requires that source to be bug-free. Identify which sources have which bugs so that the bugs for a particular source can be targeted. ICPSR is one example; Zenodo is another.
More information:
There is a lot packaged into Aim 4:
- Improved Harvesting via the OAI-PMH standard
- Improved support for BagIt
- Improved support for Signposting
The scope of this issue is harvesting via the OAI-PMH standard.
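For orientation while taking the inventory, here is a minimal sketch of what one harvesting step looks like at the protocol level: building a `ListRecords` request and reading the record identifiers and `resumptionToken` from a response page. The verb, argument names, and XML namespace come from the OAI-PMH 2.0 specification; the base URL, set name, and sample response are hypothetical, and the helper names `list_records_url` and `parse_page` are illustrative only, not part of the Dataverse codebase.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request URL (helper name is illustrative)."""
    if resumption_token:
        # resumptionToken is an exclusive argument: only the verb accompanies it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty resumptionToken element marks the last page of the list.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical response page, trimmed to the elements the parser reads.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>doi:10.5072/FK2/AAA</identifier></header></record>
    <record><header><identifier>doi:10.5072/FK2/BBB</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Hypothetical endpoint and set name, used only to show the URL shape.
    print(list_records_url("https://repo.example.org/oai", set_spec="demo"))
    print(parse_page(SAMPLE_PAGE))
```

A harvester repeats the request with each returned token until a page arrives without one. Source-specific bugs typically surface in this loop, so recording the failing source, request, and page alongside each issue should make the inventory easier to act on.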
Aim 4:
Improve harvesting and packaging standards to share metadata and data across repositories
Our proposed project will significantly improve the widely-used Harvard Dataverse repository to better support NIH-funded research.
A critical measure of the GREI program’s success is to standardize the discoverability across generalist repositories.
To help with this, **we propose to improve the existing harvesting functionality in the Dataverse software based on the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) standard, and coordinate with other repository packaging standards to share or move metadata and data.**
Dataverse already supports Bags as defined by the Research Data Alliance (RDA) Research Data Repository Interoperability Working Group. Here we propose to improve the support for Bags, test it for NIH-funded datasets, and explore and define the appropriate standard to use to move metadata and data across generalist repositories. This will help with sustainability and succession planning: if one repository can no longer support a specific dataset, the dataset can easily be moved to another repository without losing any information about it.
Additionally, we propose to implement Signposting in the Dataverse software. By adding HTTP Link headers throughout the application, we can more easily support automated metadata and data discovery in the repository, and allow other applications and services to represent the content in the Harvard Dataverse repository more accurately and completely.
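To make the Signposting idea concrete, the sketch below shows how a client might read typed links from an HTTP `Link` header and group them by relation. The relation names `cite-as`, `describedby`, and `item` are relations used by Signposting; the header value, the URLs, and the helper name `parse_link_header` are hypothetical, and the parser is deliberately simplified rather than a complete implementation of the Link header grammar.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
)
# One parameter: ;name="quoted value" or ;name=bare-token
PARAM_RE = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Group link targets by relation type (helper name is illustrative)."""
    links = {}
    for target, params in LINK_RE.findall(value):
        attrs = {
            name.lower(): quoted or bare
            for name, quoted, bare in PARAM_RE.findall(params)
        }
        # A single rel attribute may carry several space-separated relations.
        for rel in attrs.get("rel", "").split():
            links.setdefault(rel, []).append(
                {"target": target, "type": attrs.get("type")}
            )
    return links


# Hypothetical header for a dataset landing page; all URLs are placeholders.
EXAMPLE_HEADER = (
    '<https://doi.org/10.5072/FK2/EXAMPLE>; rel="cite-as", '
    '<https://repo.example.org/metadata/1.jsonld>; rel="describedby"; '
    'type="application/ld+json", '
    '<https://repo.example.org/files/1.csv>; rel="item"; type="text/csv"'
)

if __name__ == "__main__":
    for rel, targets in parse_link_header(EXAMPLE_HEADER).items():
        print(rel, targets)
```

With links like these, a crawler can find the persistent identifier, the machine-readable metadata, and the data files from response headers alone, without scraping the landing page HTML.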
Related documents