Add API to (relatively) arbitrarily upload auxiliary files #7275

@scolapasta

The use case, as relevant to the OpenDP implementation: summary statistics with added differential privacy "noise" will be generated by the OpenDP software for a specific Datafile, outside of Dataverse, and then deposited into Dataverse for later retrieval. This issue is for implementing the APIs for the deposit and later retrieval of these fragments. (There will be multiple physical files in different formats, XML and JSON, and possibly multiple versions of such diff. private metadata sets for a single Datafile. At the implementation layer, all these individual fragments will be saved as "auxiliary files" using the standard Dataverse StorageIO system.)

More generally, this is an API for accepting, storing and serving some metadata that Dataverse cannot produce itself (unlike, for example, image thumbnails or DDI XML describing the data variables that Dataverse knows how to generate from Datafiles). So it must be deposited by an external application before Dataverse can serve it.
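As a rough illustration of the deposit-then-retrieve contract described above, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the names `AuxiliaryFileStore`, `deposit`, `retrieve`, `list_for`, `format_tag` and `format_version` are hypothetical and not taken from Dataverse, the example tags are made up, and a real implementation would go through HTTP endpoints and StorageIO rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AuxKey:
    """Identifies one auxiliary fragment attached to a Datafile (hypothetical)."""
    datafile_id: int
    format_tag: str       # e.g. "dpJson" or "dpXml" (made-up tags)
    format_version: str   # allows several versions per Datafile


@dataclass
class AuxiliaryFileStore:
    """Stand-in for the storage layer; stores bytes as received, never interprets them."""
    _files: dict = field(default_factory=dict)

    def deposit(self, datafile_id: int, format_tag: str,
                format_version: str, content: bytes) -> None:
        key = AuxKey(datafile_id, format_tag, format_version)
        # Assumption: depositing the same tag/version twice is rejected.
        if key in self._files:
            raise ValueError(f"already deposited: {key}")
        self._files[key] = content

    def retrieve(self, datafile_id: int, format_tag: str,
                 format_version: str) -> bytes:
        # Raises KeyError if nothing was deposited under this key.
        return self._files[AuxKey(datafile_id, format_tag, format_version)]

    def list_for(self, datafile_id: int) -> list:
        """Return the (tag, version) pairs deposited for one Datafile."""
        return sorted(
            (k.format_tag, k.format_version)
            for k in self._files
            if k.datafile_id == datafile_id
        )


store = AuxiliaryFileStore()
store.deposit(42, "dpJson", "v1", b'{"mean": 3.1}')
store.deposit(42, "dpXml", "v1", b"<stats/>")
print(store.list_for(42))
```

Keying on the Datafile plus a format tag and version is one way to cover both "multiple formats" and "multiple versions" from the use case; other keying schemes would work as well.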

There is no direct UI impact. The deposit happens automatically: the remote OpenDP application performs it without the human user being directly involved. (The user does not need to know anything about this API.) But once the diff. private metadata has been deposited for a Datafile, Dataverse will show extra options (on the dataset and/or file pages), most likely an extra explore option. In the old PSI demo implementation the user would also see an option to download the deposited diff. private metadata (in JSON format) in the normal download pulldown menu. (I'm assuming this issue is just for the API, and that any such UI logic will be handled separately.)

Dataverse does NOT do anything with the actual diff. private metadata deposited (at least as currently defined)! It only knows how to store it for later retrieval and how to serve it on demand, and it knows that certain extra things can be done by outside applications for Datafiles that have diff. private metadata saved. (For example, Dataverse may know to add an extra redirect link to a diff. private data explore viewer for such a Datafile, but all the "exploring" etc. will happen outside the Dataverse application.)

(end use case description -- L.A.)

In planning how to handle the auxiliary files for upload and considering other possible future tools (e.g. a mapping tool or a time series graphing tool) that would require auxiliary metadata, it seems clear that designing a flexible system to upload auxiliary files would be useful.

Note that we already have two types of auxiliary files: those that Dataverse can recreate on its own and those that it cannot. An example of the first would be thumbnails for images. An example of the second is the original file on a tabular download.

This issue concerns only the latter, as the files that config tools would deposit cannot be recreated (without the use of the tool).
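The distinction between the two kinds of auxiliary files could be modeled roughly as follows. This is only a sketch: `AuxKind` and `on_missing` are hypothetical names, not existing Dataverse code, and the returned strings are placeholders for whatever the real handling would be.

```python
from enum import Enum


class AuxKind(Enum):
    # Dataverse can rebuild it from the Datafile (e.g. an image thumbnail).
    DERIVED = "derived"
    # Must be kept as received; cannot be rebuilt without the outside source
    # (e.g. output deposited by an external tool).
    DEPOSITED = "deposited"


def on_missing(kind: AuxKind) -> str:
    """What to do when a requested auxiliary file is not in storage."""
    if kind is AuxKind.DERIVED:
        return "regenerate"
    # Nothing to rebuild from: the only option is to report it as absent.
    return "not_found"


print(on_missing(AuxKind.DEPOSITED))
```

The practical consequence is that only the deposited kind needs an upload API, which is the scope of this issue.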
