Expose individual components of citation string in datafile queries #5339

@Xarthisius

Currently it's difficult to extract information about the parent dataset when querying only a datafile. The only dataset-level information in the response is embedded in the free-text dataset_citation string:

$ curl -s 'https://dataverse.harvard.edu/api/search?q=entityId:3040230' | jq '.data.items[0]'
{
  "name": "2017-07-31.tab",
  "type": "file",
  "url": "https://dataverse.harvard.edu/api/access/datafile/3040230",
  "file_id": "3040230",
  "published_at": "2017-07-31T22:27:23Z",
  "file_type": "Tab-Delimited",
  "file_content_type": "text/tab-separated-values",
  "size_in_bytes": 12025,
  "md5": "e7dd2f725941b978d45fed3f33ff640c",
  "checksum": {
    "type": "MD5",
    "value": "e7dd2f725941b978d45fed3f33ff640c"
  },
  "unf": "UNF:6:6wGE3C5ragT8A0qkpGaEaQ==",
  "dataset_citation": "Durbin, Philip, 2017, \"Open Source at Harvard\", https://doi.org/10.7910/DVN/TJCLKP, Harvard Dataverse, V2, UNF:6:6wGE3C5ragT8A0qkpGaEaQ== [fileUNF]"
}

Please consider exposing the dataset's name, global_id, and authors at the file level too.
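
To illustrate why separate fields would help, here is a minimal Python sketch of the kind of client-side parsing the citation string currently forces. It is only a workaround under stated assumptions: the layout `<authors>, <year>, "<title>", <persistent URL>, ...` is inferred from the single example above, the "; " author separator is a guess, and `parse_dataset_citation` is a hypothetical helper, not part of any Dataverse API.

```python
import re

# Sample value copied from the search response above.
CITATION = (
    'Durbin, Philip, 2017, "Open Source at Harvard", '
    'https://doi.org/10.7910/DVN/TJCLKP, Harvard Dataverse, V2, '
    'UNF:6:6wGE3C5ragT8A0qkpGaEaQ== [fileUNF]'
)

# Assumed layout (inferred from one example, not a documented format):
#   <authors>, <year>, "<title>", <persistent URL>, ...
CITATION_RE = re.compile(
    r'^(?P<authors>.+?), (?P<year>\d{4}), "(?P<title>.+?)", '
    r'(?P<url>https?://[^,\s]+)'
)


def parse_dataset_citation(citation):
    """Best-effort split of a dataset_citation string (hypothetical helper).

    Returns a dict of fields, or None if the string does not match the
    assumed layout.
    """
    match = CITATION_RE.match(citation)
    if match is None:
        return None
    fields = match.groupdict()
    # Assumption: multiple authors are separated by ";". Author names
    # themselves contain commas, so splitting on "," is not an option.
    fields["authors"] = [a.strip() for a in fields["authors"].split(";")]
    # Derive a "doi:..." style identifier only for doi.org URLs.
    prefix = "https://doi.org/"
    url = fields["url"]
    fields["global_id"] = (
        "doi:" + url[len(prefix):] if url.startswith(prefix) else None
    )
    return fields


if __name__ == "__main__":
    print(parse_dataset_citation(CITATION))
```

A regex like this can break on a title containing `", ` or on a citation without a four-digit year, which is the reason dedicated fields in the search response would be more reliable.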
