API for auditing physical files and file metadata #11016
pdurbin
left a comment
I took a quick pass through the docs and code. @stevenwinship please let me know what you think.
Auditing specific Datasets (comma separated list)::

  curl "$SERVER_URL/api/admin/datafiles/auditFiles?DatasetIdentifierList=doi.org/10.5072/FK2/JXYBJS,doi.org/10.7910/DVN/MPU019"
Do we use this pattern of passing in the URL form of a PID minus "https://" anywhere else? It seems ok. Can we pass in the normal PIDs (the non-URL form) instead?
9/21/22 Durbin:
Batch Exports Through the API
...
curl http://localhost:8080/api/admin/metadata/:persistentId/reExportDataset?persistentId=doi:10.5072/FK2/AAA000
No, it's different: "doi.org/10..." vs. "doi:10...".
In this PR we should use the pattern from reExportDataset.
updated the doc
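For readers comparing the two identifier styles discussed above, here is a minimal sketch of converting the URL-style form ("doi.org/10...") into the "doi:10..." form used by reExportDataset and then building the audit request URL. The helper names `to_pid` and `audit_url` are hypothetical, and the query parameter name `DatasetIdentifierList` is copied from the snippet under review; the name in the merged docs may differ.

```python
def to_pid(identifier):
    """Convert a URL-style DOI to the 'doi:' form; leave other input unchanged.

    Hypothetical helper for illustration, not part of the Dataverse code base.
    """
    s = identifier.strip()
    # Drop an optional scheme so "https://doi.org/..." and "doi.org/..." behave alike.
    for prefix in ("https://", "http://"):
        if s.startswith(prefix):
            s = s[len(prefix):]
    if s.startswith("doi.org/"):
        return "doi:" + s[len("doi.org/"):]
    return s


def audit_url(server_url, identifiers):
    """Build the auditFiles URL for a list of dataset identifiers.

    Assumption: the parameter is named DatasetIdentifierList, as in the
    reviewed snippet.
    """
    pids = ",".join(to_pid(i) for i in identifiers)
    return f"{server_url}/api/admin/datafiles/auditFiles?DatasetIdentifierList={pids}"


print(audit_url(
    "http://localhost:8080",
    ["doi.org/10.5072/FK2/JXYBJS", "doi.org/10.7910/DVN/MPU019"],
))
```

Identifiers already in the "doi:" form pass through unchanged, so the same helper can be applied to mixed input.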
"identifier": "DVN/MPU019",
"persistentURL": "https://doi.org/10.7910/DVN/MPU019",
"missingFiles": [
  "s3://dvn-cloud:298910, jihad_metadata_edited.csv"
Same. Easier parsing would be nice.
re-formatted the json output:
"missingFiles": [
  {
    "StorageIdentifier": "s3://dvn-cloud:298910",
    "label": "jihad_metadata_edited.csv"
  }
]
Great! Thanks. Do we need the directoryLabel too?
added directoryLabel
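To illustrate why the object form is easier to consume than the earlier comma-joined string, here is a rough sketch of reading the `missingFiles` entries. The sample document is assembled from the fragments quoted in this thread; the key name `directoryLabel` and its empty value are assumptions based on the comment above, and the full response shape may well differ.

```python
import json

# Sample assembled from fragments in this thread. "directoryLabel" and its
# empty value are assumptions; the real response may contain more fields.
SAMPLE = """
{
  "identifier": "DVN/MPU019",
  "persistentURL": "https://doi.org/10.7910/DVN/MPU019",
  "missingFiles": [
    {
      "StorageIdentifier": "s3://dvn-cloud:298910",
      "directoryLabel": "",
      "label": "jihad_metadata_edited.csv"
    }
  ]
}
"""


def missing_file_paths(dataset):
    """Return a list of (storage identifier, display path) tuples.

    Hypothetical helper: joins directoryLabel and label when a directory
    is present, otherwise uses the label alone.
    """
    rows = []
    for entry in dataset.get("missingFiles", []):
        directory = entry.get("directoryLabel") or ""
        label = entry.get("label", "")
        path = f"{directory}/{label}" if directory else label
        rows.append((entry.get("StorageIdentifier", ""), path))
    return rows


dataset = json.loads(SAMPLE)
for storage_id, path in missing_file_paths(dataset):
    print(f"{dataset['persistentURL']}: {storage_id} -> {path}")
```

With separate keys there is no need to split on ", ", which would break for file names that themselves contain a comma.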
Co-authored-by: Philip Durbin <philip_durbin@harvard.edu>
1da5daa to 2db26b2
Merging PR - Testing Passed
What this PR does / why we need it: Find Datasets with missing files so Admins can either delete the file reference or work with authors to re-upload the files.
See: IQSS/dataverse.harvard.edu#220
Which issue(s) this PR closes:
Special notes for your reviewer:
Suggestions on how to test this: Create multiple Datasets with multiple files. If running in Docker locally, delete a file from docker-dev-volumes/app/data/store...
Call the API and confirm that the missing file is listed in the JSON response.
Other tests could include deleting a FileMetadata row from the DB.
Request specific Datasets, and also test with firstId and lastId.
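As a rough aid for the test steps above, the following sketch builds request URLs for the variations mentioned (no filter, a firstId/lastId pair, and specific Datasets). The function name `build_audit_url` is hypothetical; `firstId`, `lastId`, and `DatasetIdentifierList` are the parameter names that appear in this PR, and their exact spelling in the merged API is an assumption.

```python
from urllib.parse import urlencode


def build_audit_url(server_url, first_id=None, last_id=None, identifiers=None):
    """Build an auditFiles URL from optional filters.

    Hypothetical helper; parameter names are taken from this PR's text
    and may not match the final API exactly.
    """
    query = {}
    if first_id is not None:
        query["firstId"] = first_id
    if last_id is not None:
        query["lastId"] = last_id
    if identifiers:
        query["DatasetIdentifierList"] = ",".join(identifiers)
    base = f"{server_url}/api/admin/datafiles/auditFiles"
    # Keep ':', '/' and ',' readable, as in the curl examples in the docs.
    return f"{base}?{urlencode(query, safe=':/,')}" if query else base


# One URL per test variation described above.
print(build_audit_url("http://localhost:8080"))
print(build_audit_url("http://localhost:8080", first_id=1, last_id=100))
print(build_audit_url("http://localhost:8080", identifiers=["doi:10.5072/FK2/JXYBJS"]))
```

Each URL can then be passed to curl, and the JSON response checked for the file that was deleted from storage.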
Does this PR introduce a user interface change? If mockups are available, please link/include them here: No
Is there a release notes update needed for this change?: Included
Additional documentation:
Preview docs at https://dataverse-guide--11016.org.readthedocs.build/en/11016/api/native-api.html#datafile-audit