Description
In a March 16th meeting with the Dataverse team, improvements to the external tools framework were discussed. Specifically, the team described an improvement that would provide signed URLs instead of passing a general Dataverse API token. When handing off from Dataverse to an external tool, Dataverse would:
- Dynamically generate signed URLs that grant limited-time access to specific API endpoints. The characteristics of these signed URLs will be defined in an updated external tools manifest/specification.
- Pass these signed URLs to the external tool via a POST request, in order to:
  - Avoid having any form of authorization token appear in the browser history.
  - Minimize potential exposure and malicious use of the signed URLs in places such as server logs. (Server log settings for the DPcreator should not expose POST data.)
These signed URLs would have the following characteristics:
(a) Limited in scope and linked to a particular user
- The signed URLs would be limited in scope. Examples include:
  - A URL to retrieve a Dataverse Datafile would be limited to a particular file. In addition, the user connected to the URL would be checked to make sure they still have permission for the operation.
  - Similarly, a URL to retrieve Schema.org JSON-LD dataset info would be limited to a specific dataset, and user permissions would be checked.
  - If it’s important to also support rich clients, the signed URL generated by Dataverse could use a registered MIME type. (@joshua-oss note)
(b) Limited in time
- As specified in a prior manifest or equivalent, each signed URL passed to the external tool would have to be used within a certain time window. Examples include:
  - Retrieving a file within 30 minutes, or
  - Depositing DP release files within 48 hours.
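As a rough illustration of the time-window idea, the sketch below keeps a per-operation window in a small table and checks a request time against it. The operation names (`retrieve_file`, `deposit_dp_release`) and the helper functions are hypothetical; the actual windows would come from the external tools manifest, whose format is not defined here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-operation windows, mirroring the two examples above.
# In practice these would be read from the external tool manifest.
TIME_WINDOWS = {
    "retrieve_file": timedelta(minutes=30),
    "deposit_dp_release": timedelta(hours=48),
}


def expires_at(operation: str, issued_at: datetime) -> datetime:
    """Return the moment at which a URL issued for `operation` stops being usable."""
    return issued_at + TIME_WINDOWS[operation]


def is_within_window(operation: str, issued_at: datetime, now: datetime) -> bool:
    """True if `now` falls between issuance and expiry (inclusive)."""
    return issued_at <= now <= expires_at(operation, issued_at)


issued = datetime(2021, 3, 16, 12, 0, tzinfo=timezone.utc)
print(is_within_window("retrieve_file", issued, issued + timedelta(minutes=10)))
print(is_within_window("retrieve_file", issued, issued + timedelta(minutes=45)))
```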
(c) Encoded with a Dataverse signature
- URL creation note: The URLs are always created in the context of an authenticated user session, and the signature is linked to that user. (@joshua-oss note)
- Each signed URL would contain a cryptographic signing token that allows Dataverse to confirm that the URL hasn't been changed. Additional tokens (signed as part of the URL) identify:
  - The time the signing was done, and
  - The user it was done for. Together these allow Dataverse to securely verify that the URL is being used within its proper (a) scope, meaning that the user connected to the signed URL still has permission to make the request specified by the URL, and (b) time window.
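One possible way to realize such a signature is an HMAC over the full URL, including the user and signing-time parameters. The sketch below is a minimal illustration under that assumption, not Dataverse's actual scheme: the parameter names (`user`, `signed_at`, `valid_for`, `signature`), the secret key, and the host name are all hypothetical. The permission check described in (a) is not shown and would still be performed by the caller after verification.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical server-side secret; never shared with the external tool.
SECRET_KEY = b"example-secret-known-only-to-the-server"


def sign_url(base_url, user, signed_at, valid_seconds, key=SECRET_KEY):
    """Append user, signing time, validity window, and an HMAC signature."""
    parts = urlsplit(base_url)
    params = parse_qsl(parts.query) + [
        ("user", user),
        ("signed_at", str(int(signed_at))),
        ("valid_for", str(int(valid_seconds))),
    ]
    unsigned = urlunsplit(parts._replace(query=urlencode(params)))
    signature = hmac.new(key, unsigned.encode("utf-8"), hashlib.sha256).hexdigest()
    return unsigned + "&signature=" + signature


def verify_url(signed_url, now, key=SECRET_KEY):
    """Return the user name if the URL is unmodified and unexpired, else None."""
    unsigned, sep, signature = signed_url.rpartition("&signature=")
    if not sep:
        return None  # no signature parameter present
    expected = hmac.new(key, unsigned.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return None
    params = dict(parse_qsl(urlsplit(unsigned).query))
    if now > int(params["signed_at"]) + int(params["valid_for"]):
        return None  # outside the time window
    return params["user"]


url = sign_url("https://dataverse.example.edu/api/access/datafile/42", "jdoe", 1000, 1800)
print(verify_url(url, now=1500))                        # signed user, still in window
print(verify_url(url.replace("/42", "/43"), now=1500))  # tampered file id is rejected
```

Because the file identifier, user, and signing time are all covered by the signature, changing any of them invalidates the URL, which is what ties the URL to a specific scope and time.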
(d) Subject to usage tracking
- Each time a signed URL is used, Dataverse would log usage information including:
  - Timestamp. Datetime, timezone.
  - Identifying information. Identifying information related to the request source (such as IP address).
  - Success. Whether the request was successful or not. An example of an unsuccessful request may be one where a retrieval failed due to a timeout or missing/corrupt/deleted data.
  - Validity. Whether the request was valid or invalid. Examples of invalid requests:
    - The signed URL's timeframe has expired;
    - The user for whom the signed URL was created no longer has permission for the operation;
    - etc.
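To make the distinction between validity and success concrete, here is a minimal sketch of what one usage log entry might look like. The `UsageRecord` fields, the reason strings, and the `build_usage_record` helper are hypothetical names chosen for illustration; the actual log format was not specified in the meeting.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class UsageRecord:
    timestamp: str                 # ISO 8601 datetime, including timezone offset
    source_ip: str                 # identifying information about the request source
    valid: bool                    # was the signed URL acceptable at all?
    invalid_reason: Optional[str]  # why it was rejected, if it was
    success: bool                  # did the operation itself complete?


def build_usage_record(source_ip, expired, user_has_permission,
                       operation_succeeded, now=None):
    """Classify one use of a signed URL and return a loggable record."""
    now = now or datetime.now(timezone.utc)
    if expired:
        reason = "expired"
    elif not user_has_permission:
        reason = "permission_revoked"
    else:
        reason = None
    valid = reason is None
    return UsageRecord(
        timestamp=now.isoformat(),
        source_ip=source_ip,
        valid=valid,
        invalid_reason=reason,
        # An invalid request is never attempted, so it cannot succeed.
        success=valid and operation_succeeded,
    )


record = build_usage_record("203.0.113.7", expired=False,
                            user_has_permission=True, operation_succeeded=False)
print(asdict(record))
```

In this sketch a request can be valid yet unsuccessful (for example, the file retrieval timed out), which is why the two flags are recorded separately.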
This was not specifically addressed in the meeting, but in addition to the signed URLs, the external tools framework should still define and pass along the data currently specified as “queryParameters” and described by the Reserved Words in the external tools documentation. Note: The exception is the “apiGeneralToken”, which would be replaced by the specialized URLs.