Skip to content

Comments

9231 Custom HTTP headers in OAI requests. #9310

Merged
kcondon merged 11 commits intodevelopfrom
9231-extra-headers-oai-requests
Jan 26, 2023
Merged

9231 Custom HTTP headers in OAI requests. #9310
kcondon merged 11 commits intodevelopfrom
9231-extra-headers-oai-requests

Conversation

@landreev
Copy link
Contributor

@landreev landreev commented Jan 23, 2023

What this PR does / why we need it:

This is a feature request from our Elseveir partners. In their case, the remote OAI server requires an extra auth. token, supplied in an http header with every OAI request (something along the lines of x-api-key: xxx-yyy-zzz) in order to serve some content not available to other clients. See the linked issue. So this pr adds the ability to add this header to the harvesting client configuration, both in the GUI and the api.

The changes are fairly straightforward. The only moderately tricky part was it required a PR into the gdcc.xoa library project as well (gdcc/xoai/pull/119; merged).

Which issue(s) this PR closes:

Closes #9231

Special notes for your reviewer:

Please note that the branch in its current state builds with the snapshot (-5.0.0-SNAPSHOT) versions of the XOAI library jars. These jars are automatically built on the xoai side from the 5.0 branch in its present state, i.e., with the changes that were necessary to accommodate this feature, that were made in gdcc/xoai/pull/119 that has been merged recently. While the PR should be fully ready for review and testing, it will be better to switch these jars to a stable minor release version in the pom file before the PR is merged. (The jars will have the exact same code in them, providing the exact same functionality, but in fixed version jars guaranteed to be available on mvn). I'll do this once that release - 5.0.0-RC3 - is tagged, this should happen very shortly on the gdcc.xoai side.

Regarding the Harvesting Clients UI: I initially assumed we would bury this extra header setting on the last page of the create/edit dialogue, under something like "advanced settings" - on account of how very few admins will have a use for this setting. I realized however that it needed to be right there on the first, "Step 1." page - because the remote server may require this token even to be willing to answer the basic "ListSets" and "ListMetadataFormats" calls, that must happen during the setting up of the client, in "Step 2". The field is clearly marked as "optional"; I don't expect it to confuse anyone too much.

Suggestions on how to test this:

All we need to test is that an admin can add some extra header to the configuration, and the Harvester will dutifully add it to every request sent for this client. We don't need to test any scenario where this token actually does something. So a straightforward way to test this would be to:

  • create a client harvesting some small set from another server we can easily reconfigure - for example, my "controlTestSet" on demo.dataverse.org;
  • add some arbitrary header to the configuration; say x-oai-api-key: 111-222-333, which is roughly what it'll look like in the Elseveir use case;
  • add the header above to the access log format on the oai server (demo) side; either on the Apache or Payara level. Confirm that the header is in fact there.

That, and some basic check that the PR isn't breaking anything obvious for existing clients that don't have this custom header feature configured.

ONE MORE THING: The functionality in the PR is ready to be tested. But a minor thing in the pom file still needs to be changed before we merge it - the application in this branch is being built with the dev. version of the XOAI libraries; I'm waiting for it to be tagged as the next point release - "5.0.0-RC3" on the XOAI side, and once that's done, I'll need to change the pom file accordingly. I'm hoping that this will happen very shortly, but please confirm with me before merging - thank you!

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

A minor change to the Harvesting Clients admin GUI, as described above:

Screen Shot 2023-01-23 at 11 49 05 AM

Is there a release notes update needed for this change?:

Additional documentation:

@landreev landreev changed the title 9231 extra headers oai requests 9231 Custom HTTP headers in OAI requests. Jan 23, 2023
@landreev
Copy link
Contributor Author

(I'll look into the Jenkins failure)

@landreev
Copy link
Contributor Author

Hmmm. It failed again, with the exact same symptoms, so it was not a fluke. To the OAI server (demo) the entire exchange looked like everything worked fine. So it's the import on the client side where it's bombing.
I'll figure it out.

… change in behavior in the latest gdcc.xoai - that I knew, but had forgotten about over the weekend. (#9231)
@landreev
Copy link
Contributor Author

Yay.

@scolapasta scolapasta self-assigned this Jan 24, 2023
Copy link
Contributor

@scolapasta scolapasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one small .pom question

<url>file://${project.basedir}/local_lib</url>
</repository>
<!-- Uncomment when using snapshot releases from Maven Central
<!-- Uncomment when using snapshot releases from Maven Central -->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you confirm that we want this uncommented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this needs to be un-commented out in order to have maven download and build with "snapshot releases", i.e. dev. builds of dependency jars. So I've enabled it temporarily in order to build with the dev. build of the XOAI libraries. As soon as this dev. version is tagged as the next point release, I will reverse it.

@coveralls
Copy link

coveralls commented Jan 24, 2023

Coverage Status

Coverage: 19.97% (-0.02%) from 19.987% when pulling a669aa9 on 9231-extra-headers-oai-requests into f1987ce on develop.

@scolapasta scolapasta removed their assignment Jan 24, 2023
@landreev
Copy link
Contributor Author

(spoke to Oliver this morning. it's not even going to be an -RC3, the xoai package appears to be ready for a "real" 5.0.0 release, and it's coming up!)

@scolapasta scolapasta added this to the 5.13 milestone Jan 26, 2023
@kcondon kcondon self-assigned this Jan 26, 2023
@kcondon
Copy link
Contributor

kcondon commented Jan 26, 2023

Issues found:
X 1. Loading the Dashboard page results in large amount of stack traces in server log saying, Internal Exception: org.postgresql.util.PSQLException: ERROR: column "customhttpheaders" does not exist, though no error appears in UI. Related: cannot create harvest client for the same reason, shows Internal Exception: org.postgresql.util.PSQLException: ERROR: column "customhttpheaders" does not exist when trying to move past first page in create client.
2. When starting a harvest client without an extra header, throws stack traces in server log:
[2023-01-26T19:38:28.378+0000] [Payara 5.2022.3] [WARNING] [AS-EJB-00056] [javax.enterprise.ejb.container] [tid: _ThreadID=211 _ThreadName=__ejb-thread-pool3] [timeMillis: 1674761908378] [levelValue: 900] [[
A system exception occurred during an invocation on EJB ImportServiceBean, method: public edu.harvard.iq.dataverse.Dataset edu.harvard.iq.dataverse.api.imports.ImportServiceBean.doImportHarvestedDataset(edu.harvard.iq.dataverse.engine.command.DataverseRequest,edu.harvard.iq.dataverse.harvest.client.HarvestingClient,java.lang.String,java.lang.String,java.io.File,java.util.Date,java.io.PrintWriter) throws edu.harvard.iq.dataverse.api.imports.ImportException,java.io.IOException]]
harvest seemed to work anyway.
[Kevin] Turns out this issue exists in release version and is how the current client handles errors during harvesting such as the dataset already exists, etc. Leonid has much of this stuff in clean up issue.

@landreev
Copy link
Contributor Author

Issues found:
... ERROR: column "customhttpheaders" does not exist ...

Yeah, the flyway script wasn't checked in. That would indeed do it.
☹️

@kcondon kcondon merged commit 1c2877a into develop Jan 26, 2023
@kcondon kcondon deleted the 9231-extra-headers-oai-requests branch January 26, 2023 20:18
@landreev
Copy link
Contributor Author

OK, I will make a separate PR for the xoai dependency, once the official 5.0.0 is tagged.

@mreekie
Copy link

mreekie commented Jan 26, 2023

sprint kickoff

  • include in QATail for Jan 11, 2023

@landreev
Copy link
Contributor Author

What @kcondon mentions under 2. in "Issues Found" - too much noise/unreadable error messages in the log file when the system is failing to import some records, it is an existing behavior/existing problem, yes. Nevertheless, it is a real problem, we definitely need to provide better diagnostics/more helpful information to the admin there. And that would be one of the things to do under the umbrella of #9327 that I opened yesterday and that I would like to be addressed soon.

@mreekie mreekie added the Size: SprintTail Issues that have made it into Ready for QA or further right label Jan 26, 2023
landreev added a commit that referenced this pull request Jan 30, 2023
Resolved merge conflict with #9310 in OAIServlet (#9309)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Size: SprintTail Issues that have made it into Ready for QA or further right

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature request: Add the ability to configure a Harvesting Client to add a custom header to OAI calls

5 participants