Add HTTP Last-Modified header for TEI resources#396
Conversation
modules/pages.xqm
Outdated
There was a problem hiding this comment.
As per rfc2616 clients SHOULD set If-Modified-Since to a previously received Last-Modified. So using ge (greater or equal) instead of gr (greater) here seems desirable. Otherwise most requests will still get a 200 instead of a 304.
Address @agh2342's comment: > As per rfc2616 clients SHOULD set If-Modified-Since to a previously received Last-Modified. So using ge (greater or equal) instead of gr (greater) here seems desirable. Otherwise most requests will still get a 200 instead of a 304.
|
For some info on how Last-Modified was handled on an earlier incarnation of HSG, see https://markmail.org/message/j5dk2mlho6khxfpv. |
|
|
||
| let $last-modified := | ||
| if (exists($publication-id) and exists($document-id)) then | ||
| pages:last-modified($publication-id, $document-id, $section-id) |
There was a problem hiding this comment.
| pages:last-modified($publication-id, $document-id, $section-id) | |
| pages:last-modified($publication-id, $document-id, $section-id) => format-dateTime("[FNn,*-3], [D01] [MNn,*-3] [Y0001] [H01]:[m01]:[s01] [Z0000]", "en", (), ()) => parse-ietf-date() |
xmldb:last-modified() has millisecond resolution.
HTTP Last-Modified and If-Modified-Since headers have second resolution.
We need to make sure we mangle this the same way for outgoing and incoming HTTP headers and xmldb:last-modified() calls so comparisons work as expected.
There was a problem hiding this comment.
@agh2342 I've applied the change and believe it correctly truncates milliseconds. To confirm the approach:
xquery version "3.1";
let $date := current-dateTime()
return
(
$date,
$date
=> format-dateTime("[Y0001]-[M01]-[D01]T[H01]:[m01]:[s01][Z]")
=> xs:dateTime()
)This returns:
(
xs:dateTime("2022-02-03T12:30:10.395-05:00"),
xs:dateTime("2022-02-03T12:30:10-05:00")
)I deployed the change locally, and the calculation appears to be correct.
|
In a new commit, I added functions to look up and generate the HTTP "Created" header, so that both the created date and last-modified dates are accurate. I did this because in practice the Created header that was automatically generated before was often later than the last-modified date. For example: No logic on our end depends on the Created date, but it was unintuitive, so I thought it was worth applying the same approach we used for generating headers to this piece of information. Now the created date is accurate - so it will always be the same as or earlier than the last-modified date: |
|
@agh2342 Thanks for your reviews and feedback! |
Extend #396 to previous & next navigation buttons on TEI resources
Extend #396 to previous & next navigation buttons on TEI resources
As discussed with @windauer and @agh2342, accurate dependable HTTP Last-Modified headers could aid in our efforts to improve efficiency and performance of our servers. This PR is an initial step toward that goal.
It adds the Last-Modified header to resources on history.state.gov that are generated from TEI sources. Specifically, it defines new
*-last-modifiedentries inconfig.xqm's$config:PUBLICATIONSmap, which thepages:loadtemplating function retrieves when constructing the TEI data needed on a page. (See Update below.) This function then sets a request attribute,hsgshell.last-modified(modeled onhsgshell.error), containing the last-modified date. Then, afterview.xqlfinishes executingtemplates:apply, it checks for the presence of this attribute and, if found, passes the value toapp:set-last-modified, which formats the date according to RFC 1123 (see w3c/qtspecs#36) and sends the formatted date toresponse:set-header("Last-Modified").Besides TEI resources, this PR adds a Last-Modified header for FRUS volumes that we don't have TEI for yet, by using the date of the resource in
/db/apps/frus/bibliography.Other than this one non-TEI case, the PR doesn't handle Last-Modified headers for any other non-TEI resource. These could be added in future PRs.
The functions that look up last modified dates do so purely by looking up the last-modified date of TEI documents using the
xmldb:last-modifiedfunction. It doesn't consider the last-modified date of the primary or any other imported HTML templates or XQuery modules, which could carry significant changes (i.e., site-wide menus, section sidebars, etc.). More nuanced handling could be added in a future PR.The PR only handles direct requests to resources, not the ajax requests with prev/next buttons (i.e.,
frus-ajax.xql).However, since FRUS TEI pages are some of the most taxing pages on the site (and AFAIK crawlers tend to make direct requests for resources rather than activating prev/next buttons), this PR should help us tell whether the addition of dependable Last-Modified headers is worthwhile.
In my local testing, the
Last-Modifiedheader is correctly set and updated as resources are stored in the database, and there are no changes to the functionality of non-TEI pages.Update: With 59008ca, the code now checks for an "If-Modified-Since" header in
pages:loadand, if a 304 is appropriate, it immediately halts further processing of the template—and instead returns a 304.