This repository was archived by the owner on Apr 20, 2023. It is now read-only.

Fix revalidation of gzipped JSON responses#86

Closed
victorges wants to merge 3 commits into
gregjones:masterfrom
victorges:fix/gzipped-jsons-revalidation

@victorges

While investigating a potential incompatibility between a server using https://github.com/NYTimes/gziphandler and a client using this middleware (or, more generally, any server that responds with a Content-Encoding) that led to increased latency in our infrastructure, I came across odd behavior in the very specific scenario described in the title. I am not yet sure this was the cause of our issue, but I think the fix is worth making anyway.

The scenario is:

  • A server returns a JSON response with Content-Encoding: gzip and an ETag.
  • If the underlying transport is http.DefaultTransport, the client (and the httpcache wrapper) receives that response already decompressed.
  • When decompressing, http.DefaultTransport also removes any Content-Length set on the response, which leaves a response with neither Content-Length nor Transfer-Encoding. That is legal HTTP, so it is not inherently a bug (a server could also send a response without either header directly).
  • This response without Content-Length or Transfer-Encoding is read in full the first time and added to the cache.
  • When httputil.DumpResponse and http.ReadResponse are used in sequence, though, the returned response has a different kind of Body that does not return io.EOF together with the last byte of the response, but only on one more call to Read, which reads 0 bytes. Not inherently a bug either.
  • Finally, a json.Decoder reads from the io.Reader only until the first JSON object is complete and then stops. The last call to Read, the one that would return 0, io.EOF, therefore never happens, since all the bytes (which form a valid JSON object) have already been read. Not inherently a bug either (probably the desired behavior).
  • As a result, the cachingReadCloser never calls the OnEOF function for a response that was fetched from the cache and revalidated with the server, which can lead to continuous revalidation of a response that has, for example, a max-age combined with an ETag.
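
The difference in where io.EOF shows up can be reproduced in isolation. The following is only a minimal sketch: the raw response strings are hand-written stand-ins for what a DumpResponse/ReadResponse round trip would produce, and the helper name firstReadHitsEOF is hypothetical. The exact behavior depends on net/http internals and may vary between Go versions.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// firstReadHitsEOF (hypothetical helper) parses raw as an HTTP response and
// reports whether the first Read on its Body returns io.EOF together with
// the data, rather than on a later call.
func firstReadHitsEOF(raw string) (bool, error) {
	resp, err := http.ReadResponse(bufio.NewReader(strings.NewReader(raw)), nil)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	buf := make([]byte, 64) // larger than the 7-byte body used below
	_, err = resp.Body.Read(buf)
	if err != nil && err != io.EOF {
		return false, err
	}
	return err == io.EOF, nil
}

func main() {
	const body = `{"a":1}`
	// Body length is known up front from Content-Length.
	withLength := "HTTP/1.1 200 OK\r\nContent-Length: 7\r\n\r\n" + body
	// Neither Content-Length nor Transfer-Encoding: the body ends with the stream.
	withoutLength := "HTTP/1.1 200 OK\r\nConnection: close\r\n\r\n" + body

	for _, raw := range []string{withLength, withoutLength} {
		eof, err := firstReadHitsEOF(raw)
		fmt.Printf("first Read returned io.EOF: %v (err: %v)\n", eof, err)
	}
}
```

On recent Go releases the length-delimited body is expected to report io.EOF along with its last bytes, while the close-delimited one reports it only on a further Read.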

TL;DR it's a combination of:

  • A revalidated (via either ETag or Last-Modified) cached response;
  • An original gzip Content-Encoding with http.DefaultTransport doing automatic decompression (or, apparently, any response with neither Content-Length nor Transfer-Encoding);
  • A JSON response that is read with a json.Decoder (or anything else that does not read all the way to io.EOF).
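
The json.Decoder part of this combination can be sketched on its own. Here eofTrackingReader and decodeSeesEOF are hypothetical names, with the tracking reader standing in for httpcache's cachingReadCloser, and strings.Reader playing the role of a body that reports io.EOF only on a separate, final Read.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// eofTrackingReader (hypothetical) records whether the wrapped reader has
// ever reported io.EOF to its caller.
type eofTrackingReader struct {
	r      io.Reader
	sawEOF bool
}

func (t *eofTrackingReader) Read(p []byte) (int, error) {
	n, err := t.r.Read(p)
	if err == io.EOF {
		t.sawEOF = true
	}
	return n, err
}

// decodeSeesEOF decodes one JSON object from body and reports whether the
// underlying reader signalled io.EOF while the decoder was reading.
func decodeSeesEOF(body string) (bool, error) {
	// strings.Reader returns (n, nil) for the last bytes and (0, io.EOF)
	// only on the following Read.
	tr := &eofTrackingReader{r: strings.NewReader(body)}
	var v map[string]interface{}
	if err := json.NewDecoder(tr).Decode(&v); err != nil {
		return false, err
	}
	return tr.sawEOF, nil
}

func main() {
	seen, err := decodeSeesEOF(`{"a":1}`)
	fmt.Printf("reader saw io.EOF: %v (err: %v)\n", seen, err)
}
```

Because the decoder stops as soon as the closing brace of the object is scanned, any hook that fires only on io.EOF is not triggered unless the caller keeps reading.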

This is a fairly specific scenario, but the fix to the cache wrapper code turned out to be pretty neat, so the small extra complexity seems worth it to handle this case.
The test still feels a bit complex, but I don't think that is a problem. Let me know if you have any ideas to improve it!

@victorges victorges force-pushed the fix/gzipped-jsons-revalidation branch from e07a699 to ca88c01 Compare August 22, 2018 18:09
@victorges victorges force-pushed the fix/gzipped-jsons-revalidation branch from ca88c01 to ee10211 Compare August 22, 2018 18:12
On go:1.6 the error happens even on the original
response from the server so the response is never
added to the cache!
Owner

@gregjones gregjones left a comment

Thanks for the thorough explanation! And the solution does seem nice - just a couple of things to double-check, please?

Comment thread httpcache_test.go
resetTest()
{
- req, err := http.NewRequest("GET", s.server.URL, nil)
+ req, err := http.NewRequest("GET", s.server.URL+"/method", nil)
Owner

Why is this part changing?

Comment thread httpcache.go
return n, err
}

var dummyBuf = make([]byte, 1)
Owner

Can you help me be confident this is safe to be shared?
