Skip to content

Deduplicating parellel requests #1

@ngehrsitz

Description

@ngehrsitz

Hi,

I have the use case that an application is using multiple goroutines to make parallel requests that all try to access the same resource on a very slow webserver. With the current implementation of the cache all goroutines make their requests at the same time, thus the cache is still empty and each request will go through to the server. The server then takes several hundred milliseconds to respond and each of the responses then overwrites the value in the cache. This initial surge of duplicate requests is not ideal, but for most use cases the next round of requests would hit the cache and thus not face the same issue.
But the slow webserver I am accessing is serving realtime measurements from my battery storage system, thus caching these values is only useful for very short durations.
I see two options to fix this issue:

  1. Rewrite the application so that the different goroutines do not make the same requests in parallel. Unfortunately this is not trivial since I am not the maintainer and the current logic would need to be severely rewritten to prevent these duplicate requests
  2. Extend this cache so that it uses a https://pkg.go.dev/golang.org/x/sync/singleflight#Group when making requests so that parallel requests for the same resources are only fetched once.

I think that adding deduplication of parallel requests to the cache is probably the better idea since this will benefit all users of the cache and rewriting the logic to prevent these duplicate requests won´t be possible in every use case.
When looking at the code I am not sure where to best put this logic because we need to pass through cacheable and cacheKey from

httpcache/httpcache.go

Lines 767 to 769 in d376920

func (t *Transport) RoundTrip(req *http.Request) (resp *http.Response, err error) {
cacheKey := cacheKeyWithHeaders(req, t.CacheKeyHeaders)
cacheable := (req.Method == methodGET || req.Method == methodHEAD) && req.Header.Get("range") == ""

to
func performRequest(transport http.RoundTripper, req *http.Request, onlyIfCached bool) (*http.Response, error) {

which then also needs to be a method of t *Transport. Do you have a better idea of how to implement this?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions