Check block cache across multiple rainbow instances

# Problem

At inbrowser.dev (backed by rainbow from ipfs.io gateway, so a general problem in our infra), we see inconsistent page load times across regions, and sometimes across requests within the same region. 

A user can get an instant response from one instance, and then on a subsequent page load or request hit a stalled page load and a timeout, even though the data exists in the cache of one of the other rainbows in the global cluster. We also see inconsistency across subresources on a single page.

## Scope

- Rainbow users running multiple instances should have a means of "logically merging their block caches".
- This should be an opt-in feature that requires manual configuration by the rainbow operator.
- (Open question) Do we want to run a bitswap server in rainbow, or an HTTP client to avoid "the unsustainable manual peering trap"?
- We don't want to invent any new protocols. Use the HTTP stack if possible.

# Solutions


## A: Add HTTP Retrieval Client to Rainbow, leverage `Cache-Control: only-if-cached`

We know we need an HTTP retrieval client for Kubo to enable [HTTP Gateway over Libp2p](https://github.com/ipfs/kubo/blob/master/docs/experimental-features.md#http-gateway-over-libp2p) by default, and to make direct HTTP retrieval from service providers more feasible. We can't do that without a client and end-to-end tests. Prototyping one in Rainbow sounds like a good plan, as it improves multiple work streams at the same time.

The idea here is to introduce an HTTP client which runs in addition to, or in parallel with, bitswap retrieval.
Keep it simple, don't mix abstractions, and do opportunistic block retrieval like bitswap, but over HTTP.

Using `application/vnd.ipld.raw` and the [trustless gateway protocol](https://specs.ipfs.tech/http-gateways/trustless-gateway/) is a good match here: it allows us to benefit from HTTP caching and middleware, making it more flexible than bitswap.

Rainbow could:
- Have a list of other rainbow instances in the form of URLs with trustless gateway endpoints.
  - In the case of the ipfs.io gateway, we could produce a list with shuffled same-region instances first, and the rest of the instances after them.
- Make inexpensive block requests with [`Cache-Control: only-if-cached`](https://specs.ipfs.tech/http-gateways/path-gateway/#only-if-cached), going over the list in sequence.
  - This does not cost any expensive IO: if a rainbow does not have the block locally, it will instantly respond with HTTP 412.
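A minimal sketch of the two steps above, written in Python for brevity (Rainbow itself is Go). The function names, the region map, and the 2s timeout are all hypothetical; the URL shape and headers follow the trustless gateway spec. It assumes only the same-region part of the list is shuffled, and it leaves out verification of the returned bytes against the CID, which a real client must do.

```python
import random
import urllib.error
import urllib.request
from typing import Callable, Optional

# Hypothetical probe signature: (gateway_base_url, cid) -> block bytes, or None on a miss.
Probe = Callable[[str, str], Optional[bytes]]


def order_peers(peers_by_region: dict[str, list[str]], local_region: str,
                rng: Optional[random.Random] = None) -> list[str]:
    """Shuffled same-region instances first, then the rest in their listed order."""
    rng = rng or random.Random()
    local = list(peers_by_region.get(local_region, []))
    rng.shuffle(local)
    remote = [url for region, urls in peers_by_region.items()
              if region != local_region for url in urls]
    return local + remote


def probe_only_if_cached(base_url: str, cid: str, timeout: float = 2.0) -> Optional[bytes]:
    """Ask one trustless gateway for a raw block without triggering p2p retrieval."""
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/ipfs/{cid}?format=raw",
        headers={
            "Accept": "application/vnd.ipld.raw",
            "Cache-Control": "only-if-cached",
        },
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read()  # NOTE: caller must still verify bytes against the CID
    except urllib.error.HTTPError as err:
        if err.code == 412:  # the instance does not have the block locally
            return None
        raise


def find_cached_block(cid: str, peers: list[str],
                      probe: Probe = probe_only_if_cached) -> Optional[bytes]:
    """Go over the peer list in sequence and return the first cached copy found."""
    for base_url in peers:
        try:
            block = probe(base_url, cid)
        except OSError:
            continue  # unreachable or failing peer: skip it and try the next one
        if block is not None:
            return block
    return None
```

Passing `probe` as a parameter keeps the sequential walk testable without a network, and probing one peer at a time means a single incoming request never fans out into more than one outstanding HTTP check.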

This way, once a block lands in any of our rainbow caches, we will discover it, and requests won't time out after 1m in unlucky scenarios.

Open questions:
- Is a sequential, inexpensive HTTP check enough to avoid amplification attacks?
- Is it OK to start at the same time as bitswap, or do we want to delay and act as a fallback when we are unable to find the block by regular means for some time (>10-30s)?
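To make the second question concrete, here is a hedged sketch of the "delayed fallback" option using `asyncio`. The names `primary` and `fallback` are hypothetical stand-ins for bitswap retrieval and the HTTP cache check; setting `delay` to `0` gives the "start at the same time" variant.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical retrieval signature: cid -> block bytes, or None if not found.
Fetch = Callable[[str], Awaitable[Optional[bytes]]]


async def fetch_with_delayed_fallback(cid: str, primary: Fetch, fallback: Fetch,
                                      delay: float) -> Optional[bytes]:
    """Start `primary` now, start `fallback` after `delay` seconds, first hit wins."""

    async def delayed_fallback() -> Optional[bytes]:
        await asyncio.sleep(delay)
        return await fallback(cid)

    pending = {
        asyncio.ensure_future(primary(cid)),
        asyncio.ensure_future(delayed_fallback()),
    }
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                # A failed or empty-handed path does not end the race.
                if task.exception() is None and task.result() is not None:
                    return task.result()
        return None  # both paths finished without finding the block
    finally:
        for task in pending:
            task.cancel()  # stop whichever path is still running
```

If the primary path finds the block before `delay` elapses, the fallback is cancelled while it is still sleeping, so no HTTP checks are sent at all.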

## B: Set up reverse proxy (nginx, lb) to try rainbows with `Cache-Control: only-if-cached` first

Writing this down just to have something other than (A), I don't personally believe (B) is feasible.

The idea here is to update the way our infrastructure proxies gateway requests to rainbow instances: first ask all upstream instances within the region for the resource with `Cache-Control: only-if-cached`, and if none of them has it, retry with a normal request that will trigger p2p retrieval.

The downside here is that this _feels_ like an antipattern:
- Overrides any user-provided `Cache-Control`.
- Creates cache hot spots: popular data is not distributed across rainbow instances, but always served by a specific instance which fetched it first.
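The proxy flow and its first downside can be sketched as follows. This is illustrative Python, not nginx configuration; `send` is a hypothetical callback standing in for the proxy's upstream request, and retrying against the first upstream is an assumption (a real load balancer would apply its normal selection policy).

```python
from typing import Callable, Mapping

# Hypothetical upstream call: (upstream, path, headers) -> (status, body).
Send = Callable[[str, str, Mapping[str, str]], tuple[int, bytes]]


def proxy_request(path: str, headers: Mapping[str, str],
                  upstreams: list[str], send: Send) -> tuple[int, bytes]:
    """Probe every upstream with only-if-cached, then retry as a normal request."""
    # The probe has to replace whatever Cache-Control the user sent,
    # which is exactly the "overrides user-provided Cache-Control" downside.
    probe_headers = {k: v for k, v in headers.items()
                     if k.lower() != "cache-control"}
    probe_headers["Cache-Control"] = "only-if-cached"

    for upstream in upstreams:
        status, body = send(upstream, path, probe_headers)
        if 200 <= status < 300:
            return status, body  # served from that instance's local cache

    # Nobody had it cached: forward the original request unchanged so that
    # one instance performs the p2p retrieval.
    return send(upstreams[0], path, dict(headers))
```

Because the first instance to fetch a resource answers every later probe for it, the loop also shows how the hot spots described above would form.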

## C: Reuse Bitswap client and server we already have

Right now, Rainbow runs Bitswap in read-only mode. It always says it does not have data when asked over bitswap.

What we could do is implement a permissioned version of [peering](https://github.com/ipfs/kubo/blob/master/docs/config.md#peering):
- Preconnect over libp2p to a safelisted set of peers and protect these peering connections from being closed
  - If Rainbow does not announce peer records to DHT, we should require full `/ip|dns*/.../p2p/peerid`, otherwise we  
- **(for now)** allow serving data over bitswap to a safelisted set of `/p2p/` multiaddrs (quick and easy), and leverage existing peering config / libraries where possible (https://github.com/ipfs/rainbow/pull/35)
- (allows us to do more in the future) switch to HTTP retrieval (over libp2p or `/http`)
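As a rough illustration of the safelist check, the sketch below extracts the peer ID from the trailing `/p2p/<peerid>` component of each configured multiaddr and answers whether a remote peer may be served. The class and function names are hypothetical, the peer IDs in the usage are placeholders, and the parsing is plain string handling rather than a real multiaddr library.

```python
def peer_id_from_multiaddr(addr: str) -> str:
    """Return the peer ID from a multiaddr that ends in /p2p/<peerid>."""
    parts = addr.rstrip("/").split("/")
    if len(parts) < 3 or parts[-2] != "p2p" or not parts[-1]:
        raise ValueError(f"multiaddr does not end in /p2p/<peerid>: {addr!r}")
    return parts[-1]


class BitswapSafelist:
    """Hypothetical gate: serve blocks over bitswap only to safelisted peers."""

    def __init__(self, multiaddrs: list[str]) -> None:
        self._peer_ids = {peer_id_from_multiaddr(a) for a in multiaddrs}

    def may_serve(self, remote_peer_id: str) -> bool:
        return remote_peer_id in self._peer_ids


# Usage with placeholder peer IDs (not real ones):
safelist = BitswapSafelist([
    "/p2p/PeerIdPlaceholderA",
    "/dns4/rainbow-2.example.net/tcp/4001/p2p/PeerIdPlaceholderB",
])
print(safelist.may_serve("PeerIdPlaceholderB"))
print(safelist.may_serve("PeerIdPlaceholderC"))
```

Any peer outside the set would keep getting the current read-only behavior, where Rainbow says it does not have the data.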

## D: ?

Ideas welcome.
