Block Triggers hash write-up

One of The Graph's coolest features is that queries are deterministic: given a
`Qm` subgraph hash, indexing it should **always** give the same result. This is
possible because we inherit the blockchain's determinism property. However,
there is a big loophole that can break this feature: **the chain provider**.

Currently the main (or only) type of connection we offer to indexers
(in The Graph Network) is **JSON-RPC**. To use it, they can either run a
node themselves or use a third-party service like Alchemy. Either way, the
provider can be faulty and give incorrect results for a number of different
reasons.

To be a little more specific, let's say there are indexers/nodes `A` and `B`.
Both are indexing subgraph `Z`. Indexer `A` is using Alchemy and `B` is using
Infura.

Given block `14_722_714` with a given hash, both providers will very
likely agree on these two values (block number and hash). However, other
fields such as `gas_used` or `total_difficulty` could be incorrect. Ideally
they would always be correct, since serving chain data is a provider's main
job, but what I'm describing is the exact issue we faced when testing
indexing Ethereum mainnet with the Firehose.
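
To make the scenario concrete, here is a minimal sketch of comparing two provider responses for the same block, field by field. It is an illustration only, not graph-node code: the `diff_block_fields` helper and the block values (including the `0xabc` hash and the numbers) are made up, and only the field names `gas_used` and `total_difficulty` come from the text above.

```python
from typing import Any, Dict, Tuple

Block = Dict[str, Any]


def diff_block_fields(a: Block, b: Block) -> Dict[str, Tuple[Any, Any]]:
    """Return every field whose value differs between two provider responses.

    A field present in only one response is reported with None on the other side.
    """
    fields = sorted(set(a) | set(b))
    return {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}


# Hypothetical responses for the same block from the providers of indexers A and B.
# The values are invented for illustration; they are not real chain data.
block_from_a = {
    "number": 14_722_714,
    "hash": "0xabc",
    "gas_used": 100,
    "total_difficulty": 5000,
}
block_from_b = {
    "number": 14_722_714,
    "hash": "0xabc",
    "gas_used": 100,
    "total_difficulty": 4999,
}

# Number and hash agree, yet one of the other fields does not.
mismatches = diff_block_fields(block_from_a, block_from_b)
```

In this made-up case the two responses agree on number and hash, so a check that looks only at those two values would treat them as identical even though `total_difficulty` differs.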

These field/value differences between providers are fed directly into the
subgraph mappings, which are the current input of the POI (Proof of Indexing)
algorithm and the base of The Graph's determinism property. Not taking the
possible faultiness of the chain providers into account can break determinism
altogether.

The biggest problem today is that, to spot these POI differences, we have
to index subgraphs that **use those values in their mappings**. If the
Firehose shootout we ran in the **integration cluster** had happened to
include no subgraphs using these values, **we wouldn't have spotted any POI
differences**, which is a **very severe issue**.

POI differences described in the Firehose shootout for reference:
https://gist.github.com/evaporei/660e57d95e6140ca877f338426cea200.

So in summary, the problems described above are:

- Currently we consider the **chain provider** a **source of truth**,
 which can only be questioned because of re-orgs;
- We don't have a good way to compare provider input (that could spot POI
 differences) without the indirection of a subgraph mapping.
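
As a rough illustration of the second problem, one way to compare provider input directly is to hash a canonical serialization of the block data each indexer received and compare the digests. This is only a sketch under assumptions: the `block_digest` helper, the choice of SHA-256 over sorted-key JSON, and the sample blocks are hypothetical and do not describe how graph-node or the POI algorithm works.

```python
import hashlib
import json
from typing import Any, Dict


def block_digest(block: Dict[str, Any]) -> str:
    """Hash a block's fields in a canonical form (hypothetical scheme).

    Keys are sorted and whitespace is removed so that two responses with the
    same content always serialize to the same bytes, regardless of key order.
    """
    canonical = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Invented sample data: same number and hash, different total_difficulty.
seen_by_a = {"number": 14_722_714, "hash": "0xabc", "total_difficulty": 5000}
seen_by_b = {"hash": "0xabc", "number": 14_722_714, "total_difficulty": 4999}

# Any field difference changes the digest, with no subgraph mapping involved.
providers_agree = block_digest(seen_by_a) == block_digest(seen_by_b)
```

Because the digest covers every field, a mismatch shows up whether or not any indexed subgraph happens to read the affected value in its mappings.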

Block Triggers hash write-up #3554