Rehash retrieval proxies / multiple transports / inconsistent source attack protection (aka "smart retrievals") #2241
Conversation
This is still a draft because some tests still need to be fixed, and I want to improve some other parts.
node/src/actors/rad_manager/mod.rs
Outdated
-pub struct RadManager;
+pub struct RadManager {
+    /// Addresses of proxies to be used as extra transports when performing data retrieval.
+    proxies: Vec<String>,
Proxy addresses are only validated when executing a data request, so if the user inputs an invalid proxy they have no way to detect it until their node can resolve a data request.
So it would be nice to have either a way to validate the proxies on startup, or a JSON-RPC command to test the execution of a data request using proxies. If the old "try data request" command uses these proxies we can refer to it in the documentation as the expected way to ensure that the node is configured correctly.
So it would be nice to have either a way to validate the proxies on startup
Let me give further thought to how to approach this properly.
or a JSON-RPC command to test the execution of a data request using proxies. If the old "try data request" command uses these proxies we can refer to it in the documentation as the expected way to ensure that the node is configured correctly.
I think I prefer a more automated way 🤔
In any case, try_data_request is not currently using the proxies (as it implements its own retrieval logic that is not unified with that of RadManager). I guess it would be nice to have it use the proxies though, for consistency with the actual witnessing behavior of the node and, as you pointed out, as a fun way to test the proxy functionality.
@tmpolaczyk which of these do you think is the right approach to performing validation of proxies upon startup?
- Simply validating the addresses
- Checking that we can connect to a socket at those addresses
- Somehow handshaking with the proxy (different for each supported proxy protocol)
- Running try_data_request and expecting a success

Especially in this last case, I think I would give the option of disabling the validation through configuration, so that a downtime of the data sources doesn't prevent Witnet nodes from (re)starting.
Validating the addresses should be done to detect typos. Checking that we can connect or handshaking may not always work, so I prefer a more end-to-end test.
When running try_data_request from RadManager it is really hard to guess which proxy failed: the logs never mention the proxy URL. So we need to run try_data_request once per proxy, and add a log that clearly says if a proxy failed (and stop the node in case of failure?). Creating a test data request to google.com should work. I guess we can simply add a config variable like witnessing.http_test_url = "https://google.com", and allow setting it to false to disable that check.
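For illustration, here is one way the discussed settings could be modeled; the struct and field names (`WitnessingConfig`, `proxies`, `http_test_url`) follow this conversation rather than the PR's actual configuration schema, and disabling the check is modeled here as simply omitting the test URL:

```rust
// Hypothetical sketch of the configuration discussed above; not the PR's actual schema.
use serde::Deserialize;

#[derive(Debug, Default, Deserialize)]
#[serde(default)]
pub struct WitnessingConfig {
    /// Addresses of proxies to be used as extra transports when retrieving data sources.
    pub proxies: Vec<String>,
    /// URL retrieved once per proxy at startup as an end-to-end test.
    /// Leaving it unset disables the check (instead of the `false` value suggested above).
    pub http_test_url: Option<String>,
}
```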
Basic URL validation functionality is now in place:
$ RUST_LOG=witnet_rad=debug cargo run --release -- -c ~/.witnet/config/witnet.toml wallet server
Loading config from: witnet.toml
Setting log level to: DEBUG, source: Env
[2022-08-11T17:49:25Z INFO witnet::cli] Sentry telemetry enabled
[2022-08-11T17:49:25Z DEBUG witnet_data_structures] Set environment to mainnet
[2022-08-11T17:49:25Z WARN witnet_wallet] The local Witnet node JSON-RPC server is configured to listen at 0.0.0.0:21338 but the wallet will connect to 127.0.0.1:21338
[2022-08-11T17:49:26Z INFO witnet_config::config] The default unproxied HTTP transport for retrieval is enabled.
[2022-08-11T17:49:26Z INFO witnet_config::config] Configuring retrieval proxies: ["socks5://127.0.0.1", "whatever://192.111.135.17:18302", "laleche"]
[2022-08-11T17:49:26Z INFO witnet_config::config] Paranoid retrieval percentage is set to 51%
Error: The following transport addresses are invalid:
- socks5://127.0.0.1 (the address is missing a port number)
- whatever://192.111.135.17:18302 ("whatever" is not a supported type of transport)
- laleche (relative URL without a base)
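For reference, a minimal sketch of the kind of check behind those messages, assuming the `url` crate; the function and constant names are illustrative rather than the PR's actual code, and the error strings are modeled on the output above:

```rust
// Illustrative sketch of basic proxy address validation using the `url` crate.
use url::Url;

const SUPPORTED_SCHEMES: &[&str] = &["http", "https", "socks4", "socks5"];

fn validate_proxy_address(address: &str) -> Result<(), String> {
    // "laleche" fails here with "relative URL without a base".
    let url = Url::parse(address).map_err(|e| format!("{} ({})", address, e))?;

    // "whatever://192.111.135.17:18302" fails here: unsupported transport.
    if !SUPPORTED_SCHEMES.contains(&url.scheme()) {
        return Err(format!(
            "{} (\"{}\" is not a supported type of transport)",
            address,
            url.scheme()
        ));
    }

    // "socks5://127.0.0.1" fails here: no explicit port and no known default for socks5.
    if url.port_or_known_default().is_none() {
        return Err(format!("{} (the address is missing a port number)", address));
    }

    Ok(())
}
```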
Thanks @tmpolaczyk for all your suggestions. I'll try to address them all today.
log::warn!("Refraining from committing to data request {} because the sources are apparently inconsistent", dr_pointer);
return Err(());
I believe that in this case it would be better for network safety to commit the RadError::InconsistentSource than to refrain from committing.
In #2050 we discuss possible attacks that arise from data requests that can only be solved by a small fraction of all the nodes. I believe the same arguments apply to the inconsistent source attack, because nodes that enable this inconsistent source attack protection will not participate in those requests.
Imagine an attacker that creates a data request that returns a bad value for some nodes in order to penalize them, the classical inconsistent source attack. If some nodes refrain from participating in that request, then the attacker will control a higher percentage of the quorum, leading to the attacker gaining more reputation. This is true until a majority of nodes enable inconsistent source attack protection; at that point the request will be resolved with a RadError::InsufficientCommits.
On the other hand, if the eligible nodes commit an error, then all the nodes will participate in solving this request, so it will be equally hard for the attacker to create an eligibility proof for this request as for any other request. If the majority of nodes have enabled inconsistent source attack protection, it is possible that the data request is solved with a radon error as a result, and in that case nobody gains reputation. And in the case where only a minority have enabled inconsistent source attack protection, there will be a few error commits in the data request, so the attack should still work, albeit with a slightly lower profit.
Hmmm... let me give that a thought. This is not a trivial point because what you are suggesting actually requires a protocol change to assign an identifier for InconsistentSource in RadonErrors.
Of course I had already created such an identifier, but this was a non-consensus-breaking change only because we never put that error into a commitment.
True, in that case we could commit a RadError::Unknown, and in a future pull request add the logic to enable a new RadonError using TAPI.
I have solved it like this:
.map_err(
    // The `InconsistentSource` error is mapped here for the sake of backwards
    // compatibility. Namely, to enable paranoid retrieval without having to
    // immediately introduce a breaking change that may jeopardize oracle
    // queries. The point of making the mapping here is to only affect actual
    // witnessing and committing, but not the `try_data_request` function, which
    // can rather use the `InconsistentSource` error for clarity.
    // TODO: pursue a WIP that introduces `InconsistentSource` as a proper
    // RadonError at the protocol level.
    |err| match err {
        RadError::InconsistentSource => RadError::Unknown,
        other => other,
    },
)
Interesting. As nodes are already refraining from committing in such cases... am I right to assume that we can change this part and start committing RadonError::Unknown in this "everything went wrong" case without breaking consensus?
In that case, we would start to see reveals with RadonError::Unknown, which currently are rare if not non-existing.
The thing is that the lack of commits caused by the code block you are pointing out is probably not even having an impact on oracle queries as of today, because those missing commits should be allowing other valid commits to take their place, under the assumption that there are always more eligible identities than necessary.
If that's the case, and we start to default to committing an error, that would most of the time lead to worse tallies than not doing so, because miners can't tell errors from values at the commitment stage.
Am I right to assume that we can change this part and start committing RadonError::Unknown in this "everything went wrong" case without breaking consensus?
Should be safe, because nodes can always commit an error. So the result would be the same as a sudden increase in http errors or timeout errors.
If that's the case, and we start to default to committing an error, that would most of the time lead to worse tallies than not doing so, because miners can't tell errors from values at the commitment stage.
True, then I guess the right approach would be to keep the old behavior for now, and investigate the consequences of changing it in the future. Would the ability to distinguish values from errors in the commit stage be useful? Or would it introduce even more attack vectors?
Would the ability to distinguish values from errors in the commit stage be useful? Or would it introduce even more attack vectors?
Idk, leaking that info at the commitment stage makes it too easy in theory for lazy witnesses not to perform the retrieval and simply copy the error from other commitments once they have seen rf * consensus_requirement of those (because at that point they will know in advance that the only value they can commit without being slashed must be an error).
On the other hand, that would allow skipping the reveal phase 🤔
I was thinking about adding a flag to the commit like is_error: bool; this way lazy witnesses can only guess the error. And I believe that if your error is different from the consensus error you miss the reward, or something like that, not sure because that logic is complex. That would allow miners to choose non-error commits, but it would also allow miners to choose error commits to sabotage a data request.
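For illustration only, a hypothetical sketch of that idea; the struct and field names are made up and this is not the actual witnet-rust commitment transaction:

```rust
// Hypothetical sketch of the `is_error` flag idea; all names are made up.
type Hash = [u8; 32];

pub struct CommitmentSketch {
    /// The data request this commitment belongs to.
    pub dr_pointer: Hash,
    /// Hash committing to the value that will later be revealed.
    pub commitment: Hash,
    /// Publicly visible flag stating that the committed value is a RADON error,
    /// without revealing which error until the reveal stage.
    pub is_error: bool,
}
```

Miners (and lazy witnesses) would then only see the flag, not the concrete error, which is what makes the guessing argument above possible.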
I just found a very funny issue: this PR breaks RNG requests because the nodes commit a "ModeTie" error instead of a random value.
When trying to use a public socks5 proxy, I get this error when trying a request with https sources (I didn't test http):
witnet.toml:
I tried other proxies and I got either the same error or a "retrieve timeout" error, so I'm not sure whether I had bad luck choosing the proxies or it doesn't work as expected for some reason.
I tried a few proxy addresses as well and found out that most of those that you can find out there are garbage, regardless of whether they are SOCKS or HTTP. The only one that worked like a charm was the SOCKS5 interface on port 9150 from my own Tor Browser. This added around 1-2 s of latency, but still all the retrievals were successful (except one API that gave me a 403 because they're behind a WAF or something).
That seems to be true. I finally got it working by running a local socks server.
tmpolaczyk left a comment
Some comments for the WitnetHttpClient.
node/tests/actors/rad_manager.rs
Outdated
// Drop RadManager explicitly and swallow the panic that doing so currently causes.
std::panic::catch_unwind(|| {
    std::mem::drop(manager);
})
.ok();
Ideally this shouldn't be needed. I believe there is a bug in the stop_system_if_panicking function, which panics if the actor is being dropped in a panicking thread while no actix system is running. When testing other actors they always run inside an actix system, so that's not a problem, but the RadManager is more standalone. So I believe that running inside an actix system should be optional, to avoid doing this workaround in other code that doesn't use actix. I'll see if it can be fixed.
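For what it's worth, a sketch of the guard that could make the workaround unnecessary, assuming actix-rt 2.x (where `System::try_current()` returns `None` when no system is running); the function name and signature here come from this discussion, not from the repository:

```rust
// Sketch only: a drop helper that tolerates being called outside an actix system.
pub fn stop_system_if_panicking(actor_name: &str) {
    if std::thread::panicking() {
        if let Some(system) = actix::System::try_current() {
            log::error!("Panicking while running {}, shutting the actix system down", actor_name);
            system.stop();
        } else {
            // No actix system (e.g. a plain unit test): just log instead of panicking again.
            log::error!("Panicking while running {} outside of an actix system", actor_name);
        }
    }
}
```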
So this hack can be removed now?
Proxy URLs are validated at startup, but there is no connection attempt to ensure that they work. There is no easy way to manually check that proxies work:
So unless I missed something there is no way to check whether the configured proxies work; it would be nice to have something like a test-request CLI command. In the node, when one proxy fails with a timeout error, this is the only log:
It would be nice if the error message could mention the URL of the proxy that failed.
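For illustration, a rough sketch of such a per-proxy check; it uses `reqwest` (blocking, with the `socks` feature) purely as a stand-in for the node's own HTTP client, and `http_test_url` refers to the `witnessing.http_test_url` setting suggested earlier:

```rust
// Illustrative per-proxy health check: one test retrieval through each configured
// proxy, logging the proxy URL on failure. Not the PR's actual retrieval code.
use std::time::Duration;

fn check_proxies(proxies: &[String], http_test_url: &str) -> Result<(), Vec<String>> {
    let mut failing = Vec::new();

    for address in proxies {
        let result = reqwest::blocking::Client::builder()
            .proxy(reqwest::Proxy::all(address.as_str()).expect("addresses validated at startup"))
            .timeout(Duration::from_secs(10))
            .build()
            .and_then(|client| client.get(http_test_url).send());

        match result {
            Ok(response) => log::info!("Proxy {} works ({})", address, response.status()),
            Err(error) => {
                // The proxy URL is part of the message, as requested above.
                log::warn!("Proxy {} failed the test retrieval: {}", address, error);
                failing.push(address.clone());
            }
        }
    }

    if failing.is_empty() {
        Ok(())
    } else {
        Err(failing)
    }
}
```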
In my mind I had implemented full validation (by querying some well-known URL). But it looks like that happened only in my mind.
Yes, it could be a separate PR. I suggested it here in case it makes us rethink the configuration parameters or something, but the basic structure of the witnessing params looks solid enough. Then, if I haven't missed any breaking changes, this is ready to merge.
…ansport as suggested by @tmpolaczyk
Made possible through reexporting `surf` as `witnet_net::surf`. As suggested by @tmpolaczyk
Also:
- Fix bug about detecting liars
- Add tests for `evaluate_paranoid_retrieval`
- Improve comments and documentation
- Fix usage of tally logic instead of aggregation logic when acting paranoid

As suggested by @tmpolaczyk. Thanks!
…mitting as suggested by @tmpolaczyk. TODO: pursue a WIP that introduces `InconsistentSource` as a proper RadonError at the protocol level.
This only validates that they are well-formed URLs and that they use a supported protocol (http, https, socks4 or socks5).
So as to prevent parsing and validating them every time they get used. As suggested by @tmpolaczyk
This is a rehash of #1523
Enables using proxies (HTTP, HTTPS, SOCKS4 and SOCKS5) when performing data retrieval. This allows retrieving data sources through different transports, so as to ensure that the data sources are consistent and that we take as small a risk as possible when committing to specially crafted data requests that may be ill-intended.
Just add this to the `[connections]` section of `witnet.toml` to proxy your retrievals through 3 different open proxies:
These addresses have been taken from this public list of proxies.
Close #1274
Close #1511