fully exhaust query map #144

Merged
thewhaleking merged 4 commits into staging from feat/thewhaleking/fully-exhaust-query-map
Jun 30, 2025
Conversation

@thewhaleking
Collaborator

Adds an argument to the async substrate query map that fully exhausts the query map results, rather than fetching them sequentially, one page at a time.

In bigger tests, such as SubtensorModule.OwnedHotkeys, this gives a 5-6x speedup.
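
To illustrate why firing all page requests at once can beat fetching them one after another, here is a minimal sketch. It is not the code from this PR: `fetch_batch`, `chunked`, `fetch_sequential`, and `fetch_concurrent` are hypothetical names, and the simulated latency stands in for a real RPC round trip.

```python
import asyncio


async def fetch_batch(batch_keys: list[str]) -> dict[str, int]:
    """Hypothetical stand-in for one storage RPC round trip (not the library's API)."""
    await asyncio.sleep(0.01)  # simulated network latency
    return {key: len(key) for key in batch_keys}


def chunked(keys: list[str], size: int) -> list[list[str]]:
    """Split keys into pages of at most `size` entries."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]


async def fetch_sequential(keys: list[str], page_size: int) -> dict[str, int]:
    """Fetch one page at a time; total latency grows with the number of pages."""
    results: dict[str, int] = {}
    for batch in chunked(keys, page_size):
        results.update(await fetch_batch(batch))
    return results


async def fetch_concurrent(keys: list[str], page_size: int) -> dict[str, int]:
    """Issue every page request at once and wait for all of them together."""
    pages = await asyncio.gather(
        *(fetch_batch(batch) for batch in chunked(keys, page_size))
    )
    results: dict[str, int] = {}
    for page in pages:
        results.update(page)
    return results
```

Both functions return the same mapping; the concurrent version simply overlaps the waiting time of the individual requests.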

@thewhaleking thewhaleking requested a review from a team June 27, 2025 16:19
@thewhaleking thewhaleking added the enhancement New feature or request label Jun 27, 2025
Contributor

@basfroman basfroman left a comment

LGTM with a few questions

Comment on lines +3517 to +3521
self.rpc_request(
    method="state_queryStorageAt",
    params=[batch_keys, block_hash],
    runtime=runtime,
)
Contributor

What happens if one of the self.rpc_request calls raises an error? For example, because of a brief connection problem, or a broken data structure?
Wouldn't you want to use return_exceptions=True for the gather and then filter the results, or use try/except with logging in case of an error?
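
For reference, the alternative raised in this question could look roughly like the sketch below. It is an illustration only: `flaky_request` and `gather_partial` are hypothetical names, and the failing batch simulates a dropped connection rather than reproducing any real RPC behavior.

```python
import asyncio


async def flaky_request(batch_id: int) -> list[int]:
    """Hypothetical stand-in for an RPC call; batch 2 simulates a dropped connection."""
    await asyncio.sleep(0)
    if batch_id == 2:
        raise ConnectionError(f"batch {batch_id} failed")
    return [batch_id * 10, batch_id * 10 + 1]


async def gather_partial(
    batch_ids: list[int],
) -> tuple[list[list[int]], list[BaseException]]:
    """Collect every outcome, then split successes from failures."""
    outcomes = await asyncio.gather(
        *(flaky_request(batch_id) for batch_id in batch_ids),
        return_exceptions=True,  # exceptions are returned as values instead of raised
    )
    successes = [o for o in outcomes if not isinstance(o, BaseException)]
    failures = [o for o in outcomes if isinstance(o, BaseException)]
    return successes, failures
```

With this pattern the caller receives partial data plus a list of errors, and has to decide what to do with the gap.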

Collaborator Author

No, I'd rather not. The problem with returning exceptions is that we would have one of three options at that point:

  1. Log the exception, and return partial data
  2. Return no data, and re-raise the exception
  3. Retry the failed calls

By not returning the exception, we're implicitly choosing option 2. Option 1 would be fine if we trusted people to correctly examine the logs before moving on. However, most of this output will be consumed without anyone watching it first, so we cannot rely on someone to say "okay, this is incomplete data, what must I do with it now?" In addition, this only occurs when using fully_exhaust=True, which is stated to be for when you want all (read: not partial) data. Therefore, I think this is the best option.

Option 3 doesn't make sense, because those calls are already retried if the exception is due to a websocket timeout, and if not, it can be caught by something else (such as RetryAsyncSubstrate).
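
The all-or-nothing behavior described as option 2 falls out of `asyncio.gather` with its default `return_exceptions=False`. A minimal sketch, assuming a hypothetical `request` coroutine in place of the real RPC call (`fetch_all_or_raise` and the `failing` parameter are also illustrative names):

```python
import asyncio


async def request(batch_id: int, fail: bool = False) -> list[int]:
    """Hypothetical stand-in for one RPC call; `fail` simulates an error."""
    await asyncio.sleep(0)
    if fail:
        raise ConnectionError(f"batch {batch_id} failed")
    return [batch_id]


async def fetch_all_or_raise(
    batch_ids: list[int], failing: frozenset[int] = frozenset()
) -> list[int]:
    """Return the complete result set, or propagate the first exception raised."""
    # Default return_exceptions=False: the first failure is raised to the caller,
    # so no partial result is ever returned.
    pages = await asyncio.gather(
        *(request(batch_id, batch_id in failing) for batch_id in batch_ids)
    )
    return [item for page in pages for item in page]
```

A caller that wants retries can wrap `fetch_all_or_raise` in its own retry logic, which keeps the partial-data case out of the return value entirely.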

Contributor

In other words, we only consider the case where ALL results are successful or we get an error. Right?

@basfroman basfroman added run-bittensor-sdk-tests Runs Bittensor SDK tests. run-bittensor-cli-tests Runs BTCLI tests. labels Jun 27, 2025
Base automatically changed from feat/thewhaleking/python-ss58-conversion to staging June 30, 2025 14:13
@thewhaleking thewhaleking merged commit 35728ab into staging Jun 30, 2025
@thewhaleking thewhaleking deleted the feat/thewhaleking/fully-exhaust-query-map branch June 30, 2025 14:14