[BIT-594] Decrease validator moving average window #971

Merged
opentaco merged 3 commits into nobunaga from feature/BIT-594/decrease_validator_moving_average_window
Nov 9, 2022
Conversation

@opentaco opentaco commented Nov 2, 2022

BIT-594 Decrease validator moving average window

Decrease validator moving average window from 20 (alpha=0.05) to 10 (alpha=0.1) steps. This parameter could probably eventually be set to alpha=0.2.

The current 20-step window means that a server model change will take 20 steps * ~250 blocks/epoch * 12 sec = approx. 17 hours to reach full score in the validator neuron stats, because of the moving average slowly weighing in new model performance. 17 hours is probably too long, and it is also likely affecting registration immunity.
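The arithmetic above can be checked with a small sketch. It assumes the validator uses a standard exponential moving average, `new = (1 - alpha) * old + alpha * observation`, with a window of roughly `1 / alpha` steps; the function names here are hypothetical and not taken from the validator code. Under that assumption, one full window absorbs roughly two thirds of a new score rather than all of it, so the time to a truly "full" score would be longer than one window.

```python
def ema_update(previous: float, observation: float, alpha: float) -> float:
    """One exponential-moving-average step (assumed update rule)."""
    return (1.0 - alpha) * previous + alpha * observation


def fraction_reached(alpha: float, steps: int) -> float:
    """Fraction of a new constant score absorbed after `steps` updates, starting from 0."""
    score = 0.0
    for _ in range(steps):
        score = ema_update(score, 1.0, alpha)
    return score


def window_hours(steps: int, blocks_per_epoch: int = 250,
                 block_time_sec: float = 12.0) -> float:
    """Wall-clock hours for `steps` epochs, using the figures quoted in this PR."""
    return steps * blocks_per_epoch * block_time_sec / 3600.0


# Compare the current, proposed, and possible future settings.
for alpha in (0.05, 0.1, 0.2):
    steps = round(1 / alpha)
    print(f"alpha={alpha}: window={steps} steps, "
          f"{window_hours(steps):.1f} h, "
          f"fraction absorbed={fraction_reached(alpha, steps):.2f}")
```

With these assumptions the 20-step window corresponds to about 16.7 hours and the 10-step window to about 8.3 hours.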

@opentaco opentaco changed the title Decrease validator moving average window [BIT-594] Decrease validator moving average window Nov 2, 2022
@Eugene-hu Eugene-hu left a comment

I agree, currently it takes too long

@opentaco opentaco requested a review from unconst November 3, 2022 15:24
@isabella618033 isabella618033 left a comment

LGTM

Question tho, have we also considered changing blocks/epoch? Do we still need that to be as high as 250 * 12 sec = 50 mins?

If we push it to topk = 100, that means we need 4096 / 100 (topk) * 15 sec = 10 mins to sample through all 4096 peers. And 250 blocks/epoch means that we can sample through the network 5 times in an epoch.
And 20 steps means that we can sample the network 5*20 = 100 times before setting weight. (If we push topk from 20 -> 100)
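The sampling estimate in this comment can be reproduced with a rough sketch. It takes the quoted figures at face value (4096 peers, 15 sec per sampling step, 250 blocks/epoch at 12 sec/block, 20 epochs per moving-average window) and assumes each step queries `topk` distinct peers; the function names are hypothetical.

```python
def network_pass_minutes(n_peers: int = 4096, topk: int = 100,
                         step_sec: float = 15.0) -> float:
    """Minutes to query every peer once, assuming `topk` distinct peers per step."""
    return n_peers / topk * step_sec / 60.0


def epoch_minutes(blocks_per_epoch: int = 250, block_time_sec: float = 12.0) -> float:
    """Epoch length in minutes."""
    return blocks_per_epoch * block_time_sec / 60.0


def passes_per_epoch(topk: int = 100) -> float:
    """How many full passes over the network fit into one epoch."""
    return epoch_minutes() / network_pass_minutes(topk=topk)


# Compare the current topk=20 with the suggested topk=100.
for topk in (20, 100):
    per_epoch = passes_per_epoch(topk)
    print(f"topk={topk}: one pass = {network_pass_minutes(topk=topk):.1f} min, "
          f"{per_epoch:.2f} passes/epoch, "
          f"{per_epoch * 20:.0f} passes per 20-epoch window")
```

Under these assumptions, topk = 100 gives a pass of about 10 minutes and close to 5 passes per epoch, matching the rounded numbers above, while topk = 20 gives roughly one pass per epoch.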

opentaco commented Nov 7, 2022

> LGTM
>
> Question tho, have we also considered changing blocks/epoch? Do we still need that to be as high as 250 * 12 sec = 50 mins?
>
> If we push it to topk = 100, that means we need 4096 / 100 (topk) * 15 sec = 10 mins to sample through all 4096 peers. And 250 blocks/epoch means that we can sample through the network 5 times in an epoch. And 20 steps means that we can sample the network 5*20 = 100 times before setting weight. (If we push topk from 20 -> 100)

Increasing topk and decreasing blocks/epoch should come later to speed up network cycling time. It will likely increase server load, which requires more server memory/resources, so we'd need to phase in a larger topk gradually.

unconst commented Nov 7, 2022

Is this ready to merge?

@opentaco opentaco merged commit 58ae9d6 into nobunaga Nov 9, 2022
@opentaco opentaco deleted the feature/BIT-594/decrease_validator_moving_average_window branch November 9, 2022 14:11