[BIT-594] Decrease validator moving average window by opentaco · Pull Request #971 · opentensor/bittensor

opentaco · 2022-11-02T10:06:02Z

BIT-594 Decrease validator moving average window

Decrease validator moving average window from 20 (alpha=0.05) to 10 (alpha=0.1) steps. This parameter could probably eventually be set to alpha=0.2.

The current 20-step window means that a server model change takes about 20 steps * ~250 blocks/epoch * 12 sec/block ≈ 17 hours to be fully reflected in the validator neuron stats, because the moving average weighs in new model performance slowly. 17 hours is probably too long, and it likely also affects registration immunity.
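The arithmetic above can be checked with a small sketch. It assumes the validator score is a plain exponential moving average with window = 1/alpha and one update per epoch; the function and constant names are hypothetical and are not taken from the bittensor codebase.

```python
# Sketch of an exponential moving average (EMA) score update.
# Assumption: one EMA update per epoch, window = 1 / alpha.
# All names are illustrative, not bittensor identifiers.

SECONDS_PER_BLOCK = 12   # from the PR description
BLOCKS_PER_EPOCH = 250   # approximate, from the PR description


def ema_update(prev: float, new: float, alpha: float) -> float:
    """Blend a new observation into the running score."""
    return (1.0 - alpha) * prev + alpha * new


def fraction_absorbed(alpha: float, steps: int) -> float:
    """Share of a step change in the input reflected after `steps` updates."""
    return 1.0 - (1.0 - alpha) ** steps


def hours_for_steps(steps: int) -> float:
    """Wall-clock hours needed for `steps` epochs."""
    return steps * BLOCKS_PER_EPOCH * SECONDS_PER_BLOCK / 3600.0


if __name__ == "__main__":
    for alpha in (0.05, 0.1, 0.2):
        window = round(1.0 / alpha)
        print(
            f"alpha={alpha}: window={window} steps, "
            f"{hours_for_steps(window):.1f} h per window, "
            f"{fraction_absorbed(alpha, window):.0%} of a change absorbed"
        )
```

Under these assumptions, one window is about 16.7 hours at alpha=0.05 and about 8.3 hours at alpha=0.1. In a plain EMA, only roughly 64-65% of a change is weighed in after one window, so reaching the full score takes longer than the window itself.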


Eugene-hu

I agree, currently it takes too long

isabella618033

LGTM

Question though, have we also considered changing blocks/epoch? Do we still need that to be as high as 250 * 12 sec = 50 mins?

If we push it to topk = 100, that means we need 4096 / 100 (topk) * 15 sec = 10 mins to sample through all 4096 peers. And 250 blocks/epoch means that we can sample through the network 5 times in an epoch.
And 20 steps means that we can sample the network 5*20 = 100 times before setting weights. (If we push topk from 20 -> 100)
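The sampling estimate above can be reproduced with a short sketch. It assumes each validator step queries `topk` distinct peers and takes 15 seconds, as in the comment; the names are hypothetical and are not bittensor identifiers.

```python
# Back-of-the-envelope network sampling estimate.
# Assumption: each step queries `topk` distinct peers and takes 15 s.
# All names are illustrative.

N_PEERS = 4096           # network size used in the comment
SECONDS_PER_STEP = 15    # per-query time used in the comment
SECONDS_PER_BLOCK = 12
BLOCKS_PER_EPOCH = 250


def sweep_minutes(topk: int) -> float:
    """Minutes needed to query every peer once."""
    return N_PEERS / topk * SECONDS_PER_STEP / 60.0


def sweeps_per_epoch(topk: int) -> float:
    """Number of full network sweeps that fit in one epoch."""
    epoch_minutes = BLOCKS_PER_EPOCH * SECONDS_PER_BLOCK / 60.0
    return epoch_minutes / sweep_minutes(topk)


def sweeps_per_window(topk: int, window_steps: int) -> float:
    """Full sweeps completed over a moving average window of `window_steps` epochs."""
    return sweeps_per_epoch(topk) * window_steps


if __name__ == "__main__":
    for topk in (20, 100):
        print(
            f"topk={topk}: {sweep_minutes(topk):.1f} min per sweep, "
            f"{sweeps_per_epoch(topk):.2f} sweeps per epoch, "
            f"{sweeps_per_window(topk, 20):.0f} sweeps per 20-step window"
        )
```

With topk = 100 this gives roughly 10 minutes per sweep and just under 5 sweeps per epoch, consistent with the rounded figures above. With topk = 20, a sweep takes about 51 minutes, or roughly one per epoch.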

opentaco · 2022-11-07T13:58:40Z

> LGTM
>
> Question though, have we also considered changing blocks/epoch? Do we still need that to be as high as 250 * 12 sec = 50 mins?
>
> If we push it to topk = 100, that means we need 4096 / 100 (topk) * 15 sec = 10 mins to sample through all 4096 peers. And 250 blocks/epoch means that we can sample through the network 5 times in an epoch. And 20 steps means that we can sample the network 5*20 = 100 times before setting weights. (If we push topk from 20 -> 100)

Increasing topk and decreasing blocks/epoch should come later to speed up network cycling time. It will likely increase server load, though, which requires more server memory/resources, so we'd need to phase in a larger topk gradually.

unconst · 2022-11-07T19:59:21Z

Is this ready to merge?


opentaco changed the title ~~Decrease validator moving average window~~ [BIT-594] Decrease validator moving average window Nov 2, 2022

opentaco requested review from Eugene-hu and isabella618033 November 2, 2022 10:07

Merge branch 'nobunaga' into feature/BIT-594/decrease_validator_moving_average_window

da406b2

Eugene-hu approved these changes Nov 2, 2022


opentaco requested a review from unconst November 3, 2022 15:24

isabella618033 approved these changes Nov 3, 2022


Merge branch 'nobunaga' into feature/BIT-594/decrease_validator_moving_average_window

ba34f1c

unconst approved these changes Nov 9, 2022


opentaco merged commit 58ae9d6 into nobunaga Nov 9, 2022

opentaco deleted the feature/BIT-594/decrease_validator_moving_average_window branch November 9, 2022 14:11

opentaco merged 3 commits into nobunaga from feature/BIT-594/decrease_validator_moving_average_window
