Skip to content

Fix for the moving average scores#35

Merged
Eugene-hu merged 2 commits intomainfrom
rewards_scatter_fix
Jun 9, 2023
Merged

Fix for the moving average scores#35
Eugene-hu merged 2 commits intomainfrom
rewards_scatter_fix

Conversation

@Eugene-hu
Copy link
Contributor

  • Fixes the scatter in the validator to reward only those that queried
  • Adds in additional normalization for followup and answer rewards

self.moving_averaged_scores.scatter(0, answer_uids, answer_rewards)
)
rewards = scattered_followup_rewards + scattered_answer_rewards
rewards = (scattered_followup_rewards + scattered_answer_rewards)/2
Copy link

@adriansmares adriansmares Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're dividing by 2 since the UIDs outside of followup_uids and answer_uids will be added twice, but you're also halving the score for the UIDs in followup_uids and answer_uids.

Why not scatter the answer rewards over the scattered follow up rewards ? followup_uids and answer_uids are disjoint, and then you count the scores of the UIDs which were not queried only once, without changing other scores.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this, given that we make followup_uids and answer_uids mutually exclusive.

    # Compute forward pass rewards, assumes followup_uids and answer_uids are mutually exclusive.
    rewards = self.moving_averaged_scores.scatter(0, followup_uids, followup_rewards)
    rewards = rewards.scatter(0, answer_uids, answer_rewards)

@opentaco opentaco changed the title For for the moving average scores Fix for the moving average scores Jun 9, 2023
Copy link
Contributor

@isabella618033 isabella618033 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Eugene-hu Eugene-hu merged commit 6add5d9 into main Jun 9, 2023
@steffencruz steffencruz deleted the rewards_scatter_fix branch June 22, 2023 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants