Length time reward models #100

Closed
Eugene-hu wants to merge 15 commits into staging from length_time_reward_models

@Eugene-hu Eugene-hu commented Jul 20, 2023

  • Adds two reward models, one based on response length and one based on response time. Both rewards are normalized per event type, with each event type following its own normalization distribution.
  • These two reward models let us directly reward the speed of miners on the network and put emphasis on the fidelity of responses.
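
For readers of this thread, the per-event-type normalization described above could look roughly like the sketch below. It is not the code in this PR: the event-type names, the statistics in `STATS`, and the choice of a normal CDF as the normalization distribution are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NormStats:
    """Mean and standard deviation used to normalize one raw measurement."""
    mean: float
    std: float


# Hypothetical statistics: each event type gets its own distribution.
STATS = {
    "followup": {"length": NormStats(100.0, 50.0), "time": NormStats(5.0, 2.0)},
    "answer": {"length": NormStats(200.0, 100.0), "time": NormStats(10.0, 4.0)},
}


def normal_cdf(x: float, stats: NormStats) -> float:
    """Map a raw value to (0, 1) using a normal CDF (an assumed choice)."""
    z = (x - stats.mean) / stats.std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def length_reward(n_tokens: int, event_type: str) -> float:
    """Longer responses score higher, relative to the event type's norm."""
    return normal_cdf(float(n_tokens), STATS[event_type]["length"])


def time_reward(elapsed_s: float, event_type: str) -> float:
    """Faster responses score higher, relative to the event type's norm."""
    return 1.0 - normal_cdf(elapsed_s, STATS[event_type]["time"])
```

With these assumed numbers, a 100-token response sits exactly at the "followup" mean but below the "answer" mean, so the same length earns a different reward depending on the event type.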

@mrseeker

To explain why this should not be used:

This will heavily skew results in favour of sending a lot of garbage in a short amount of time. It would be better to use a "cut-off" (minimum tokens required) combined with a time-out, which together control the minimum token speed.
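
A minimal sketch of the cut-off alternative suggested here, assuming hypothetical threshold values and function names (none of them come from the PR): length and time act only as a pass/fail gate, so they never add reward on their own.

```python
def passes_cutoff(n_tokens: int, elapsed_s: float,
                  min_tokens: int = 32, timeout_s: float = 10.0) -> bool:
    """Gate on a minimum token count and a time-out (both values are assumed)."""
    return n_tokens >= min_tokens and elapsed_s <= timeout_s


def gated_reward(quality: float, n_tokens: int, elapsed_s: float,
                 min_tokens: int = 32, timeout_s: float = 10.0) -> float:
    """Return the quality score unchanged if the gate passes, otherwise zero."""
    if passes_cutoff(n_tokens, elapsed_s, min_tokens, timeout_s):
        return quality
    return 0.0


def implied_min_token_speed(min_tokens: int = 32, timeout_s: float = 10.0) -> float:
    """Lowest tokens-per-second rate that can still pass both thresholds."""
    return min_tokens / timeout_s
```

Under this scheme a long, fast, low-quality response gains nothing beyond its quality score, which addresses the skew described above.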

prompt (str): The prompt.
responses (List[bt.DendriteCall]): The list of dendrite calls.
name (str): The name.
test (bool): A boolean indicating whether or not this is a test run. Default is False.

Unused?

@Eugene-hu

> This will heavily skew results in favour of sending a lot of garbage in a short amount of time. It would be better to use a "cut-off" (minimum tokens required) combined with a time-out, which together control the minimum token speed.

I agree with the concern, and we are adding additional safeguards to limit these effects. Most of the reward will still be based on the reward models; however, the length and time rewards will give us a way to accurately tune how much emphasis to put on the fidelity of responses. We will continue to monitor the system after launch to ensure that the models are working as expected.
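
One way to picture "most of the reward will still be based on the reward models" is a weighted blend in which the length and time terms have small, tunable weights. The sketch below is only an illustration under that assumption: the weight values, the 0.5 cap, and the function name are hypothetical and do not describe the safeguards actually added.

```python
def combined_reward(quality: float, length_r: float, time_r: float,
                    w_length: float = 0.05, w_time: float = 0.05) -> float:
    """Blend rewards so the quality term keeps the majority of the weight.

    All inputs are assumed to be normalized to [0, 1].
    """
    if w_length < 0.0 or w_time < 0.0:
        raise ValueError("weights must be non-negative")
    if w_length + w_time >= 0.5:
        # Assumed safeguard: length and time together stay below half the total.
        raise ValueError("length and time weights must sum to less than 0.5")
    w_quality = 1.0 - w_length - w_time
    return w_quality * quality + w_length * length_r + w_time * time_r
```

With the default weights, a response with zero quality can earn at most 0.1 from length and time alone, which bounds how much a fast, long, low-quality response can gain.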

@Eugene-hu Eugene-hu marked this pull request as ready for review July 24, 2023 22:24
@Eugene-hu Eugene-hu changed the base branch from main to staging July 24, 2023 22:41
5 participants