Refactor CB trainers in reagent to integrate Offline Eval #694
Closed
alexnikulkov wants to merge 2 commits into facebookresearch:main
This pull request was exported from Phabricator. Differential Revision: D41239491
Differential Revision: D41226450
fbshipit-source-id: 4d19b8c113eea031a598ec515a477e247507100c
Summary:
1. Instead of inheriting CB trainers from `ReAgentLightningModule` (we weren't using any custom methods or attributes from this class), I created a separate `BaseCBTrainerWithEval` base class for all CB reagent trainers.
2. `BaseCBTrainerWithEval` integrates Offline Eval into the training process. By default, the behavior is the same as before the refactor. Once `.attach_eval_module()` has been called, every batch is processed by the eval module before it is used for training. The processing includes keeping track of the reward and filtering the training batch.

Differential Revision: D41239491
fbshipit-source-id: f5c2bf64a9584e1f6a59d14bb5e07a40089ac93d
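For readers who want a concrete picture of the control flow described in the summary, here is a minimal plain-Python sketch. It is not the code from this PR: only the names `BaseCBTrainerWithEval` and `attach_eval_module` come from the summary. The `Batch` layout, `ToyEvalModule`, `ingest_and_filter`, `cb_training_step`, `CountingTrainer`, and the filtering rule (keep rows where the policy's action matches the logged action) are all assumptions made for illustration, and the real trainers are presumably PyTorch Lightning modules rather than plain classes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Batch:
    """Hypothetical batch layout, for illustration only."""
    contexts: List[int]
    actions: List[int]      # actions recorded in the logged data
    rewards: List[float]    # rewards observed for those actions


class ToyEvalModule:
    """Hypothetical eval module: tracks reward and filters the batch."""

    def __init__(self, policy: Callable[[int], int]) -> None:
        self.policy = policy        # maps a context to the action the model would pick
        self.sum_reward = 0.0       # running reward over accepted rows
        self.num_accepted = 0       # number of rows that passed the filter

    def ingest_and_filter(self, batch: Batch) -> Batch:
        # Assumed rule: keep rows where the policy agrees with the logged action.
        keep = [
            i
            for i, (c, a) in enumerate(zip(batch.contexts, batch.actions))
            if self.policy(c) == a
        ]
        self.sum_reward += sum(batch.rewards[i] for i in keep)
        self.num_accepted += len(keep)
        return Batch(
            contexts=[batch.contexts[i] for i in keep],
            actions=[batch.actions[i] for i in keep],
            rewards=[batch.rewards[i] for i in keep],
        )


class BaseCBTrainerWithEval:
    """Sketch of a base trainer with an optional eval module."""

    def __init__(self) -> None:
        self.eval_module: Optional[ToyEvalModule] = None

    def attach_eval_module(self, eval_module: ToyEvalModule) -> None:
        self.eval_module = eval_module

    def cb_training_step(self, batch: Batch) -> int:
        # Hypothetical hook that concrete trainers override.
        raise NotImplementedError

    def training_step(self, batch: Batch) -> int:
        # Without an eval module, the batch is trained on unchanged.
        if self.eval_module is not None:
            batch = self.eval_module.ingest_and_filter(batch)
        return self.cb_training_step(batch)


class CountingTrainer(BaseCBTrainerWithEval):
    """Toy trainer that only counts the rows it was asked to train on."""

    def __init__(self) -> None:
        super().__init__()
        self.rows_trained = 0

    def cb_training_step(self, batch: Batch) -> int:
        self.rows_trained += len(batch.actions)
        return len(batch.actions)
```

In this sketch, a trainer with no eval module attached passes every batch straight to `cb_training_step`, which mirrors the "same as before the refactor" default. After `attach_eval_module` is called, `cb_training_step` sees only the rows the eval module kept.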
184fae0 to 90b7929
This pull request has been merged in 7cb5500.
xuruiyang pushed a commit that referenced this pull request on Sep 20, 2025
Summary:
Pull Request resolved: #694

`BaseCBTrainerWithEval` integrates Offline Eval into the training process. By default, the behavior is the same as before the refactor. Once `.attach_eval_module()` has been called, every batch is processed by the eval module before it is used for training. The processing includes keeping track of the reward and filtering the training batch.

Reviewed By: BerenLuthien
Differential Revision: D41239491
fbshipit-source-id: f5c506d14a736a71ddc1b64270d1e8842a23488b