Improve SARSA in FREE #643

Closed

alexnikulkov wants to merge 1 commit into facebookresearch:main from alexnikulkov:export-D36360500

Conversation

@alexnikulkov
Contributor

Summary:

  1. Add new sections to the YAML config for the model and optimizer (see the config sketch after this list).
  2. Add support for weights in the Parametric DQN input (see the weighted-loss sketch below).
  3. Expose the FC hidden layer dims in the config.
  4. Sort the data in each batch by separable_id, timestamp, position (see the batch-preparation sketch below).
  5. Zero out the weight for observations whose next state is unknown (marked "terminal", though they are not actually terminal; we simply don't know their next state), whose time_diff is negative, or whose position feature is missing, since any of these prevents us from sorting properly.
  6. Read the time gap to the next state and pass it in the batch (see the discounting sketch below).
  7. Clip the reward (paced bid) (see the clipping sketch below).
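
A minimal sketch of what the new YAML sections from items 1 and 3 might look like; all key names here (`model`, `optimizer`, `hidden_dims`, `lr`) are illustrative assumptions, not copied from the PR.

```python
# Hypothetical layout of the new config sections; key names are assumptions.
import yaml  # PyYAML

config_text = """
model:
  hidden_dims: [256, 128]  # FC hidden layer dims, now exposed in the config
optimizer:
  name: Adam
  lr: 0.001
"""

config = yaml.safe_load(config_text)
assert config["model"]["hidden_dims"] == [256, 128]
```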
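A sketch of how the per-sample weights from item 2 could enter the training loss; the function and argument names are assumptions, not the PR's actual Parametric DQN code.

```python
# Weighted TD loss sketch: rows with weight 0 contribute nothing to the
# gradient, which is what makes the zeroing in item 5 effective.
import torch

def weighted_td_loss(q_values: torch.Tensor,
                     target_q_values: torch.Tensor,
                     weight: torch.Tensor) -> torch.Tensor:
    per_sample = (q_values - target_q_values.detach()) ** 2
    return (weight * per_sample).sum() / weight.sum().clamp(min=1.0)
```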
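A sketch of the batch preparation described in items 4 and 5, assuming a pandas DataFrame with illustrative column names (`separable_id`, `timestamp`, `position`, `terminal`, `time_diff`, `weight`).

```python
# Sort each batch, then zero the weight of rows we cannot use.
import pandas as pd

def prepare_batch(df: pd.DataFrame) -> pd.DataFrame:
    df = df.sort_values(["separable_id", "timestamp", "position"]).reset_index(drop=True)
    unusable = (
        df["terminal"]             # next state unknown (not truly terminal)
        | (df["time_diff"] < 0)    # negative time gap to the next state
        | df["position"].isna()    # missing position breaks the sort order
    )
    df.loc[unusable, "weight"] = 0.0
    return df
```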
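One plausible use of the per-row time gap from item 6 is time-aware discounting; this is an assumption about intent, since the PR only says the gap is read and passed in the batch.

```python
# TD target with the discount raised to the elapsed time between states.
import torch

def td_target(reward: torch.Tensor,
              next_q: torch.Tensor,
              time_gap: torch.Tensor,
              gamma: float = 0.99) -> torch.Tensor:
    return reward + (gamma ** time_gap) * next_q
```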
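Item 7's reward clipping, sketched with `torch.clamp`; the bound (standing in for the paced bid) is an assumption, since the PR does not state the actual limits.

```python
# Clip the reward into [-max_abs, max_abs]; max_abs stands in for the paced bid.
import torch

def clip_reward(reward: torch.Tensor, max_abs: float) -> torch.Tensor:
    return reward.clamp(min=-max_abs, max=max_abs)
```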

Differential Revision: D36360500

fbshipit-source-id: 9ed29c367627753e8f801ee90a6d0042bac006dd
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D36360500

@codecov-commenter

Codecov Report

Merging #643 (dc806ee) into main (deb9c67) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #643      +/-   ##
==========================================
- Coverage   87.02%   87.02%   -0.01%     
==========================================
  Files         354      354              
  Lines       22442    22443       +1     
  Branches       44       44              
==========================================
  Hits        19531    19531              
- Misses       2885     2886       +1     
  Partials       26       26              
| Impacted Files | Coverage Δ |
| --- | --- |
| reagent/core/types.py | 86.70% <100.00%> (+0.02%) ⬆️ |
| reagent/mab/ucb.py | 86.84% <0.00%> (-2.64%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update deb9c67...dc806ee.

xuruiyang pushed a commit that referenced this pull request Sep 20, 2025
Summary:
Pull Request resolved: #643

1. Add new sections to YAML for model and optimizer configs
2. Add support for weights in Parametric DQN input
3. Expose FC hidden layer dims in config
4. Sort data in the batch by separable_id, timestamp, position.
5. Zero out the weight for observations whose next state is unknown (marked "terminal", though they are not actually terminal; we simply don't know their next state), whose time_diff is negative, or whose position feature is missing, since any of these prevents us from sorting properly.
6. Read the time gap to the next state and pass it in the batch.
7. Clip reward (paced bid)

To launch MC LTV training:
- local run: `starlight app run -j 1 free.reagent.train_ltv:train`
- submit to MAST: `starlight app submit reagent/submit_config.py:get_config_ltv`

To launch SARSA LTV training:
- local run: `starlight app run -j 1 free.reagent.train_ltv:train_sarsa`
- submit to MAST: `starlight app submit reagent/submit_config.py:get_config_ltv -- --model_type SARSA`

Reviewed By: czxttkl

Differential Revision: D36360500

fbshipit-source-id: c07f0b2ea297844970389b2059a7c42d63d16a8d