Add PPOTrainer Unit Test #520

Closed
igfox wants to merge 1 commit into facebookresearch:master from igfox:export-D30114686

@igfox commented Aug 5, 2021

Summary:
Adds a dedicated unit test for PPOTrainer. Additionally:

  • Fixes a bug with fully connected value net
  • Fixes some bugs in PPO training around using value net
  • Adds possible_action_mask to DuelingQNetwork
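
The last item above can be illustrated with a minimal, framework-free sketch of how a mask over possible actions might be applied to dueling Q-values. This is not the code from this PR: the function names, the `MASKED_Q_VALUE` constant, and the choice to overwrite disallowed entries with a large negative value are all assumptions made for illustration. Only the parameter name `possible_action_mask` comes from the summary.

```python
from typing import List, Optional, Sequence

# Hypothetical constant (not from the PR): score given to disallowed actions
# so that a greedy argmax never selects them.
MASKED_Q_VALUE = -1e10


def dueling_q_values(
    value: float,
    advantages: Sequence[float],
    possible_action_mask: Optional[Sequence[int]] = None,
) -> List[float]:
    """Combine V(s) and A(s, a) as Q = V + A - mean(A), then apply the mask.

    possible_action_mask holds 1 for allowed actions and 0 for disallowed
    ones. When it is None, no masking is applied.
    """
    mean_advantage = sum(advantages) / len(advantages)
    q_values = [value + a - mean_advantage for a in advantages]
    if possible_action_mask is None:
        return q_values
    if len(possible_action_mask) != len(q_values):
        raise ValueError("mask length must match the number of actions")
    return [
        q if allowed else MASKED_Q_VALUE
        for q, allowed in zip(q_values, possible_action_mask)
    ]


def greedy_action(q_values: Sequence[float]) -> int:
    """Return the index of the highest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Usage: action 1 has the highest raw Q-value but is masked out.
unmasked = dueling_q_values(1.0, [0.5, 2.0, -1.0])
masked = dueling_q_values(1.0, [0.5, 2.0, -1.0], possible_action_mask=[1, 0, 1])
```

In this sketch the greedy choice moves from action 1 to action 0 once the mask is applied. A tensor-based network would typically do the same thing batch-wise, but that detail is also an assumption here.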

Differential Revision: D30114686

fbshipit-source-id: aaf773f36cafd2a1cf993a554eab1363e9a65c20

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D30114686

@codecov-commenter

Codecov Report

Merging #520 (9984e99) into master (7d5bdbf) will increase coverage by 0.20%.
The diff coverage is 98.48%.

@@            Coverage Diff             @@
##           master     #520      +/-   ##
==========================================
+ Coverage   84.18%   84.39%   +0.20%     
==========================================
  Files         327      328       +1     
  Lines       19397    19514     +117     
  Branches       44       44              
==========================================
+ Hits        16329    16468     +139     
+ Misses       3042     3020      -22     
  Partials       26       26              
| Impacted Files | Coverage Δ |
|---|---|
| reagent/net_builder/value/fully_connected.py | 100.00% <ø> (ø) |
| reagent/core/types.py | 87.26% <75.00%> (+0.79%) ⬆️ |
| reagent/models/dueling_q_network.py | 96.15% <87.50%> (-0.79%) ⬇️ |
| reagent/model_managers/policy_gradient/ppo.py | 88.52% <100.00%> (-0.37%) ⬇️ |
| reagent/models/fully_connected_network.py | 88.46% <100.00%> (+0.70%) ⬆️ |
| reagent/test/training/test_ppo.py | 100.00% <100.00%> (ø) |
| reagent/training/ppo_trainer.py | 99.09% <100.00%> (+14.41%) ⬆️ |
| reagent/gym/policies/samplers/discrete_sampler.py | 58.51% <0.00%> (+3.19%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@facebook-github-bot

This pull request has been merged in 8d00eb1.

@facebook-github-bot

This pull request has been reverted by 04fab8f.

xuruiyang pushed a commit that referenced this pull request Sep 20, 2025
Summary:
Pull Request resolved: #520

Adds dedicated unit test for PPO Trainer, additionally:
- Fixes a bug with fully connected value net
- Fixes some bugs in PPO training around using value net
- Adds possible_action_mask to DuelingQNetwork

Reviewed By: czxttkl

Differential Revision: D30114686

fbshipit-source-id: 3735af1ea65429867d63f7da1462194242ad8254

@facebook-github-bot

This pull request has been reverted by e5355f8.
