@karroyan
Owner

Description

Related Issue

TODO

Check List

  • merge the latest version of the source branch/repo and resolve all conflicts
  • pass style check
  • pass all the tests

PaParaZz1 and others added 8 commits February 12, 2023 20:14
* feature(nyz): setup evogym docker

* fix(nyz): correct evogym docker
* feature(nyz): add bipedalwalker demo and fix rounding bug

* feature(nyz): add seed_api option and ch3 config

* fix(nyz): fix env clone bug, add attention policy

* polish(nyz): add cloudpickle clone and fix minor bugs

* feature(nyz): add evogym env config
* embed metadrive into di-engine

* adapt to the latest version of metadrive

* adapt to metadrive

* adapt metadrive feature

* change decision frequency

* fix some problems

* add show bird view observation

* add metadrive in readme.md
PaParaZz1 and others added 20 commits February 18, 2023 22:01
* Unsqueeze action_args in PDQN when its shape is 1

* Parameterize PDQN unittest
overide -> override (ci skip)
* polish(pu): polish icm_onppo_config

* polish(pu): polish icm rnd intrinsic_reward_weight and config

* style(pu): yapf format

* fix(lisong): fix config bugs and app_key env bugs

* polish(lisong): polish icm/rnd config and reward model

* fix(lisong): add viewsizewrapper in minigrid_wrapper

* fix(lisong): add doorkey8x8 rnd+onppo config, save reward model, fix replay save

* fix(pu): fix augmented_reward tb_logging

* feat(lisong): add noisy-tv env in minigrid

* fix(lisong): modify noisy_tv env
* polish ppof rewardclip and add atari config

* polish config

* polish style

* polish config and ppof

* polish style

* polish yapf style
* draft runnable version for mdqn and config file

* fix style for mdqn

* fix style for mdqn

* update action_gap part for mdqn

* provide tau and alpha

* draft runnable version for mdqn and config file

* fix style for mdqn

* fix style for mdqn

* update action_gap part for mdqn

* provide tau and alpha

* add clipfrac to mdqn

* add unit test for mdqn td error

* provide current exp parameter

* fix bug in mdqn td loss function and polish code

* revert useless change in dqn

* update readme for mdqn

* delete wrongly named folder

* rename asterix folder

* provide reasonable config for asterix

* fix style and unit test

* polish code under comment

* fix typo in dizoo asterix config

* fix style

* fix style

* provide is_dynamic_seed for collector env

* add unit test for mdqn in test_serial_entry with asterix

* change test for mdqn from asterix to cartpole because the platform test failed

* change is_dynamic structure because the unit test failed at test entry

* add comment for is_dynamic_seed

* add enduro and spaceinvaders mdqn config file && polish comments

* polish code under comment
…input (#451)

* feature(rjy): add rocket env

* feature(rjy): add rocket env and config

* fix(rjy): waiting for cuda fix

* config(rjy): add dmc2gym _sac_state_config

* config(rjy): modify max_env_step

* config(rjy): max_step

* feature(rjy): add qac_pixel model

* fix(rjy): modify qacpixel model

* fix(rjy): waiting to merge

* fix(rjy): change config

* fix(rjy): modify config and qac

* fix(rjy): rm rocket env

* fix(pu): fix selection of default_model in sac and qac_pixel template model

* polish(pu): delete one redundant linear layer in qac_pixel

* polish(pu): add atari-like wrapper option for dmc2gym_env, add share_conv_encoder and embed_action option for qac_pixel model

* polish(pu): polish dmc swingup sac config and yapf format

* polish(pu): add option of embed_action_density and fix sac when two Q share conv encoder

* polish(rjy): docker part

* test(rjy): add test for QACPixel model

* polish(rjy): modify network para
PaParaZz1 and others added 6 commits December 13, 2024 16:47
* feature(pu): add pong and cartpole ddp config of dqn and onppo

* fix(pu): fix atari_ppo_ddp.py

* polish(pu): polish atari_dqn_ddp.py and atari_ppo_ddp.py

* polish(pu): polish atari ddp configs
zjowowen and others added 19 commits January 27, 2025 11:34
* Add IQL algo

* Polish IQL Algorithm

* polish iql
* polish(pu): delete unused enable_fast_timestep argument

* polish(pu): delete unused empty lines

* polish(pu): delete unused empty lines

* style(pu): polish comment's format

* style(pu): polish comment's format
* feature(nyz): add rlhf dataset

* fix(nyz): fix import bugs

* feature(nyz): add vision input support and fix bugs

* style(nyz): add comments for rlhf dataset
* test(nyz): polish ppo and add rlhf ppo loss test

* interface(nyz): add naive interface about grpo/rloo

* test&implement(dcy): add unit tests for GRPO and RLOO

- Add test_grpo_rlhf.py for GRPO unit tests
- Add test_rloo_rlhf.py for RLOO unit tests
- Update GRPO implementation
- Update RLOO implementation

* polish(dcy): polish grpo and rloo and test unit

* (dcy) rloo and grpo

* (dcy) redesign adv from reward

* (dcy) Polish style: use selective log-softmax to reduce peak VRAM consumption

* (dcy) small changes

* (dcy) git add readme and typing

* (dcy) English comment file name and function name changed
… jericho config (#860)

* feature(pu): adapt to unizero-multitask ddp, and adapt ppo to support jericho config

* polish(pu): polish arguments and docstring

* fix(pu): add wrongly deleted code snippet

* style(pu): yapf format

* style(pu): flake8 format

* polish(pu): polish docstring, add allreduce_with_indicator method

* style(pu): polish type lint

* polish(pu): polish allreduce_with_indicator

* fix(pu): fix wrongly deleted snippet

* fix(pu): fix ding.utils import

* style(pu): flake8 format

* fix(pu): fix action_mask bug in vac

* fix(pu): fix critic_embedding bug in vac, polish some cfgs

* style(pu): yapf format

* fix(pu): fix self._collector_envstep

* style(pu): yapf format
* fix(pu): fix noise layer's usage

* polish(pu): polish comments

* polish(pu): polish noisy_net config

* fix(pu): fix reset_noise bug in noisy_net option

* fix(pu): fix enable_noise bug in rainbow

* style(pu): yapf format

* style(pu): yapf format

* style(pu): flake8 format

* style(pu): yapf format

* polish(pu): polish set_noise_mode when self._cfg.noisy_net is False

* feature(pu): add unittest for noise_linear_layer

---------

Co-authored-by: puyuan <puyuan1996@qq.com>
* feature(wyx): add three KL-divergence variants

* fix bugs and add description for KL-divergence variants

* add a period for each comment line

* fix bugs and add KL-divergence parameter descriptions

* fix flake8 E501 error in ppo.py docstring

* fix trailing whitespace in ppo.py