Merge main DI-engine #3

karroyan · 2022-10-24T02:18:36Z

Description

Related Issue

TODO

Check List

Merge the latest version of the source branch/repo and resolve all conflicts
Pass style check
Pass all tests

* feature(nyz): add bipedalwalker demo and fix rounding bug
* feature(nyz): add seed_api option and ch3 config
* fix(nyz): fix env clone bug, add attention policy
* polish(nyz): add cloudpickle clone and fix minor bugs
* feature(nyz): add evogym env config

* embed metadrive into di-engine
* adapt to the latest version of metadrive
* adapt to metadrive
* adapt metadrive feature
* change decision frequency
* fix some problems
* add show bird view observation
* add metadrive in readme.md

* polish(pu): polish icm_onppo_config
* polish(pu): polish icm rnd intrinsic_reward_weight and config
* style(pu): yapf format
* fix(lisong): fix config bugs and app_key env bugs
* polish(lisong): polish icm/rnd config and reward model
* fix(lisong): add viewsizerapper in minigrid_wrapper
* fix(lisong): add doorkey8x8 rnd+onppo config, save reward model, fix replay save
* fix(pu): fix augmented_reward tb_logging
* feat(lisong): add noisy-tv env in minigrid
* fix(lisong): modify noisy_tv env

* draft runnable version for mdqn and config file
* fix style for mdqn
* fix style for mdqn
* update action_gap part for mdqn
* provide tau and alpha
* draft runnable version for mdqn and config file
* fix style for mdqn
* fix style for mdqn
* update action_gap part for mdqn
* provide tau and alpha
* add clipfrac to mdqn
* add unit test for mdqn td error
* provide current exp parameter
* fix bug in mdqn td loss function and polish code
* revert useless change in dqn
* update readme for mdqn
* delete wrongly named folder
* rename asterix folder
* provide reasonable config for asterix
* fix style and unit test
* polish code under comment
* fix typo in dizoo asterix config
* fix style
* fix style
* provide is_dynamic_seed for collector env
* add unit test for mdqn in test_serial_entry with asterix
* change test for mdqn from asterix to cartpole because the platform test failed
* change is_dynamic structure because the unit test failed at test entry
* add comment for is_dynamic_seed
* add enduro and spaceinvaders mdqn config file && polish comments
* polish code under comment

…input (#451)
* feature(rjy): add rocket env
* feature(rjy): add rocket env and config
* fix(rjy): waiting for cuda fix
* config(rjy): add dmc2gym _sac_state_config
* config(rjy): modify max_env_step
* config(rjy): max_step
* feature(rjy): add qac_pixel model
* fix(rjy): modify qacpixel model
* fix(rjy): waiting to merge
* fix(rjy): change config
* fix(rjy): modify config and qac
* fix(rjy): rm rocket env
* fix(pu): fix selection of default_model in sac and qac_pixel template model
* polish(pu): delete one redundant linear layer in qac_pixel
* polish(pu): add atari-like wrapper option for dmc2gym_env, add share_conv_encoder and embed_action option for qac_pixel model
* polish(pu): polish dmc swingup sac config and yapf format
* polish(pu): add option of embed_action_density and fix sac when two Q share conv encoder
* polish(rjy): docker part
* test(rjy): add test for QACPixel model
* polish(rjy): modify network para

* test(nyz): polish ppo and add rlhf ppo loss test
* interface(nyz): add naive interface about grpo/rloo
* test&implement(dcy): add unit tests for GRPO and RLOO
  - Add test_grpo_rlhf.py for GRPO unit tests
  - Add test_rloo_rlhf.py for RLOO unit tests
  - Update GRPO implementation
  - Update RLOO implementation
* polish(dcy): polish grpo and rloo and test unit
* (dcy) rloo and grpo
* (dcy) redesign avd from reward
* (dcy) Polish style: Use selective log-softmax to reduce peak vram consumption
* (dcy) small changes
* (dcy) git add readme and typing
* (dcy) English comment file name and function name changed
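
For readers unfamiliar with the "selective log-softmax" item above, here is a minimal sketch of the general idea. It is not the code from this commit. The function names are hypothetical, and plain Python lists stand in for tensors. The point is that the log-probability of one chosen token per row only needs that token's logit minus the row's log-sum-exp, so the full log-softmax matrix never has to be materialized.

```python
import math


def logsumexp(row):
    """Numerically stable log(sum(exp(v))) for one row of logits."""
    m = max(row)
    return m + math.log(sum(math.exp(v - m) for v in row))


def naive_log_softmax_gather(logits, token_ids):
    """Baseline: build the full log-softmax table, then pick one entry per row."""
    full = [[v - logsumexp(row) for v in row] for row in logits]
    return [row[t] for row, t in zip(full, token_ids)]


def selective_log_softmax(logits, token_ids):
    """Pick the chosen logit first and subtract the row's log-sum-exp.

    Only one scalar per row is kept, which is what saves memory when the
    rows are vocabulary-sized tensors on a GPU.
    """
    return [row[t] - logsumexp(row) for row, t in zip(logits, token_ids)]


if __name__ == "__main__":
    logits = [[0.0, 0.0], [1.0, 2.0, 3.0]]
    token_ids = [0, 2]
    print(selective_log_softmax(logits, token_ids))
    print(naive_log_softmax_gather(logits, token_ids))
```

Both functions return the same values. In a tensor library the selective form would typically be written as a gather on the logits followed by a log-sum-exp reduction.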

… jericho config (#860)
* feature(pu): adapt to unizero-multitask ddp, and adapt ppo to support jericho config
* polish(pu): polish arguments and docstring
* fix(pu): add wrongly deleted code snippet
* style(pu): yapf format
* style(pu): flake8 format
* polish(pu): polish docstring, add allreduce_with_indicator method
* style(pu): polish type lint
* polish(pu): polish allreduce_with_indicator
* fix(pu): fix wrongly deleted snippet
* fix(pu): fix ding.utils import
* style(pu): flake8 format
* fix(pu): fix action_mask bug in vac
* fix(pu): fix critic_embedding bug in vac, polish some cfgs
* style(pu): yapf format
* fix(pu): fix self._collector_envstep
* style(pu): yapf format

* fix(pu): fix noise layer's usage
* polish(pu): polish comments
* polish(pu): polish noisy_net config
* fix(pu): fix reset_noise bug in noisy_net option
* fix(pu): fix enable_noise bug in rainbow
* style(pu): yapf format
* style(pu): yapf format
* style(pu): flake8 format
* style(pu): yapf format
* polish(pu): polish set_noise_mode when self._cfg.noisy_net is False
* feature(pu): add unittest for noise_linear_layer
---------
Co-authored-by: puyuan <puyuan1996@qq.com>

* feature(wyx): add three KL-divergence variants
* fix bugs and add description for KL-divergence variants
* add a period for each comment line
* fix bugs and add KL-divergence parameter descriptions
* fix flake8 E501 error in ppo.py docstring
* fix trailing whitespace in ppo.py
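
The commit text above does not say which three KL-divergence variants were added. As an illustration only, the sketch below shows one commonly used trio of sample-based estimators, often called k1, k2 and k3. It assumes the actions were sampled from the old policy. The function name and the dictionary keys are hypothetical and are not taken from ppo.py.

```python
import math


def kl_estimates(logp_new, logp_old):
    """Batch means of three sample-based estimators of KL(old || new).

    Assumption: each entry is the log-probability of an action that was
    sampled from the OLD policy, evaluated under the new and old policies.
    """
    log_ratio = [n - o for n, o in zip(logp_new, logp_old)]
    count = len(log_ratio)
    # k1: unbiased but high variance, and single samples can be negative.
    k1 = sum(-lr for lr in log_ratio) / count
    # k2: biased, lower variance, never negative.
    k2 = sum(0.5 * lr * lr for lr in log_ratio) / count
    # k3: unbiased and never negative, (ratio - 1) - log(ratio).
    k3 = sum(math.expm1(lr) - lr for lr in log_ratio) / count
    return {"k1": k1, "k2": k2, "k3": k3}


if __name__ == "__main__":
    print(kl_estimates([-0.5, -1.2], [-1.0, -1.0]))
```

All three estimators are zero when the two policies assign identical log-probabilities to every sampled action.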

PaParaZz1 force-pushed the main branch from d66b42b to c50379c Compare December 13, 2022 17:10

PaParaZz1 and others added 8 commits February 12, 2023 20:14

feature(nyz): setup evogym docker (#580)

d489313

* feature(nyz): setup evogym docker
* fix(nyz): correct evogym docker

fix(nyz): fix CkptSaver and env manager interface compatibility bug (#…

7f2e36e

…582)

style(nyz): extend treetensor lowest version(ci skip)

bd46c7d

fix(nyz): fix gym env deepcopy spec bug

8b1f05b

fix(nyz): fix ppof collect_data and deploy cuda mismatch bug

f1f0b55

v0.4.6

c11f052

PaParaZz1 force-pushed the main branch from 76b1a11 to c11f052 Compare February 18, 2023 12:40

PaParaZz1 and others added 20 commits February 18, 2023 22:01

style(nyz): fix v0.4.6 version id bug(ci skip)

0b4180c

fix(nyz): fix deque buffer wrapper PER bug (#586)

1e6f503

style(nyz): add d4rl docker (#591)

f3b5a67

fix(nyz): fix evaluator return_info tensor type bug (#592)

e89fb6b

style(nyz): update introduction(ci skip)

6824669

style(nyz): polish readme and add treetensor example(ci skip)

20cf318

style(nyz): add diff example for treetensor(ci skip)

203be4b

style(nyz): fix typos and polish ci deploy(ci skip)

737af2f

fix(psharold): unsqueeze action_args in PDQN when shape is 1 (#599)

8c33420

* Unsqueeze action_args in PDQN when its shape is 1
* Parameterize PDQN unittest

fix(nyz): update ptz to latest version (#597)

2c8d02e

style(elt): fix typo in time_helper.py (#602)

ac08231

overide -> override (ci skip)

fix(nyz): fix reward model save method compatibility bug

3b108aa

feature(lxy): modify ppof rewardclip and add atari config (#589)

b7ce258

* polish ppof rewardclip and add atari config
* polish config
* polish style
* polish config and ppof
* polish style
* polish yapf style

polish(nyz): polish comment and clean code about SAC

67032d7

fix(nyz): fix SAC old value network notation bug

d72df0d

style(nyz): update doc links

55898a3

polish(nyz): polish QAC with ConvEncoder

b81ce53

PaParaZz1 added 3 commits December 9, 2024 20:44

fix(nyz): fix many unittest bugs

765b8fb

fix(nyz): fix mock and config bugs

aa86aa7

fix(nyz): fix wandb requirements bug

317e775

PaParaZz1 force-pushed the main branch from 5a1814f to 317e775 Compare December 13, 2024 07:54

PaParaZz1 and others added 6 commits December 13, 2024 16:47

fix(nyz): fix rmsprop bug in torch 1.13.1

580ea65

feature(pu): add ddp config of dqn and onppo (#842)

9a6e46f

* feature(pu): add pong and cartpole ddp config of dqn and onppo
* fix(pu): fix atari_ppo_ddp.py
* polish(pu): polish atari_dqn_ddp.py and atari_ppo_ddp.py
* polish(pu): polish atari ddp configs

v0.5.3

f60b377

fix(nyz): fix env check bugs (#852)

f5157c7

fix(nyz): fix env check multi-discrete bug (#852)

4e92de5

test(nyz): upgrade python version and setup-python version

bf258f8

PaParaZz1 force-pushed the main branch from c9723c9 to bf258f8 Compare January 24, 2025 04:27

zjowowen and others added 19 commits January 27, 2025 11:34

feature(zjow): add Implicit Q-Learning (#821)

dae7673

* Add IQL algo
* Polish IQL Algorithm
* polish iql

style(nyz): fix flake8 code style (ci skip)

3292384

polish(pu): delete unused enable_fast_timestep argument (#855)

64efcb3

* polish(pu): delete unused enable_fast_timestep argument
* polish(pu): delete unused empty lines
* polish(pu): delete unused empty lines
* style(pu): polish comment's format
* style(pu): polish comment's format

feature(nyz): add rlhf dataset (#854)

abcf972

* feature(nyz): add rlhf dataset
* fix(nyz): fix import bugs
* feature(nyz): add vision input support and fix bugs
* style(nyz): add comments for rlhf dataset

style(nyz): polish rl_utils style details (ci skip)

6c2ca2f

style(nyz): update atari link (ci skip)

b771e96

fix(nyz): fix docker deploy cache actions bug

605b457

style(nyz): add rust installation in docker

101d586

demo(nyz): add ppo lunarlander continuous example

c290a67

style(nyz): fix flake8 style(ci skip)

f6ee768

doc(nyz): create SECURITY.md

f78aed1

fix(nyz): fix multi-machine gpu id bug (#875)

8d29a32

fix(nyz): fix ppo logit pretrained compatibility bugs

1854e58

fix(nyz): fix unittest compatibility bugs

aa780e6

doc(nyz): disable doc docker

d0b21d0

20 participants