[chatgpt] Detached PPO Training #3195

Merged
ver217 merged 48 commits into hpcaitech:main from CsRic:detached_ppo on Apr 17, 2023

@CsRic
Contributor

CsRic commented Mar 21, 2023

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?

Summarize your work here.
If you have any plots/diagrams/screenshots/tables, please attach them here.

'Detached' PPO training means that the experience makers and trainers are split across different nodes for asynchronous training. Models are not shared.

  1. Introduce several classes for the 'detached' workflow: ExperienceMakerHolder, DetachedReplayBuffer, DetachedPPOTrainer.
  2. Use Ray to implement the detached workflow structure.
  3. Examples: 1m1t, 1m2t, 2m1t, 2m2t.
  4. Supported strategies: Naive, DDP.
  5. Does not affect existing code.
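
The maker/trainer split described above can be illustrated with a toy sketch. This is not the PR's implementation: it uses Python threads and a standard-library queue in place of Ray workers, contains no models and no PPO update, and every name in it (`ToyReplayBuffer`, `run_maker`, `run_trainer`, `run_detached`) is hypothetical. It also assumes that "m" and "t" in the example names stand for the number of experience makers and trainers, which the description does not state explicitly.

```python
import queue
import threading


class ToyReplayBuffer:
    """Hypothetical stand-in for a detached replay buffer: a bounded, thread-safe FIFO."""

    def __init__(self, capacity=8):
        self._items = queue.Queue(maxsize=capacity)

    def append(self, experience):
        self._items.put(experience)  # blocks while the buffer is full

    def sample(self):
        return self._items.get()  # blocks until an experience arrives


def run_maker(maker_id, buffers, n_experiences):
    """Produce dummy experiences and hand them round-robin to the trainer buffers."""
    for step in range(n_experiences):
        experience = {"maker": maker_id, "step": step}
        buffers[step % len(buffers)].append(experience)


def run_trainer(buffer, n_steps, consumed):
    """Consume experiences as they arrive; a real trainer would run a PPO update here."""
    for _ in range(n_steps):
        consumed.append(buffer.sample())


def run_detached(n_makers, n_trainers, per_maker=4):
    """Run makers and trainers concurrently; return what each trainer consumed."""
    if per_maker % n_trainers != 0:
        raise ValueError("per_maker must be divisible by n_trainers in this toy setup")
    buffers = [ToyReplayBuffer() for _ in range(n_trainers)]
    consumed = [[] for _ in range(n_trainers)]
    steps_per_trainer = n_makers * per_maker // n_trainers

    threads = [
        threading.Thread(target=run_maker, args=(m, buffers, per_maker))
        for m in range(n_makers)
    ]
    threads += [
        threading.Thread(target=run_trainer, args=(buffers[t], steps_per_trainer, consumed[t]))
        for t in range(n_trainers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return consumed


if __name__ == "__main__":
    # A "2m2t"-style run: two makers feeding two trainers.
    for trainer_id, batch in enumerate(run_detached(2, 2)):
        print(trainer_id, batch)
```

The point of the sketch is only the decoupling: makers never wait for a training step to finish, and trainers never wait for a full rollout phase, because the buffer is the sole point of contact between them.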

Known issues:
1. Cannot detect the CUDA device on each worker. (Fixed 2023-03-24)
2. Cannot run with the Colossal strategy.
3. Correctness of the 1m1t.py example. (Fixed 2023-03-24)

TODO:

  • Implement parameter updates from the trainer to the experience maker.
  • Fix the known issues.
  • Add a TP strategy for the experience maker and trainer.
  • Support multiple nodes.

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests.
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

Comment thread applications/ChatGPT/examples/train_prompts.sh Outdated
Comment thread applications/ChatGPT/chatgpt/experience_maker/detached.py Outdated
Comment thread applications/ChatGPT/chatgpt/replay_buffer/detached.py Outdated
@binmakeswell
Member

Hi @CsRic, thanks for your contribution, but there is a conflict in this PR. Could you please resolve it first? Thanks.

@ver217
Contributor

ver217 commented Apr 13, 2023

Can you move all files to coati/ray?

@CsRic
Contributor Author

CsRic commented Apr 13, 2023

Can you move all files to coati/ray?

Done

Comment thread applications/Chat/coati/experience_maker/strategy/base.py Outdated
Comment thread applications/Chat/examples/train_prompts.sh Outdated
Comment thread applications/Chat/examples/train_dummy.sh Outdated
Comment thread applications/Chat/coati/trainer/utils.py Outdated
Comment thread applications/Chat/coati/ray/1m1t.py
ver217 changed the title from "[chatgpt] Detached PPO Training Draft" to "[chatgpt] Detached PPO Training" on Apr 17, 2023
ver217 merged commit e355144 into hpcaitech:main on Apr 17, 2023

4 participants