Refactor: separate sync/async vllm #599

@yuki-97

Currently, vllm.py is fairly complex because it contains both sync and async code.

Separating the two would improve the experience for both users and developers.

Step 1:
Doing this first lets us stay aware of vllm code changes on the main branch, so that we don't miss them when doing step 2.

  1. Move the vllm code from nemo_rl/models/generation into nemo_rl/models/generation/vllm, so that it is easy to support other inference frameworks in the future.
  2. Split the sync and async vllm workers into separate files to make the code clearer.

Step 2:

  1. Tidy up and remove duplicated or unused code.
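
One possible end state of the two steps above, as a minimal sketch only. The class names (`BaseWorker`, `SyncWorker`, `AsyncWorker`) and the `_postprocess` helper are hypothetical and are not taken from nemo_rl or vllm; the "generation" is a stand-in that upper-cases the prompt. In a real layout each worker class would presumably sit in its own file under nemo_rl/models/generation/vllm, with shared logic factored out once instead of being duplicated.

```python
from __future__ import annotations

import asyncio


class BaseWorker:
    """Hypothetical shared base: logic needed by both workers lives here once."""

    def __init__(self, max_chars: int = 8) -> None:
        self.max_chars = max_chars

    def _postprocess(self, text: str) -> str:
        # Shared helper, so neither worker needs its own copy (step 2).
        return text.strip()[: self.max_chars]


class SyncWorker(BaseWorker):
    """Hypothetical sync worker; would live in its own file (step 1)."""

    def generate(self, prompts: list[str]) -> list[str]:
        # Stand-in for a blocking engine call.
        return [self._postprocess(p.upper()) for p in prompts]


class AsyncWorker(BaseWorker):
    """Hypothetical async worker; would live in a separate file (step 1)."""

    async def _generate_one(self, prompt: str) -> str:
        await asyncio.sleep(0)  # stand-in for awaiting an async engine
        return self._postprocess(prompt.upper())

    async def generate(self, prompts: list[str]) -> list[str]:
        # gather() preserves input order, matching the sync worker's output.
        return list(await asyncio.gather(*(self._generate_one(p) for p in prompts)))


sync_out = SyncWorker().generate(["  hello ", "world"])
async_out = asyncio.run(AsyncWorker().generate(["  hello ", "world"]))
```

With this shape, callers pick one worker explicitly, and a change to shared behaviour only has to be made in the base class.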

Labels

UX (Related to user experience), vllm
