Support un-even dispatches

Currently we support neither `batch_size < dp_size` nor `batch_size % shards != 0`. We should relax both constraints so that users do not have to recreate the vLLM engine or the policy when their batch size does not divide evenly.

As it stands, we have to tear down the worker groups every time we want a different batch size, which should not be necessary.

Examples:
* vllm: https://github.com/NVIDIA/reinforcer/blob/d9277a8afcc2f4892ebe7225754b94b586de09d9/nemo_reinforcer/models/generation/vllm.py#L542
* hf: https://github.com/NVIDIA/reinforcer/blob/d9277a8afcc2f4892ebe7225754b94b586de09d9/nemo_reinforcer/models/policy/hf_policy.py#L934
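
As a rough illustration of what relaxing these constraints could look like, here is a minimal sketch of an uneven split in which shard sizes differ by at most one and some ranks may receive an empty shard. The helpers `split_uneven` and `gather_in_order` are hypothetical names for this sketch and are not part of `nemo_reinforcer`; how workers should handle an empty shard (skip, or run on a padded dummy input) is left open here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_uneven(batch: Sequence[T], num_shards: int) -> List[List[T]]:
    """Split `batch` into `num_shards` contiguous shards.

    Hypothetical helper: shard sizes differ by at most one, and shards
    are empty when len(batch) < num_shards.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    base, extra = divmod(len(batch), num_shards)
    shards: List[List[T]] = []
    start = 0
    for rank in range(num_shards):
        # The first `extra` ranks each take one additional item.
        size = base + (1 if rank < extra else 0)
        shards.append(list(batch[start:start + size]))
        start += size
    return shards


def gather_in_order(shard_outputs: Sequence[Sequence[T]]) -> List[T]:
    """Concatenate per-rank outputs back into the original batch order."""
    return [item for shard in shard_outputs for item in shard]


if __name__ == "__main__":
    # batch_size % shards != 0
    print(split_uneven(list(range(5)), 4))
    # batch_size < dp_size
    print(split_uneven(list(range(2)), 4))
```

Because the shards are contiguous, concatenating the per-rank outputs in rank order restores the original batch order without any extra index bookkeeping.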


Support un-even dispatches #125
