We currently support neither batch_size < dp_size nor batch_size % shards != 0.
As a result, we have to tear down the worker groups each time we want a different batch size, which isn't necessary. We should relax this so that users do not have to recreate the vLLM engine or the policy when their batch_size does not satisfy these constraints.
Examples:
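A minimal sketch of the two unsupported cases, assuming the current restriction amounts to a simple size and divisibility check, and that padding the batch with dummy entries is one acceptable way to relax it. The helper names `is_supported` and `pad_to_multiple` are hypothetical and are not part of any existing API:

```python
from typing import List, Tuple


def is_supported(batch_size: int, dp_size: int, shards: int) -> bool:
    """Assumed form of the current restriction described in this issue."""
    return batch_size >= dp_size and batch_size % shards == 0


def pad_to_multiple(
    batch: List[str], shards: int, pad_value: str = ""
) -> Tuple[List[str], int]:
    """Pad `batch` so its length is a positive multiple of `shards`.

    Returns the padded batch and the original length, so the caller can
    drop the outputs that correspond to padding.
    """
    n = len(batch)
    # Round n up to the next multiple of shards, with at least one item per shard.
    target = max(shards, -(-n // shards) * shards)
    return batch + [pad_value] * (target - n), n


# Case 1: batch_size < dp_size
print(is_supported(batch_size=3, dp_size=4, shards=4))
# Case 2: batch_size % shards != 0
print(is_supported(batch_size=10, dp_size=4, shards=4))

padded, original_len = pad_to_multiple(["a", "b", "c"], shards=4)
outputs = padded[:original_len]  # discard results for padded entries
```

With something along these lines, the same worker groups could serve any batch size, at the cost of a small amount of wasted compute on the padded entries.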