ref: https://github.com/meta-pytorch/torchtune/issues/2883 Candidates - torchforge - verl - TRL - prime-rl