Conversation
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
i think eventually these should make their way into examples/configs/recipes/llm, which enforces that there must be a nightly test for each yaml config
i also think it would be good to have one base penguin config, e.g. examples/configs/grpo_penguin.yaml, and then have all the other configs derive from it. since the environment is slightly different b/c it's penguin and not our openmathinstruct ones, you should probably update the pre-commit hook here:
https://github.com/NVIDIA-NeMo/RL/blob/main/.pre-commit-config.yaml#L71-L73
so that it also enforces minimization of these recipes
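A rough sketch of what such an added pre-commit entry could look like. This is only illustrative: the hook id, script path, and file pattern below are hypothetical and would need to match the repo's actual pre-commit config and checker script, not copied from them:

```yaml
# Hypothetical sketch of a local pre-commit hook entry; ids and paths
# are placeholders, not the repo's real values.
- repo: local
  hooks:
    - id: check-penguin-recipe-minimization   # illustrative id
      name: Enforce minimized penguin recipe configs
      entry: python tools/check_recipe_minimization.py   # hypothetical script
      language: system
      files: ^examples/configs/recipes/llm/grpo_penguin_.*\.yaml$
```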
wdyt about examples/penguin/run_grpo_penguin.py -> examples/run_grpo_penguin.py? i kind of think it should live higher up so it's more easily discoverable.
also related: wdyt about moving your configs like examples/penguin/grpo_workbench_qwen3_4binstruct.yaml to examples/configs/recipes/llm/grpo_penguin_workbench_qwen3_4binstruct.yaml?
maybe this should move to tests/functional/run_penguin_single_node_sanity_tests.sh and then be added to https://github.com/NVIDIA-NeMo/RL/blob/main/tests/functional/L1_Functional_Tests_GPU.sh so it gets auto-picked up by our CI. you'll probably need to guard this test, though, so that it is skipped when penguin isn't installed
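A minimal sketch of such a guard at the top of the sanity-test script. It assumes penguin is importable as a Python package when installed; the probe would need to be adjusted to however penguin availability is actually detected in this repo:

```shell
# Probe for penguin; assumes it is importable as a Python package.
if python3 -c "import penguin" >/dev/null 2>&1; then
    PENGUIN_AVAILABLE=1
else
    PENGUIN_AVAILABLE=0
fi

if [ "$PENGUIN_AVAILABLE" -eq 0 ]; then
    # In the real script you would `exit 0` here so CI records a clean skip.
    echo "penguin not installed; skipping penguin sanity tests"
else
    echo "running penguin sanity tests"
fi
```

Exiting 0 on the skip path keeps the L1 functional suite green on runners without penguin, while runners that have it still execute the test.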
btw, does this use 8 GPUs? we only have 2 A100 GPUs in the CI
```python
# generation_config["max_new_tokens"]
penguin_environment = task_to_env["penguin"]
results = ray.get(penguin_environment.run_rollouts.remote(penguin_rows))
```
is it possible to statically type the return, e.g. results: SomeStruct, so the IDE can help us understand how to work with this object and support go-to-definition? it's hard, just from reading the code, to know what to expect inside "results"
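One way to address this is a small dataclass for the rollout results. This is only a sketch: `RolloutResult`, its fields, and `summarize` are hypothetical names for illustration, and the actual Penguin rollout schema may differ:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class RolloutResult:
    """Hypothetical typed container for one rollout's output.

    Field names are illustrative; the real Penguin return schema may differ.
    """
    reward: float
    response_text: str
    metadata: dict[str, Any] = field(default_factory=dict)

def summarize(results: list[RolloutResult]) -> float:
    """Mean reward across rollouts. With a typed result, the IDE can
    autocomplete `.reward` and go-to-definition on RolloutResult."""
    return sum(r.reward for r in results) / len(results)

# Example usage with dummy data:
results = [
    RolloutResult(reward=1.0, response_text="a"),
    RolloutResult(reward=0.0, response_text="b"),
]
print(summarize(results))  # 0.5
```

Even if `run_rollouts` keeps returning dicts over the Ray boundary, converting to a dataclass at the call site gives readers one place to go-to-definition on.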
Closed in favor of #1450
What does this PR do?
Add a one line overview of what this PR aims to accomplish.
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use this
Before your PR is "Ready for review"
Pre checks:
Additional Information