Skip to content

feat: Integrate Penguin env#1336

Closed
bxyu-nvidia wants to merge 33 commits intomainfrom
bxyu/integrate-penguin
Closed

feat: Integrate Penguin env#1336
bxyu-nvidia wants to merge 33 commits intomainfrom
bxyu/integrate-penguin

Conversation

@bxyu-nvidia
Copy link
Copy Markdown
Contributor

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

bxyu-nvidia and others added 30 commits October 9, 2025 09:46
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
…n-env

Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Brian Yu <bxyu@nvidia.com>
@bxyu-nvidia bxyu-nvidia changed the base branch from main to bxyu/add-penguin-env October 10, 2025 19:04
@bxyu-nvidia bxyu-nvidia marked this pull request as ready for review October 10, 2025 19:10
@bxyu-nvidia bxyu-nvidia requested review from a team as code owners October 10, 2025 19:10
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think eventually these should make their way into examples/configs/recipes/llm which enforces there must be a nightly test for each yaml config

i think it would be good to have one base penguin config, like examples/configs/grpo_penguin.yaml and then all the other ones are based on that. since the environment is slightly different b/c it's penguin and not our openmathinstruct ones, you should probably update the precommit here:

https://github.com/NVIDIA-NeMo/RL/blob/main/.pre-commit-config.yaml#L71-L73

to also enforce minimization of these recipes

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdyt about examples/penguin/run_grpo_penguin.py -> examples/run_grpo_penguin.py? i kind of think it should be higher and more easily discoverable.

i guess also related, wdyt about moving your configs like examples/penguin/grpo_workbench_qwen3_4binstruct.yaml to examples/configs/recipes/llm/grpo_penguin_workbench_qwen3_4binstruct.yaml?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this should go to tests/functional/run_penguin_single_node_sanity_tests.sh and then added to https://github.com/NVIDIA-NeMo/RL/blob/main/tests/functional/L1_Functional_Tests_GPU.sh to get auto picked up by our CI. You probably need to guard this test though so that it gets skipped if penguin doesn't exist

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, does this use 8 gpus? we only have 2 A100 gpus in the CI

Comment thread nemo_rl/algorithms/grpo.py
Comment thread nemo_rl/experience/rollouts.py
Comment thread nemo_rl/experience/rollouts.py
Comment thread nemo_rl/experience/rollouts.py
# generation_config["max_new_tokens"]

penguin_environment = task_to_env["penguin"]
results = ray.get(penguin_environment.run_rollouts.remote(penguin_rows))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to statically type the return results: SomeStruct so the IDE can help us understand how to deal with this object and go-to-definition? It's hard from reading the code to understand what I should expect to be inside "results"

Comment thread nemo_rl/experience/rollouts.py
Comment thread nemo_rl/experience/rollouts.py
Base automatically changed from bxyu/add-penguin-env to main October 13, 2025 19:50
@bxyu-nvidia bxyu-nvidia requested review from a team as code owners October 13, 2025 19:50
@bxyu-nvidia
Copy link
Copy Markdown
Contributor Author

Closed in favor of #1450

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants