[rl] Enable batch-invariant using FSDP to apply mixed precision #2932

Draft
wwwjn wants to merge 1 commit into main from rl-batch-invariant-test
wwwjn (Contributor) commented Apr 10, 2026

Use FSDP to apply mixed precision (bfloat16) in the trainer so that its computation dtype matches vLLM's, with the goal of bit-wise identical results between the trainer and the generator.
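
To illustrate why the computation dtype has to match, here is a minimal pure-Python sketch (not code from this PR). It emulates bfloat16 rounding with a hypothetical helper `to_bf16` and compares two ways of summing the same values: accumulating in bfloat16 at every step versus accumulating in higher precision and casting once at the end. The helper names and the input values are assumptions chosen for illustration, and the helper ignores NaN/inf handling.

```python
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (round-to-nearest-even).

    Hypothetical helper for illustration only: it goes through float32 bits
    and keeps the top 16 bits. NaN and infinity are not handled specially.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add 0x7FFF plus the lowest kept bit so that ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def sum_in_bf16(values):
    """Accumulate with every intermediate result rounded to bfloat16."""
    acc = 0.0
    for v in values:
        acc = to_bf16(acc + to_bf16(v))
    return acc


def sum_high_precision_then_cast(values):
    """Accumulate in Python's float64, then round once to bfloat16."""
    acc = 0.0
    for v in values:
        acc += v
    return to_bf16(acc)


values = [1.0] + [0.001] * 100

low = sum_in_bf16(values)
high = sum_high_precision_then_cast(values)
print(low, high, low == high)
```

Near 1.0 the spacing between adjacent bfloat16 values is 2^-7, so each 0.001 increment is rounded away when the accumulator itself is bfloat16, while the higher-precision accumulator keeps them. Two components that use different computation dtypes can therefore produce different bits from the same inputs, which is the mismatch that aligning the trainer's dtype with the generator's is meant to avoid.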

meta-cla bot added the CLA Signed label Apr 10, 2026
Labels

ciflow/8gpu, CLA Signed