Add `_loss_is_scaled_for_ga` to allow custom trainers to control gradient accumulation loss scaling #43651
abigailtech wants to merge 2 commits into huggingface:main
Conversation
cc @qgallouedec

If I understand correctly

Yes, that's right. I'd be open to a more ambitious refactor, do you have a specific direction in mind?
Add `_loss_is_scaled_for_ga` to allow custom trainers to control gradient accumulation loss scaling (force-pushed from 41c8952 to cd31c23)
Added a `_loss_is_scaled_for_ga` property that custom trainers can override to explicitly control gradient accumulation loss scaling. The default implementation preserves backward compatibility. Instead of manipulating `model_accepts_loss_kwargs`, custom trainers can now simply override this property to return `False`.
Fixes #43604
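
A minimal sketch of how a custom trainer might use this once merged. The property name `_loss_is_scaled_for_ga` and the `return False` override come from this PR's description; `MyTrainer` and everything else here are illustrative assumptions, not part of the PR:

```python
from transformers import Trainer

class MyTrainer(Trainer):
    @property
    def _loss_is_scaled_for_ga(self) -> bool:
        # Signal that the loss this trainer computes is NOT already scaled
        # for gradient accumulation, so the base Trainer should apply its
        # usual scaling. Previously the same effect required manipulating
        # self.model_accepts_loss_kwargs.
        return False
```

Presumably the default property mirrors the existing `model_accepts_loss_kwargs` behavior, which is how trainers that don't override it stay backward compatible.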