
[zero] Suggests a minor change to confusing variable names in the ZeRO optimizer#183

Merged
hyunwoongko merged 2 commits into EleutherAI:main from yhna940:chore/refact-zero-optim
May 25, 2023

Conversation

@yhna940 (Contributor) commented May 8, 2023

Title

  • [zero] Suggests a minor change to confusing variable names in the ZeRO optimizer

Description

It seems that the variable names related to the mixed-precision parameter groups do not comprehensively cover their characteristics, so I suggest a few changes. These changes are very trivial, but hopefully they will alleviate some of the confusion for beginners like me.

Currently, the entire parameter group is named `fp16_param_groups`, and the shards managed by the GPU at the current rank are named `fp32_flat_param_groups_of_current_rank`. These names accurately describe the setup when the master weight is a half-precision tensor, or when the dtype specified in the `__init__` method is fp16. In other cases, however, the parameters' characteristics no longer match the variable names.

I would like to propose the alternative terms `working_param` and `master_param`, which are more closely tied to the concepts of mixed-precision training. Using `working_param` and `master_param` creates a clear distinction between the two kinds of parameters and helps avoid confusion.

To summarize my suggestions:

  • `fp16` -> `working`
  • `fp32` -> `master`
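To illustrate the working/master distinction the renaming is meant to capture, here is a minimal sketch (hypothetical, not OSLO's actual ZeRO code; the class name and structure are illustrative, and numpy stands in for tensors): the model computes with low-precision *working* params, while full-precision *master* copies own the authoritative values and accumulate updates that would underflow in fp16.

```python
import numpy as np

class MixedPrecisionSGD:
    """Toy mixed-precision SGD: working params in low precision,
    master params in fp32 (illustrative sketch only)."""

    def __init__(self, working_params, lr=0.1):
        # working params: what the model's forward/backward sees (e.g. fp16)
        self.working_params = working_params
        # master params: fp32 copies that hold the authoritative values
        self.master_params = [p.astype(np.float32) for p in working_params]
        self.lr = lr

    def step(self, grads):
        for w, m, g in zip(self.working_params, self.master_params, grads):
            # apply the update in fp32, so tiny steps are not rounded away
            m -= self.lr * g.astype(np.float32)
            # copy the result back into the low-precision working copy
            w[...] = m.astype(w.dtype)

# A step of size 1e-5 underflows in fp16 (spacing near 1.0 is ~1e-3),
# but the fp32 master copy accumulates it correctly across steps.
w = np.array([1.0], dtype=np.float16)
opt = MixedPrecisionSGD([w], lr=0.1)
g = np.array([1e-4], dtype=np.float16)
for _ in range(100):
    opt.step([g])
# master drifts to ~0.999 while a naive fp16 update would stay pinned at 1.0
```

Whatever the dtypes actually are, "working" and "master" still describe the two roles correctly, which is exactly why the names generalize better than `fp16`/`fp32`.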

Linked Issues

  • N/A

Reference

  • hpcaitech/ColossalAI#3173
@yhna940 yhna940 requested a review from hyunwoongko as a code owner May 8, 2023 14:01
@yhna940 yhna940 self-assigned this May 8, 2023
@yhna940 yhna940 added the ZeRO ZeroRedundancyOptimizer label May 8, 2023
@hyunwoongko hyunwoongko merged commit 4bf13ac into EleutherAI:main May 25, 2023
dyanos pushed a commit that referenced this pull request Jun 8, 2023
…O optimizer (#183)

## Reference

- hpcaitech/ColossalAI#3173