Add CPUAdam optimizer for zero-offload in deepspeed engine by RezaYazdaniAminabadi · Pull Request #484 · deepspeedai/DeepSpeed

RezaYazdaniAminabadi · 2020-10-23T18:22:32Z

No description provided.

deepspeed/ops/adam/cpu_adam.py

* Merge chatgpt v2 to v3 - finalized (#484)
* [squash] staging chatgpt v1 (#463)
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
* [partial] formatting fixes
* quantizer fixes
* fix for bert tests
* formatting fixes
* re-enable _param_slice_mappings in z2
* Enable the QKV requires_grad when in training mode (#466)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* fixes for attention enable_training flag
* commit to trigger CI
* fix for distil-bert param
* fixes for training context errors
* remove reza's qkv-optimization (#469)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* Chatgpt - Fuse lora params at HybridEngine (#472)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* add option to enable non-pin mode (#473)
* Chatgpt - fuse lora non pinned case (#474)
* Fix fuse/unfuse lora for Z3 and non-pinned parameter
* unfuse_lora_weight for non-pinned case
* fix the multiple issue for lora parameters
* formatting
* fuse lora only when available
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* Chatgpt/release inference cache (#475)
* Fix fuse/unfuse lora for Z3 and non-pinned parameter
* unfuse_lora_weight for non-pinned case
* release/retake the inference cache after/before generate
* remove duplicated _fuse_lora function
* fix formatting
* fix hybrid-engine config issue
* update formatting
* Chatgpt - fuse qkv v2 (#478)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* ChatGPT: Refactor Hybrid Engine Config (#477)
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
* Inference Workspace Tweaks (#481)
* Safety checks around inference workspace allocation, extra flushing
* Formatting fixes
* Merge fix
* Chatgpt/inference tp (#480)
* Update the merged-QKV weights only if there is difference with the model parameter
* remove the hard-coded size
* always reset qkv params to updated ones after running step
* Add the infernce-tp group and tensor sharding to run inference in model-parallel mode
* optimize the gather/mp-sharding part
* Add hybrid_engine changes
* fix config issue
* Formatting fixes. Reset_qkv duplicate removal.
* fix bloom container.
* fix format.
---------
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
* fix formatting
* more clean-up
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
* fix a bug on lora-fusion (#487)
* Cholmes/v3 workspace bugfixes (#488)
* Miscellaneous workspace fixes, new config param
* Fix typo
---------
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>

Reza Yazdani added 2 commits October 22, 2020 17:22

add adamW to CPU-ADAM implementation

6faa75b

supporting cpu-adam optimizer for zero-offload on deepspeed side

283da00

RezaYazdaniAminabadi requested review from ShadenSmith, arashashari, awan-10, cli99, conglongli, eltonzheng, jeffra, minjiaz, niumanar, samyam and tjruwase as code owners October 23, 2020 18:22

add adamW and deepspeed-adamw mode

e2f204a
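
For readers unfamiliar with the distinction behind the "adamW" commits, here is a minimal scalar sketch of how classic Adam (weight decay folded into the gradient as L2 regularization) differs from AdamW (weight decay applied directly to the parameter). This is not the code from this PR or from `cpu_adam.py`; the function name `adam_step` and the `adamw_mode` flag are hypothetical names chosen for illustration, and the update rule follows the standard published Adam/AdamW formulas.

```python
import math


def adam_step(param, grad, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, adamw_mode=True):
    """One Adam/AdamW update on a single scalar parameter.

    Illustrative only: names and signature are assumptions, not the
    DeepSpeed implementation. `step` is the 1-based iteration count.
    Returns the updated (param, m, v).
    """
    if not adamw_mode:
        # Classic Adam: weight decay enters as an L2 term in the gradient,
        # so it is rescaled by the adaptive denominator below.
        grad = grad + weight_decay * param

    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad

    # Bias correction for the zero-initialized moments.
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)

    if adamw_mode:
        # AdamW: decoupled decay shrinks the parameter directly,
        # independent of the adaptive step size.
        param = param - lr * weight_decay * param

    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# With a zero gradient, the two modes shrink the parameter differently.
p_adamw, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, step=1, lr=0.1,
                          weight_decay=0.5, adamw_mode=True)
p_adam, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, step=1, lr=0.1,
                         weight_decay=0.5, adamw_mode=False)
```

With `weight_decay=0.0` the two modes produce identical updates; they diverge only once decay is nonzero, which is why an optimizer that supports both needs an explicit mode switch.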

jeffra approved these changes Oct 24, 2020


address comment about bug

65f7141

jeffra reviewed Oct 29, 2020


deepspeed/ops/adam/cpu_adam.py (review thread resolved)

jeffra and others added 6 commits October 29, 2020 23:55

update with agreed upon cpu-adam engine logic

d9df239

remove tabs

e5d50e1

add doc-string for the cpu-adam optimizer

cf16a64

fixing typo

26a28fd

updating config json documentation

7f85890

bump DSE to match cpu-adam updates

e0ca443

jeffra merged commit f5aa254 into master Oct 30, 2020

mrwyattii deleted the reyazda/cpu-offload-optimizer branch July 7, 2023 02:41

jeffra merged 10 commits into master from reyazda/cpu-offload-optimizer

2 participants