Enable Sequence Parallelism by polisettyvarma · Pull Request #429 · deepspeedai/Megatron-DeepSpeed

polisettyvarma · 2024-07-23T13:10:37Z

No description provided.

polisettyvarma · 2024-08-09T07:55:34Z

@samadejacobs @tjruwase can you please review this so we can proceed further?

polisettyvarma · 2024-08-19T06:30:01Z

@samadejacobs @tjruwase please review this.

polisettyvarma · 2024-08-25T05:46:43Z

@tjruwase @loadams can someone review this?

polisettyvarma · 2024-08-29T12:55:08Z

@tjruwase Thanks for the review, please check my replies to your comments.

polisettyvarma · 2024-09-03T14:11:52Z

@tjruwase I missed your reply, sorry for the late response. Please check my comment.

polisettyvarma · 2024-09-03T15:53:10Z

@tjruwase please review now

polisettyvarma · 2024-09-04T04:45:37Z

@tjruwase it's approved but not merged yet. Is there any reason?

polisettyvarma · 2024-09-04T14:05:16Z

@tjruwase Thanks for merging. I have a query regarding HPU-specific changes, such as creating custom bash run scripts for HPU under the examples_deepspeed/hpu folder. Is that okay?

tjruwase · 2024-09-04T14:12:05Z

@polisettyvarma, yes that seems reasonable.

ys950902 · 2024-09-24T11:38:31Z

Hi @polisettyvarma, this PR causes an initialization error for RMSNorm in the torch implementation, as shown below:
[rank0]: self.input_layernorm = RMSNorm(config.hidden_size, config.layernorm_epsilon,
[rank0]: TypeError: RMSNorm.__init__() got an unexpected keyword argument 'sequence_parallel'

I have raised PR #448 to fix this. Is that okay with you?
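The class of failure reported above can be reproduced in isolation. The following is a minimal sketch, not the actual Megatron-DeepSpeed code: both classes and the `build_norm` helper are hypothetical stand-ins, and accepting the keyword in the constructor is only one plausible way to resolve the mismatch (it is not claimed to be what #448 does).

```python
class RMSNormWithoutFlag:
    """Hypothetical norm layer whose constructor does not know the new flag."""

    def __init__(self, dim, eps=1e-6):
        self.dim = dim
        self.eps = eps


class RMSNormWithFlag:
    """Hypothetical norm layer whose constructor accepts sequence_parallel."""

    def __init__(self, dim, eps=1e-6, sequence_parallel=False):
        self.dim = dim
        self.eps = eps
        # Stored so that later code could treat this layer's parameters specially.
        self.sequence_parallel = sequence_parallel


def build_norm(norm_cls, hidden_size, eps, sequence_parallel):
    # Mirrors the shape of the failing call site: the caller always passes
    # the keyword, so every norm implementation must accept it.
    return norm_cls(hidden_size, eps, sequence_parallel=sequence_parallel)


def try_build(norm_cls):
    """Return "ok" on success, or the TypeError message on failure."""
    try:
        build_norm(norm_cls, 1024, 1e-5, True)
        return "ok"
    except TypeError as exc:
        return str(exc)


old_result = try_build(RMSNormWithoutFlag)  # constructor rejects the keyword
new_result = try_build(RMSNormWithFlag)     # constructor accepts the keyword
```

The sketch shows why a change at the call site has to be matched in every constructor that the call site can reach, including implementations selected only on some devices or code paths.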

…nabled (#479)

* pass batch_dim_idx to deepspeed sequence parallel distributed attention for supporting batch size larger than 1
* add fused_rms_norm support on XPU device (#431)
* [LLaMa] Adding support converting checkpoint from mds to hf (#432)
  * add support converting checkpoint from hf to mds
  * Fix PP issue
  * update
* add device check when import ipex (#436)
* fix TFLOPs calculation (#371)
  * fix TFLOPs calculation when GQA used, we observe right TFLOPs after this fix. when GQA is not used, huge difference in TFLOPs is solved with selective recompute. some other minor difference will also be observed as logits macs also added.
  * add copyrights
* fix nan issue when running megatron-deepspeed (#434)
* enable empty cache on XPU device (#438)
* [wandb] disable wandb more gracefully (#422)
* [Bug] Fix crash when logging optimizer state to tb (#417)
* add FPDT support; add Ulysses rotary position embedding support
* add FPDT support; add Ulysses rotary position embedding support
* add FPDT support; add Ulysses rotary position embedding support
* add FPDT support; add Ulysses rotary position embedding support
* remove unnecessary files
* set the warmup length to be FPDT chunk size if enabled
* Enable Sequence Parallelism (#429)
* grad_wei can't be NoneType when running with DeepSpeed, for zero3 will divided the gradient (#428)
* fix init issue for rms_norm in squence_parallel (#448)
* enable profiler for specific ranks (#451)
* fix init issue for silently ignoring the deepspeed config (#452)
* fix moe tflops (#445)
* [tool]GQA convert support (#454)
  * [tools]GQA convert support
  * fix readme
* Fix import error in `deepspeed_to_megatron.py` (#455)
  Previously, `deepspeed_to_megatron.py` would raise an import error due to the relative import. This commit fixes this issue by changing from the relative import to the absolute import like in `deepspeed_to_transformers.py`.
* Update references to new GitHub org (deepspeedai) (#462)
* add sequence_parallel in layernorm init to enable 3D parallelism can run successfully with DeepSpeed (#468)
* fix bug when FPDT is disabled but with original Ulysses

---------

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: yisheng <yi.sheng@intel.com>
Signed-off-by: jinghan yao yjhmitweb@gmail.com
Co-authored-by: Jinghan Yao <yjhmitweb@ascend-rw02.ten.osc.edu>
Co-authored-by: YiSheng5 <syhm@mail.ustc.edu.cn>
Co-authored-by: billishyahao <yahao.he@gmail.com>
Co-authored-by: Polisetty V R K Jyothendra Varma <jvarma@habana.ai>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Jinghan Yao <yjhmitweb@ascend-rw01.ten.osc.edu>
Co-authored-by: ranzhejiang <zhejiang.ran@intel.com>
Co-authored-by: Xinyu Lian <lian7@illinois.edu>
Co-authored-by: inkcherry <mingzhi.liu@intel.com>
Co-authored-by: hotsuyuki <hotsuyuki.kawanishi@gmail.com>
Co-authored-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

polisettyvarma added 13 commits July 23, 2024 16:10

Update arguments.py · 7554fb0

Update layers.py · 46dabf9

Update fused_layer_norm.py · dbc64a3

Update gpt_model.py · 5de04d3

Update language_model.py · 54b13d8

Update rmsnorm.py · 9618f21

Update transformer.py · de5adcf

Update utils.py · 7bcd361

Update layers.py · 6a167e5

Update fused_layer_norm.py · 520ac3a

Update gpt_model.py · 04c1909

Update language_model.py · c9aec04

Update utils.py · 2947fa2

polisettyvarma requested review from GuanhuaWang, arashb, awan-10, duli2012, eltonzheng and tjruwase as code owners July 23, 2024 13:10

iamdeepakgit approved these changes Jul 24, 2024

tjruwase requested review from samadejacobs and removed request for GuanhuaWang, arashb, awan-10, duli2012, eltonzheng and tjruwase July 26, 2024 14:01

samadejacobs reviewed Jul 29, 2024

Comment thread megatron/model/language_model.py

tjruwase reviewed Aug 26, 2024

Comment thread megatron/model/fused_layer_norm.py

tjruwase reviewed Aug 26, 2024

Comment thread megatron/model/rmsnorm.py

tjruwase reviewed Aug 26, 2024

Comment thread megatron/model/utils.py Outdated

Update utils.py · b436b90

polisettyvarma added 2 commits September 3, 2024 21:19

Update fused_layer_norm.py · faa0d74

Update rmsnorm.py · 003fb7b

tjruwase approved these changes Sep 3, 2024

tjruwase merged commit 0d6e379 into deepspeedai:main Sep 4, 2024

polisettyvarma deleted the sequence_parallelism branch September 4, 2024 10:56

ys950902 mentioned this pull request Sep 29, 2024

[Bug]Fix init issue for layer_norm in sequence_parallel for non-CUDA device. #450

Closed

loadams pushed a commit that referenced this pull request Feb 7, 2025

Enable Sequence Parallelism (#429)

c124896

Signed-off-by: Logan Adams <loadams@microsoft.com>

YJHMITWEB pushed a commit to YJHMITWEB/Megatron-DeepSpeed that referenced this pull request Aug 9, 2025

Enable Sequence Parallelism (deepspeedai#429)

443a872

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

YJHMITWEB pushed a commit to YJHMITWEB/Megatron-DeepSpeed that referenced this pull request Aug 9, 2025

Enable Sequence Parallelism (deepspeedai#429)

c465a91

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

tjruwase merged 16 commits into deepspeedai:main from polisettyvarma:sequence_parallelism
