[shardformer] write an shardformer example with bert finetuning#4126

Merged
FrankLeeeee merged 2 commits into hpcaitech:feature/shardformer from flybird11111:feature/shardformer
Jun 30, 2023

@flybird11111
Contributor

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

#4110

📝 What does this PR do?

Summarize your work here.
If you have any plots/diagrams/screenshots/tables, please attach them here.

Write a Shardformer example with BERT fine-tuning.

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests.
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

@flybird11111 added the "example" label (example-related issue or pull request) on Jun 30, 2023
@github-actions
Contributor

The code coverage for the changed files is 84%.

Name                                                                                       Stmts   Miss  Cover
--------------------------------------------------------------------------------------------------------------
colossalai/auto_parallel/tensor_shard/node_handler/node_handler.py                           164     82    50%
colossalai/auto_parallel/tensor_shard/node_handler/strategy/matmul_strategy_generator.py     388     76    80%
colossalai/auto_parallel/tensor_shard/utils/misc.py                                           45      9    80%
colossalai/checkpoint_io/utils.py                                                            243     43    82%
colossalai/device/device_mesh.py                                                             178     14    92%
colossalai/lazy/lazy_init.py                                                                 299     40    87%
colossalai/nn/layer/base_layer.py                                                             36     15    58%
colossalai/nn/layer/parallel_1d/_operation.py                                                 53     26    51%
colossalai/shardformer/__init__.py                                                             1      0   100%
colossalai/shardformer/_utils.py                                                              42     15    64%
colossalai/shardformer/layer/__init__.py                                                       7      0   100%
colossalai/shardformer/layer/_operation.py                                                   152     51    66%
colossalai/shardformer/layer/dropout.py                                                       35      0   100%
colossalai/shardformer/layer/embedding.py                                                    118      3    97%
colossalai/shardformer/layer/linear.py                                                       156     23    85%
colossalai/shardformer/layer/loss.py                                                          49      8    84%
colossalai/shardformer/layer/normalization.py                                                 50     23    54%
colossalai/shardformer/layer/parallel_module.py                                               76     21    72%
colossalai/shardformer/layer/qkv_fused_linear.py                                             196     29    85%
colossalai/shardformer/layer/utils.py                                                         81     10    88%
colossalai/shardformer/model/__init__.py                                                       0      0   100%
colossalai/shardformer/model/modeling_bert.py                                                 20     20     0%
colossalai/shardformer/policies/__init__.py                                                    0      0   100%
colossalai/shardformer/policies/autopolicy.py                                                 27      2    93%
colossalai/shardformer/policies/basepolicy.py                                                 46      6    87%
colossalai/shardformer/policies/bert.py                                                      117      2    98%
colossalai/shardformer/policies/bloom.py                                                      74      7    91%
colossalai/shardformer/policies/gpt2.py                                                       66      3    95%
colossalai/shardformer/policies/llama.py                                                      37      3    92%
colossalai/shardformer/policies/opt.py                                                        40      2    95%
colossalai/shardformer/policies/t5.py                                                         30      3    90%
colossalai/shardformer/policies/vit.py                                                        23     23     0%
colossalai/shardformer/shard/__init__.py                                                       4      0   100%
colossalai/shardformer/shard/shard_config.py                                                  19      2    89%
colossalai/shardformer/shard/sharder.py                                                       80      9    89%
colossalai/shardformer/shard/shardformer.py                                                   13      0   100%
colossalai/tensor/comm_spec.py                                                               253     93    63%
colossalai/tensor/d_tensor/__init__.py                                                         4      0   100%
colossalai/tensor/d_tensor/api.py                                                            136     18    87%
colossalai/tensor/d_tensor/comm_spec.py                                                      151     35    77%
colossalai/tensor/d_tensor/layout.py                                                          38      1    97%
colossalai/tensor/d_tensor/layout_converter.py                                               195     12    94%
colossalai/tensor/d_tensor/utils.py                                                           38      7    82%
colossalai/tensor/shape_consistency.py                                                       294    120    59%
colossalai/tensor/sharding_spec.py                                                           139     13    91%
colossalai/testing/__init__.py                                                                 4      0   100%
colossalai/testing/comparison.py                                                              54      9    83%
tests/kit/model_zoo/registry.py                                                               17      0   100%
tests/kit/model_zoo/transformers/__init__.py                                                   7      0   100%
tests/kit/model_zoo/transformers/bert.py                                                      42      0   100%
tests/kit/model_zoo/transformers/bloom.py                                                     34      0   100%
tests/kit/model_zoo/transformers/gpt.py                                                       28      0   100%
tests/kit/model_zoo/transformers/llama.py                                                     26      2    92%
tests/kit/model_zoo/transformers/opt.py                                                       32      0   100%
tests/kit/model_zoo/transformers/t5.py                                                        24      0   100%
tests/test_autochunk/test_autochunk_diffuser/test_autochunk_unet.py                           36     10    72%
tests/test_booster/test_mixed_precision/test_fp16_torch.py                                    30      1    97%
tests/test_booster/test_plugin/test_gemini_plugin.py                                          74     10    86%
tests/test_booster/test_plugin/test_low_level_zero_plugin.py                                  60      6    90%
tests/test_booster/test_plugin/test_torch_ddp_plugin.py                                       78      0   100%
tests/test_booster/test_plugin/test_torch_fsdp_plugin.py                                      43      0   100%
tests/test_checkpoint_io/test_gemini_checkpoint_io.py                                         80      0   100%
tests/test_device/test_device_mesh.py                                                         58     36    38%
tests/test_device/test_init_logical_pg.py                                                     27      1    96%
tests/test_fx/test_tracer/test_hf_model/hf_tracer_utils.py                                    21      2    90%
tests/test_fx/test_tracer/test_hf_model/test_hf_albert.py                                     17      1    94%
tests/test_fx/test_tracer/test_hf_model/test_hf_bert.py                                       15      1    93%
tests/test_fx/test_tracer/test_hf_model/test_hf_diffuser.py                                   50     28    44%
tests/test_fx/test_tracer/test_hf_model/test_hf_gpt.py                                        17      1    94%
tests/test_fx/test_tracer/test_hf_model/test_hf_opt.py                                        15      1    93%
tests/test_fx/test_tracer/test_hf_model/test_hf_t5.py                                         17      1    94%
tests/test_fx/test_tracer/test_timm_model/test_timm_model.py                                  36     24    33%
tests/test_fx/test_tracer/test_torchaudio_model/test_torchaudio_model.py                      14      5    64%
tests/test_fx/test_tracer/test_torchrec_model/test_deepfm_model.py                            39      3    92%
tests/test_fx/test_tracer/test_torchrec_model/test_dlrm_model.py                              41      4    90%
tests/test_fx/test_tracer/test_torchvision_model/test_torchvision_model.py                    31      1    97%
tests/test_lazy/lazy_init_utils.py                                                            72     14    81%
tests/test_lazy/test_distribute.py                                                            73      3    96%
tests/test_lazy/test_models.py                                                                13      1    92%
tests/test_shardformer/__init__.py                                                             0      0   100%
tests/test_shardformer/test_layer/test_dist_crossentropy.py                                   27      1    96%
tests/test_shardformer/test_layer/test_dropout.py                                             42      1    98%
tests/test_shardformer/test_layer/test_embedding.py                                           30      1    97%
tests/test_shardformer/test_layer/test_layernorm.py                                           27      1    96%
tests/test_shardformer/test_layer/test_linear_1d.py                                           85      1    99%
tests/test_shardformer/test_layer/test_qkv_fused_linear_1d.py                                 73      1    99%
tests/test_shardformer/test_layer/test_vocab_parallel_embedding_1d.py                         32      1    97%
tests/test_shardformer/test_model/__init__.py                                                  0      0   100%
tests/test_shardformer/test_model/_utils.py                                                   21      0   100%
tests/test_shardformer/test_model/test_shard_bert.py                                          37      1    97%
tests/test_shardformer/test_model/test_shard_bloom.py                                         37      1    97%
tests/test_shardformer/test_model/test_shard_gpt2.py                                          37      1    97%
tests/test_shardformer/test_model/test_shard_llama.py                                         41      1    98%
tests/test_shardformer/test_model/test_shard_opt.py                                           40      0   100%
tests/test_shardformer/test_model/test_shard_t5.py                                            35      1    97%
tests/test_shardformer/test_model/test_shard_vit.py                                           34     14    59%
tests/test_shardformer/test_with_torch_ddp.py                                                 46      2    96%
tests/test_tensor/test_dtensor/test_comm_spec.py                                              78      1    99%
tests/test_tensor/test_dtensor/test_dtensor.py                                                65      5    92%
tests/test_tensor/test_dtensor/test_layout_converter.py                                       91      1    99%
tests/test_tensor/test_shape_consistency.py                                                   50      2    96%
tests/test_tensor/test_sharded_linear.py                                                     130      1    99%
tests/test_tensor/test_sharding_spec.py                                                       13      1    92%
--------------------------------------------------------------------------------------------------------------
TOTAL                                                                                       6509   1073    84%

@github-actions
Contributor

The code coverage for the changed files is 61%.

Name                                                                                       Stmts   Miss  Cover
--------------------------------------------------------------------------------------------------------------
colossalai/auto_parallel/tensor_shard/node_handler/node_handler.py                           164    164     0%
colossalai/auto_parallel/tensor_shard/node_handler/strategy/matmul_strategy_generator.py     388    388     0%
colossalai/auto_parallel/tensor_shard/utils/misc.py                                           45     45     0%
colossalai/checkpoint_io/utils.py                                                            243    170    30%
colossalai/device/device_mesh.py                                                             178     41    77%
colossalai/lazy/lazy_init.py                                                                 299     40    87%
colossalai/nn/layer/base_layer.py                                                             36     22    39%
colossalai/nn/layer/parallel_1d/_operation.py                                                 53     35    34%
colossalai/shardformer/__init__.py                                                             1      0   100%
colossalai/shardformer/_utils.py                                                              42     15    64%
colossalai/shardformer/layer/__init__.py                                                       7      0   100%
colossalai/shardformer/layer/_operation.py                                                   152     63    59%
colossalai/shardformer/layer/dropout.py                                                       35      0   100%
colossalai/shardformer/layer/embedding.py                                                    118      7    94%
colossalai/shardformer/layer/linear.py                                                       156     25    84%
colossalai/shardformer/layer/loss.py                                                          49     38    22%
colossalai/shardformer/layer/normalization.py                                                 50     13    74%
colossalai/shardformer/layer/parallel_module.py                                               76     62    18%
colossalai/shardformer/layer/qkv_fused_linear.py                                             196     50    74%
colossalai/shardformer/layer/utils.py                                                         81     10    88%
colossalai/shardformer/model/__init__.py                                                       0      0   100%
colossalai/shardformer/model/modeling_bert.py                                                 20     20     0%
colossalai/shardformer/policies/__init__.py                                                    0      0   100%
colossalai/shardformer/policies/autopolicy.py                                                 27      2    93%
colossalai/shardformer/policies/basepolicy.py                                                 46      6    87%
colossalai/shardformer/policies/bert.py                                                      117      2    98%
colossalai/shardformer/policies/bloom.py                                                      85      8    91%
colossalai/shardformer/policies/gpt2.py                                                       66      3    95%
colossalai/shardformer/policies/llama.py                                                      37      3    92%
colossalai/shardformer/policies/opt.py                                                        48      2    96%
colossalai/shardformer/policies/t5.py                                                         64      2    97%
colossalai/shardformer/policies/vit.py                                                        23     23     0%
colossalai/shardformer/shard/__init__.py                                                       4      0   100%
colossalai/shardformer/shard/shard_config.py                                                  19      2    89%
colossalai/shardformer/shard/sharder.py                                                       73      3    96%
colossalai/shardformer/shard/shardformer.py                                                   13      0   100%
colossalai/tensor/comm_spec.py                                                               253    179    29%
colossalai/tensor/d_tensor/__init__.py                                                         4      0   100%
colossalai/tensor/d_tensor/api.py                                                            136     27    80%
colossalai/tensor/d_tensor/comm_spec.py                                                      151     62    59%
colossalai/tensor/d_tensor/layout.py                                                          38      8    79%
colossalai/tensor/d_tensor/layout_converter.py                                               195     27    86%
colossalai/tensor/d_tensor/utils.py                                                           38     32    16%
colossalai/tensor/shape_consistency.py                                                       294    294     0%
colossalai/tensor/sharding_spec.py                                                           139    105    24%
colossalai/testing/__init__.py                                                                 4      0   100%
colossalai/testing/comparison.py                                                              54     20    63%
tests/kit/model_zoo/registry.py                                                               18      0   100%
tests/kit/model_zoo/transformers/__init__.py                                                   7      0   100%
tests/kit/model_zoo/transformers/bert.py                                                      42      0   100%
tests/kit/model_zoo/transformers/bloom.py                                                     34      0   100%
tests/kit/model_zoo/transformers/gpt.py                                                       28      0   100%
tests/kit/model_zoo/transformers/llama.py                                                     26      2    92%
tests/kit/model_zoo/transformers/opt.py                                                       32      0   100%
tests/kit/model_zoo/transformers/t5.py                                                        24      0   100%
tests/test_booster/test_mixed_precision/test_fp16_torch.py                                    30      1    97%
tests/test_checkpoint_io/test_gemini_checkpoint_io.py                                         80      0   100%
tests/test_fx/test_tracer/test_hf_model/hf_tracer_utils.py                                    21      2    90%
tests/test_fx/test_tracer/test_hf_model/test_hf_albert.py                                     17      1    94%
tests/test_fx/test_tracer/test_hf_model/test_hf_bert.py                                       15      1    93%
tests/test_fx/test_tracer/test_hf_model/test_hf_diffuser.py                                   50     28    44%
tests/test_fx/test_tracer/test_hf_model/test_hf_gpt.py                                        17      1    94%
tests/test_fx/test_tracer/test_hf_model/test_hf_opt.py                                        15      1    93%
tests/test_fx/test_tracer/test_hf_model/test_hf_t5.py                                         17      1    94%
tests/test_fx/test_tracer/test_torchrec_model/test_deepfm_model.py                            39      3    92%
tests/test_fx/test_tracer/test_torchrec_model/test_dlrm_model.py                              41      4    90%
tests/test_fx/test_tracer/test_torchvision_model/test_torchvision_model.py                    31      1    97%
tests/test_lazy/lazy_init_utils.py                                                            72     14    81%
tests/test_lazy/test_distribute.py                                                            73      3    96%
tests/test_lazy/test_models.py                                                                13      1    92%
tests/test_shardformer/__init__.py                                                             0      0   100%
tests/test_shardformer/test_model/__init__.py                                                  0      0   100%
tests/test_shardformer/test_model/_utils.py                                                   21      0   100%
tests/test_shardformer/test_model/test_shard_bert.py                                          45      1    98%
tests/test_shardformer/test_model/test_shard_bloom.py                                         45      1    98%
tests/test_shardformer/test_model/test_shard_gpt2.py                                          45      1    98%
tests/test_shardformer/test_model/test_shard_llama.py                                         47      1    98%
tests/test_shardformer/test_model/test_shard_opt.py                                           48      1    98%
tests/test_shardformer/test_model/test_shard_t5.py                                            50      1    98%
tests/test_shardformer/test_model/test_shard_vit.py                                           35     20    43%
tests/test_shardformer/test_with_torch_ddp.py                                                 46      2    96%
--------------------------------------------------------------------------------------------------------------
TOTAL                                                                                       5441   2110    61%
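
For readers checking the two reports, the Cover column appears to be the share of statements that were executed, rounded to a whole percent. The sketch below is an assumption about how the figures are derived, not the reporting tool's actual code; `cover_percent` is a hypothetical helper name, and the inputs are the TOTAL rows and one file row copied from the first report.

```python
def cover_percent(stmts: int, miss: int) -> int:
    """Return covered statements as a whole-number percentage.

    Assumption: a row with no statements is shown as 100%,
    matching the empty __init__.py rows in the reports.
    """
    if stmts == 0:
        return 100
    return round(100 * (stmts - miss) / stmts)


# (Stmts, Miss) pairs copied from the reports.
rows = {
    "first report TOTAL": (6509, 1073),
    "second report TOTAL": (5441, 2110),
    "node_handler.py (first report)": (164, 82),
}

for name, (stmts, miss) in rows.items():
    print(f"{name}: {cover_percent(stmts, miss)}%")
```

Under this reading, the drop from 84% to 61% between the two runs comes from files such as `colossalai/tensor/shape_consistency.py` going from 120 to 294 missed statements, while the set of test files included in the second report is smaller.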

FrankLeeeee pushed a commit that referenced this pull request Jul 4, 2023
* [shardformer] add benchmark of shardformer

* [shardformer] add benchmark of shardformer
ver217 pushed a commit to ver217/ColossalAI that referenced this pull request Jul 13, 2023
…itech#4126)

* [shardformer] add benchmark of shardformer

* [shardformer] add benchmark of shardformer
Labels

example (example-related issue or pull request), shardformer

Development

Successfully merging this pull request may close these issues.

[shardformer] write an shardformer example with bert finetuning

2 participants