[Sharderformer] Support zbv in Sharderformer Policy #6150
ver217 merged 59 commits into hpcaitech:main from duanjunwen:feature/sharderformer_support_zbv on Jan 2, 2025
Commits
b31a052 [feat] Sharderformer support zbv (duanjunwen)
5f89e7f [feat] support chatglm2, command, deepseek for zbv (duanjunwen)
41e1972 [feat] support zbv in shardformer policy: (duanjunwen)
37a5a66 Merge branch 'main' into feature/sharderformer_support_zbv (duanjunwen)
efffe6b [feat] support GPT2FusedLinearConv1D (duanjunwen)
2b94e00 Merge branch 'main' into feature/sharderformer_support_zbv (duanjunwen)
a84fc41 [feat] support GPT2FusedLinear (without tp) (duanjunwen)
014cc27 [fix] debug FusedConvLinear (duanjunwen)
778d4df [shardfromer] support gpt2 policy for zbv, support GPT2FusedLinearConv (duanjunwen)
8cb74e7 Merge branch 'main' into feature/sharderformer_support_zbv (duanjunwen)
d168b73 [Shardformer] support FusedLinear1D base for zbv (duanjunwen)
01a9cb3 [shardformer] support zbv in FusedLinear1D base, Col, Row (duanjunwen)
fc77b24 [shardformer] support zbv in blip2 and sam policy (duanjunwen)
70b0ae1 [shardformer] fix bug incorrect number of gradients; add fusedLinear (duanjunwen)
37b670e [fix] fix incorrect number of gradients ; (duanjunwen)
94bb9ec [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
dee1878 [Shardformer] add en doc for zbv; (duanjunwen)
83e670e [fix] fix typo in Model compatibility table (duanjunwen)
2a55566 [fix] fix API Reference typo (duanjunwen)
5430eb0 [Shardformer] add zh-Han doc for zbv (duanjunwen)
25da23d [fix] fix Linear name; update en & zh doc (duanjunwen)
fd5bd33 [fix] fix shardformer doc import err (duanjunwen)
c749a7c [fix] fix shardconfig import in doc (duanjunwen)
eba4e33 [fix] fix shardformer doc (duanjunwen)
3c5ce9e [fix] fix shardconfig doc (duanjunwen)
6bbe666 [fix] fix config (duanjunwen)
3946366 [fix] remove shardconfig (duanjunwen)
b99c733 [fix] fix doc (duanjunwen)
99a7829 [feat] add zbv doc string (duanjunwen)
f67ce86 [fix] rm doc (duanjunwen)
bbdcca1 [fix] fix doc (duanjunwen)
9665f66 [fix] empty zbv doc (duanjunwen)
568e2c5 [fix] ifx torch version (duanjunwen)
f8dc150 [fix] fix torch version (duanjunwen)
1481b8d [fix] fix torch versions (duanjunwen)
cb52e28 [fix] fix torch versions (duanjunwen)
30e65e7 [fix] fix pyramid versions (duanjunwen)
541664a [fix] fix pyramid, zope version (duanjunwen)
ed76d69 Merge branch 'main' into feature/sharderformer_support_zbv (duanjunwen)
e592884 [fix] try fix workflow (duanjunwen)
3b0669a [fix] try import ShardConfig in yml (duanjunwen)
1cd60a0 [fix] fix workflow (duanjunwen)
573d5ce [fix] fix workflow (duanjunwen)
938bf6d [fix] fix workflow (duanjunwen)
90d1d53 [fix] fix workflow (duanjunwen)
9b7940f Merge branch 'main' into feature/sharderformer_support_zbv (duanjunwen)
aab6275 [fix] fix ci (duanjunwen)
63b7db5 [fix] fix zbv doc (duanjunwen)
7fb23a5 [fix] fix param for qkv linear, gpt2fused linear; fix requirments; (duanjunwen)
f0a8d78 [fix] fix policy use fused_linear (duanjunwen)
ff316c9 [fix] fix weight grad none, err caused by weight ptr change (duanjunwen)
f52c36e [fix] fix comm in WeightGradStore (duanjunwen)
feca06e [fix] fix WeightGradStore pop param (duanjunwen)
d74071a [fix] remove useless param in doc; fix gpt2 qkv test; (duanjunwen)
c0b6fbc [shardformer] simplify execute_w_pass_grad_accum; (duanjunwen)
130b50c [fix] rm useless comments (duanjunwen)
c4df1cc [shardformer] simplify execute_w_pass_grad_accum & execute_w_pass (duanjunwen)
52a3b88 [shardformer] Run meaningful doc test (duanjunwen)
ee6bba9 [shadformer] fix doc test cmd; (duanjunwen)