[autockpt] provide option for activation checkpoint search in SPMD solver #2258

Merged
Cypher30 merged 10 commits into main from debug/ckpt-autoparallel
Jan 4, 2023
Cypher30 and others added 9 commits January 2, 2023 16:25
…ivation checkpoint (#2248)

* [autoparallel] hook node meta on graph nodes for checkpoint solver

* [autoparallel] polish code

* [autoparallel] restore some node handlers

* colossalai/auto_parallel/passes/meta_info_prop.py

* [autoparallel] remove some unused import

* [autoparallel] hook bwd_mem_out
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline

* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop

* [autoparallel] specify comm nodes' memory cost in construct chain
* [autockpt] make it work.

* [autockpt] linearize / merge shape-consistency nodes.
* [autockpt] make it work.

* [autockpt] linearize / merge shape-consistency nodes.

* [autockpt] considering parameter and optimizer weights.
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline

* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop

* [autoparallel] specify comm nodes' memory cost in construct chain

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation
* [autockpt] make it work.

* [autockpt] linearize / merge shape-consistency nodes.

* [autockpt] considering parameter and optimizer weights.

* [hotfix] pass a parameter.
@super-dainiu super-dainiu marked this pull request as ready for review January 3, 2023 10:06
@super-dainiu super-dainiu requested review from Cypher30, FrankLeeeee and YuliangLiu0306 and removed request for Cypher30 and FrankLeeeee January 3, 2023 10:08
…_OP metainfo (#2293)

* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline

* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop

* [autoparallel] specify comm nodes' memory cost in construct chain

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] bypass metainfo when available and modify BCAST_FUNC_OP
@Cypher30 Cypher30 merged commit d45695d into main Jan 4, 2023
@super-dainiu super-dainiu deleted the debug/ckpt-autoparallel branch January 4, 2023 06:09

2 participants