[PROPOSAL]: Speed improvement of Intra-Op plan generation in ColossalAuto #5436

@stephankoe

Proposal

Generating an Intra-Op plan with ColossalAuto usually takes 1-2 minutes when running examples/tutorial/auto_parallel/auto_parallel_with_resnet.py. Profiling with cProfile reveals that a large portion of this time is spent in copy.deepcopy, mostly called from the method DimSpec.build_difference_2d_dict(). Since many DimSpec objects are created [1], that method is called hundreds of thousands of times. A closer look at its logic shows that the result does not depend on the DimSpec instance at all, and the returned dict is never mutated during its lifetime. Hence, it suffices to build this dict only once and share it among all instances of DimSpec. Because so many DimSpec instances are created during plan generation, this change can yield a speed-up of up to 50% [2].
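To illustrate the idea, here is a minimal sketch of building the dict once and sharing it across instances. It is not the ColossalAI code: the class `DimSpecSketch`, the spec names, the placeholder costs, and the `build_calls` counter are all made up for the example. The only points it is meant to show are the lazy class-level cache and the read-only view that keeps the shared dict from being mutated by accident.

```python
from itertools import product
from types import MappingProxyType
from typing import ClassVar, Mapping, Optional, Tuple


class DimSpecSketch:
    """Hypothetical stand-in for DimSpec (assumption, not the real class)."""

    # One table for the whole class, built lazily on first access.
    _difference_dict: ClassVar[Optional[Mapping[Tuple[str, str], int]]] = None
    # Instrumentation for this example only: counts how often the table is built.
    build_calls: ClassVar[int] = 0

    def __init__(self, shard_list):
        self.shard_list = list(shard_list)

    def __str__(self):
        # "R" for replicated, otherwise "S" followed by the mesh axes.
        if not self.shard_list:
            return "R"
        return "S" + "".join(str(axis) for axis in self.shard_list)

    @classmethod
    def _build_difference_dict(cls):
        # Placeholder logic: the result depends on no instance state,
        # which is what makes sharing it safe.
        cls.build_calls += 1
        names = ["R", "S0", "S1", "S01"]
        table = {(a, b): 0 if a == b else 1 for a, b in product(names, repeat=2)}
        # Read-only view, so no instance can mutate the shared table.
        return MappingProxyType(table)

    @property
    def difference_dict(self):
        cls = DimSpecSketch
        if cls._difference_dict is None:
            cls._difference_dict = cls._build_difference_dict()
        return cls._difference_dict

    def difference(self, other):
        return self.difference_dict[(str(self), str(other))]
```

With this layout, creating any number of `DimSpecSketch` objects triggers at most one build, and no `copy.deepcopy` is needed per instance. Whether the real dict can be exposed as a read-only mapping depends on how callers use it, so that part is an assumption as well.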

Self-service

  • I'd be willing to do some initial work on this proposal myself.

Footnotes

  1. many of which are just empty placeholders btw

  2. when running examples/tutorial/auto_parallel/auto_parallel_with_resnet.py

Labels: enhancement (New feature or request)