[FEATURE]: Meta information patch for `torch.matmul` #2582

### Describe the feature

Meta information, namely the compute cost and memory cost of each operation during training, is the key input our auto-parallel system needs to generate a good strategy. We have already done substantial groundwork, including the `MetaTensor` mechanism (#1515), which gives us access to the lower-level operations executed during training. We then built a profiler (#1587) on top of this mechanism. However, we found that the profiler can be time-consuming when it inspects every operation in a model. We therefore decided to use `MetaTensor` to retrieve the ATen-level graph and turn the meta information calculations into static formulas, so that we avoid calling `torch.autograd.backward` every time we want to profile a `PyTorch` operation. With this approach we have so far managed to combine the SPMD solver and the auto activation checkpoint solver on `ResNet` (#2258).

To combine the SPMD solver and the auto activation checkpoint solver on transformer-based networks as well, we need to add a meta information patch for the `torch.matmul` operation.
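To make the idea of a static formula concrete, here is a minimal sketch of how the cost of a `torch.matmul` call could be estimated from input shapes alone, without running autograd. It is not the actual patch: the names `MatmulCost`, `broadcast_batch`, and `matmul_static_cost` are hypothetical, and the cost model rests on assumptions (a multiply-add counts as 2 FLOPs, the backward pass is modeled as two matmuls of the same size, both inputs are saved for backward, and any reduction needed to undo broadcasting is ignored). The shape handling follows the documented `torch.matmul` rules for 1-D promotion and batch broadcasting.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence, Tuple


@dataclass
class MatmulCost:
    """Hypothetical container for the estimated cost of one matmul call."""
    out_shape: Tuple[int, ...]
    fwd_flops: int
    bwd_flops: int
    out_bytes: int
    saved_bytes: int


def broadcast_batch(a: Sequence[int], b: Sequence[int]) -> Tuple[int, ...]:
    """Broadcast two batch shapes with the usual right-aligned rule."""
    out = []
    for i in range(1, max(len(a), len(b)) + 1):
        x = a[-i] if i <= len(a) else 1
        y = b[-i] if i <= len(b) else 1
        if x != y and x != 1 and y != 1:
            raise ValueError(f"batch dims {tuple(a)} and {tuple(b)} do not broadcast")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def matmul_static_cost(shape_a: Sequence[int], shape_b: Sequence[int],
                       itemsize: int = 4) -> MatmulCost:
    """Estimate matmul cost from shapes only (assumed cost model, see above)."""
    if len(shape_a) == 0 or len(shape_b) == 0:
        raise ValueError("matmul operands must be at least 1-D")
    # A 1-D operand is treated as a row (left) or column (right) vector.
    a = tuple(shape_a) if len(shape_a) > 1 else (1, shape_a[0])
    b = tuple(shape_b) if len(shape_b) > 1 else (shape_b[0], 1)
    m, k = a[-2:]
    k_b, n = b[-2:]
    if k != k_b:
        raise ValueError(f"inner dims differ: {k} vs {k_b}")
    batch = broadcast_batch(a[:-2], b[:-2])
    # The dimension added for a 1-D operand is dropped from the output again.
    core = ((m,) if len(shape_a) > 1 else ()) + ((n,) if len(shape_b) > 1 else ())
    out_shape = batch + core
    fwd = 2 * prod(batch) * m * k * n   # one multiply and one add per term
    bwd = 2 * fwd                       # grad wrt a and grad wrt b, one matmul each
    return MatmulCost(
        out_shape=out_shape,
        fwd_flops=fwd,
        bwd_flops=bwd,
        out_bytes=prod(out_shape) * itemsize,
        saved_bytes=(prod(shape_a) + prod(shape_b)) * itemsize,
    )
```

For example, `matmul_static_cost((8, 128, 64), (64, 32))` models a batched activation multiplied by a shared 2-D weight. A real patch would presumably also need to account for the reduction over broadcast batch dimensions in the backward pass, which this sketch leaves out.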
