### Describe the feature

In #2582, I requested a meta information patch for `torch.matmul()`, and that issue was resolved by #2584. Here is a checklist of the operations that need to be patched before we can combine our SPMD solver and auto activation checkpoint solver on GPT:

- [x] `torch.nn.functional.softmax()` #2629
- [x] `torch.tanh()` #2630
- [x] `torch.nn.Dropout` #2631
- [x] `torch.nn.LayerNorm` #2632
- [x] `torch.nn.Embedding` #2735
- [x] `torch.tensor()` #2779
- [x] `torch.Tensor.size()` #2780
- [x] `torch.Tensor.to()` #2782
- [x] `torch.Tensor.type()` #2783
- [x] `torch.Tensor.contiguous()` #2784
- [x] `torch.Tensor.transpose()` #2785
- [x] `torch.Tensor.unsqueeze()` #2633
- [x] `torch.Tensor.permute()` #2786
- [x] `torch.Tensor.split()` #2787
- [x] `torch.Tensor.view()` #2788
- [x] `torch.finfo()` #2814
- [x] `torch.arange()` #2799
- [x] `torch.where()` #2800
- [x] `operator.le()` #2815
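For context, a minimal sketch of what "meta information" support enables, using PyTorch's built-in meta device (this is an illustration of the general idea, not the actual patches tracked above): with meta tensors, a solver can trace the shapes and dtypes flowing through ops like `torch.matmul()`, `softmax()`, and `tanh()` without allocating memory or running kernels.

```python
import torch

# Meta tensors carry only shape/dtype/device metadata; no storage is
# allocated and no real computation runs.
x = torch.empty(2, 4, device="meta")
w = torch.empty(4, 8, device="meta")

# Each listed op must propagate metadata correctly for the solver to work.
y = torch.matmul(x, w)                      # shape inference only
z = torch.nn.functional.softmax(y, dim=-1)  # shape preserved
t = torch.tanh(z)

print(y.shape)    # torch.Size([2, 8])
print(t.is_meta)  # True
```

An op without a working meta implementation breaks this propagation, which is why each operation used in the GPT graph had to be patched before the two solvers could be combined.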