[hotfix] fix ColoTensor GPT2 unitest#1309
Conversation
for i, (input_ids, attn_mask) in enumerate(train_dataloader):
    logits = model(input_ids, attn_mask)
    colo_input = ColoTensor.from_torch_tensor(input_ids, ColoTensorSpec(pg))
Do we really need to convert the input to a ColoTensor?
What happens if we pass a plain torch tensor into the colo model?
I'm not sure whether this line is strictly necessary.
Since we have a problem in the transformation between [dp: 4, tp: 1] and [dp: 1, tp: 4], I think we'd better add this line to make sure that transformation is never triggered. Otherwise, to_replicate may use the [dp: 4, tp: 1] process group, and the tensor wouldn't be replicated when we want to get 4 copies of it.
If the dp degree is consistent among the model parameters, I believe we should pass the torch tensor into the model directly.
Did you test the correctness in the original way?
logits = model(input_ids, attn_mask)
> If the dp degree is consistent among the model parameters, I believe we should pass the torch tensor into the model directly.
I don't think so. Actually, the passed data has its own distribution information. In a TP environment, the passed data is replicated across the TP process group. In a DP environment, the passed data is just one shard of the batched data. It is not appropriate to treat it as no different from an ordinary tensor.
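To make the point above concrete, here is a minimal sketch in plain PyTorch (no distributed init, no ColossalAI API) of how the same global batch is held differently by the ranks under DP versus TP. The names `world_size`, `dp_shards`, and `tp_replicas` are illustrative, not part of the codebase under review.

```python
import torch

world_size = 4
# A hypothetical global batch of 8 samples, one feature each.
batch = torch.arange(8).reshape(8, 1)

# DP view: each of the 4 ranks holds a distinct shard of the batch,
# so the tensor on any single rank is only part of the data.
dp_shards = list(batch.chunk(world_size, dim=0))
assert all(shard.shape[0] == 2 for shard in dp_shards)

# TP view: every rank holds a full replica of the batch,
# so the tensor on each rank equals the global data.
tp_replicas = [batch.clone() for _ in range(world_size)]
assert all(torch.equal(replica, batch) for replica in tp_replicas)
```

This is why the distribution spec matters: the same `input_ids` tensor means "my shard" under DP but "the whole batch" under TP, and a spec-less torch tensor cannot tell the runtime which case it is in.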