Describe the bug
When training ViT with torch DistributedDataParallel, torch raises an error during backward, reporting:
Parameters which did not receive grad for rank 0: vit.patch_embedding.cls_token
which means that the cls_token did not participate in the backward pass.
I checked the implementation of ViT and PatchEmbeddingBlock and found the unused cls_token in monai/networks/blocks/patchembedding.py: PatchEmbeddingBlock.
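The failure mode can be reproduced outside MONAI with a toy module that, like the PatchEmbeddingBlock described above, registers a cls_token but never uses it in forward. The module name and shapes here are hypothetical simplifications, not MONAI's actual implementation:

```python
import torch
import torch.nn as nn

class ToyPatchEmbedding(nn.Module):
    """Simplified stand-in: cls_token is registered but never used in forward."""

    def __init__(self, dim=8):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))  # never touched below
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        # cls_token does not appear in the computation graph
        return self.proj(x)

m = ToyPatchEmbedding()
m(torch.randn(2, 8)).sum().backward()

# The unused parameter receives no gradient; under DDP this is exactly
# what triggers the "did not receive grad for rank 0" error.
print(m.cls_token.grad)                 # None
print(m.proj.weight.grad is not None)   # True
```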

To Reproduce
Steps to reproduce the behavior:
- set the environment variable in the shell: TORCH_DISTRIBUTED_DEBUG=INFO
- train ViT with torch DistributedDataParallel
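Until the unused cls_token is removed, a common workaround is to wrap the model with find_unused_parameters=True, which tells DDP to tolerate parameters that receive no gradient. A minimal single-process sketch (a plain Linear stands in for ViT; rank/world_size would normally come from the launcher):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process gloo group so the sketch runs standalone.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29501")
dist.init_process_group("gloo", rank=0, world_size=1)

model = torch.nn.Linear(8, 8)
# find_unused_parameters=True lets DDP skip parameters (like the
# unused cls_token) that never appear in the backward graph.
ddp = DDP(model, find_unused_parameters=True)

loss = ddp(torch.randn(2, 8)).sum()
loss.backward()

dist.destroy_process_group()
```

Note that find_unused_parameters=True adds per-iteration overhead, so removing the dead parameter (the enhancement requested here) is the cleaner fix.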