assert bool((rel_pair_idx == pair_idx[vr_indices]).all())

Dear authors, 

Congratulations on the very nice work! I ran your code in SGDET mode and hit an assertion error. Specifically, I ran this command:

```
CUDA_VISIBLE_DEVICES=6 \
python tools/relation_train_net.py \
 --config-file "configs/e2e_relation_X_101_32_8_FPN_1x.yaml" \
 MODEL.ROI_RELATION_HEAD.USE_GT_BOX False \
 MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False \
 MODEL.ROI_RELATION_HEAD.PREDICTOR RUNetPredictor \
 SOLVER.IMS_PER_BATCH 1 \
 TEST.IMS_PER_BATCH 1 \
 DTYPE "float16" \
 SOLVER.PRE_VAL True \
 SOLVER.BASE_LR 0.0025 \
 MODEL.ROI_RELATION_HEAD.L21_LOSS 0.7 \
 MODEL.PRETRAINED_DETECTOR_CKPT ~/checkpoints/pretrained_faster_rcnn/model_final.pth \
 OUTPUT_DIR ~/checkpoints/runet-sgdet
``` 

and I got the exception:

```
maskrcnn_benchmark INFO: -------------------------------
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 16384.0
Traceback (most recent call last):
  File "tools/relation_train_net.py", line 379, in <module>
    main()
  File "tools/relation_train_net.py", line 372, in main
    model = train(cfg, args.local_rank, args.distributed, logger)
  File "tools/relation_train_net.py", line 147, in train
    loss_dict = model(images, targets)
  File "/anaconda3/envs/ru_net/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/anaconda3/envs/ru_net/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 447, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/anaconda3/envs/ru_net/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/ru_net/RU-Net/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 52, in forward
    x, result, detector_losses = self.roi_heads(features, proposals, targets, logger)
  File "/anaconda3/envs/ru_net/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/ru_net/RU-Net/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 69, in forward
    x, detections, loss_relation = self.relation(features, detections, targets, logger)
  File "/anaconda3/envs/ru_net/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/ru_net/RU-Net/maskrcnn_benchmark/modeling/roi_heads/relation_head/relation_head.py", line 94, in forward
    refine_logits, relation_logits, add_losses = self.predictor(proposals, rel_pair_idxs, full_pair_idxs, rel_labels, rel_binarys, roi_features, union_features, logger)
  File "/anaconda3/envs/ru_net/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/ru_net/RU-Net/maskrcnn_benchmark/modeling/roi_heads/relation_head/roi_relation_predictors.py", line 819, in forward
    assert bool((rel_pair_idx == pair_idx[vr_indices]).all())
```

Note that I get the same assertion error with multiple GPUs, i.e., when running this command:

```
CUDA_VISIBLE_DEVICES=6,7 \
python -m torch.distributed.launch \
 --master_port 15026 \
 --nproc_per_node=2 \
 tools/relation_train_net.py \
 --config-file "configs/e2e_relation_X_101_32_8_FPN_1x.yaml" \
 MODEL.ROI_RELATION_HEAD.USE_GT_BOX False \
 MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False \
 MODEL.ROI_RELATION_HEAD.PREDICTOR RUNetPredictor \
 SOLVER.IMS_PER_BATCH 2 \
 TEST.IMS_PER_BATCH 2 \
 DTYPE "float16" \
 SOLVER.PRE_VAL True \
 SOLVER.BASE_LR 0.0025 \
 MODEL.ROI_RELATION_HEAD.L21_LOSS 0.7 \
 MODEL.PRETRAINED_DETECTOR_CKPT ~/checkpoints/pretrained_faster_rcnn/model_final.pth \
 OUTPUT_DIR ~/checkpoints/runet-sgdet-2gpus
``` 
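In case it helps narrow things down, here is a minimal sketch of a check that could be placed just before the failing assertion to show which rows disagree. `describe_pair_mismatch` and `to_list` are hypothetical helpers, not part of RU-Net. The sketch assumes `rel_pair_idx` and `pair_idx` are 2-D collections of (subject, object) index pairs and that `vr_indices` is a 1-D collection of row indices into `pair_idx`, which is only inferred from the assertion itself.

```python
def to_list(x):
    """Convert anything with .tolist() (e.g. a tensor or array) or a sequence to a list."""
    return x.tolist() if hasattr(x, "tolist") else list(x)


def describe_pair_mismatch(rel_pair_idx, pair_idx, vr_indices, max_rows=5):
    """Report why rel_pair_idx differs from pair_idx[vr_indices].

    Hypothetical debugging helper; the shapes are assumed, not taken from RU-Net.
    """
    rel = [tuple(p) for p in to_list(rel_pair_idx)]
    full = [tuple(p) for p in to_list(pair_idx)]
    idx = [int(i) for i in to_list(vr_indices)]

    # The two sides of the comparison must have the same number of rows.
    if len(rel) != len(idx):
        return f"length mismatch: {len(rel)} rel pairs vs {len(idx)} vr_indices"

    # Every index must point at an existing row of pair_idx.
    out_of_range = [i for i in idx if not 0 <= i < len(full)]
    if out_of_range:
        return (f"{len(out_of_range)} vr_indices out of range for "
                f"{len(full)} pairs, e.g. {out_of_range[:max_rows]}")

    # Collect the rows where the selected pair differs from the expected one.
    bad = [(k, rel[k], full[i]) for k, i in enumerate(idx) if rel[k] != full[i]]
    if not bad:
        return "ok"
    lines = [f"{len(bad)} of {len(rel)} rows differ"]
    lines += [f"  row {k}: rel_pair_idx={r} pair_idx[vr_indices]={f}"
              for k, r, f in bad[:max_rows]]
    return "\n".join(lines)
```

Calling `print(describe_pair_mismatch(rel_pair_idx, pair_idx, vr_indices))` right before the assertion in `roi_relation_predictors.py` should show whether the failure is a length mismatch, an out-of-range index, or a handful of differing pairs.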

Any suggestions for fixing the issue? 

Many thanks!
