cron integration test memory error #1236

@wyli

Describe the bug
The cron integration test failed with the following error message:

======================================================================
ERROR: test_training (tests.test_integration_classification_2d.IntegrationClassification2D)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/MONAI/MONAI/tests/test_integration_classification_2d.py", line 221, in test_training
    self.data_dir, self.train_x, self.train_y, self.val_x, self.val_y, device=self.device
  File "/__w/MONAI/MONAI/tests/test_integration_classification_2d.py", line 95, in run_training_test
    outputs = model(inputs)

...

  File "/__w/MONAI/MONAI/monai/networks/nets/densenet.py", line 56, in forward
    return torch.cat([x, new_features], 1)
RuntimeError: CUDA out of memory. Tried to allocate 66.00 MiB (GPU 0; 15.75 GiB total capacity; 2.93 GiB already allocated; 55.88 MiB free; 2.98 GiB reserved in total by PyTorch)

https://github.com/Project-MONAI/MONAI/runs/1404214574?check_suite_focus=true#step:5:4468

I couldn't reproduce this issue easily, but it is probably caused by multiple integration test jobs trying to use the same GPU device at the same time.
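
The numbers in the error message itself seem consistent with this guess. As a rough sketch (the helper name `memory_held_elsewhere_gib` is made up for illustration, and the regular expression assumes the exact message wording shown in the traceback above), the memory that is neither free nor reserved by this PyTorch process can be estimated like this:

```python
import re

# Error text copied from the traceback in this issue.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 66.00 MiB "
    "(GPU 0; 15.75 GiB total capacity; 2.93 GiB already allocated; "
    "55.88 MiB free; 2.98 GiB reserved in total by PyTorch)"
)

_UNIT_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def _to_gib(value, unit):
    """Convert a '<number> <MiB|GiB>' pair to GiB."""
    return float(value) * _UNIT_TO_GIB[unit]


def memory_held_elsewhere_gib(message):
    """Hypothetical helper: GiB that is neither free nor reserved by this process."""
    pattern = (
        r"([\d.]+) (MiB|GiB) total capacity; "
        r"[\d.]+ (?:MiB|GiB) already allocated; "
        r"([\d.]+) (MiB|GiB) free; "
        r"([\d.]+) (MiB|GiB) reserved"
    )
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("unrecognised OOM message format")
    total = _to_gib(match.group(1), match.group(2))
    free = _to_gib(match.group(3), match.group(4))
    reserved = _to_gib(match.group(5), match.group(6))
    # Whatever is left is held outside this process's PyTorch allocator.
    return total - free - reserved


elsewhere = memory_held_elsewhere_gib(OOM_MESSAGE)
print(f"{elsewhere:.2f} GiB")  # prints "12.72 GiB"
```

Roughly 12.7 GiB of the 15.75 GiB device is therefore held by something other than this process's PyTorch allocator. That figure also includes CUDA context overhead, so it is only an approximation, but it is much larger than the 2.98 GiB this test reserved, which fits the idea that other jobs were on the same GPU.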

cc @IsaacYangSLA @YuanTingHsieh @behxyz @yanchengnv

Additional context
cron_log.log
