
HausdorffDTLoss leads to GPU memory leak. #7480

@dimka11

Description


Describe the bug
Using this loss with the Trainer from the transformers library (PyTorch) and with YOLOv8 (PyTorch) causes training to crash shortly after it starts due to CUDA out of memory.
With 16 GB of GPU memory, a batch size of 1, and 128×128 images, training crashes after roughly 100 iterations.

Environment

Kaggle Notebook, Python 3.10.12, latest MONAI version from pip.

I also reproduced this bug under Windows 11 with the following example code:

%%time

import torch
import numpy as np
from monai.losses.hausdorff_loss import HausdorffDTLoss
from monai.networks.utils import one_hot

for i in range(30):
    B, C, H, W = 16, 5, 512, 512
    # random predictions and a random one-hot target
    input = torch.rand(B, C, H, W)
    target_idx = torch.randint(low=0, high=C - 1, size=(B, H, W)).long()
    target = one_hot(target_idx[:, None, ...], num_classes=C)
    # instantiating the loss inside the loop, as in the original repro
    loss_fn = HausdorffDTLoss(include_background=True, reduction='none', softmax=True)
    loss = loss_fn(input.to('cuda'), target.to('cuda'))
    assert np.broadcast_shapes(loss.shape, input.shape) == input.shape

The loop consumed about 5 GB of GPU memory; on the GPU memory graph the consumption looks like a flat line with several step increases.
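To make the per-iteration growth visible, here is a minimal diagnostic sketch (my addition, not part of the original report) that prints allocated and peak CUDA memory after each loss call; torch.cuda.memory_allocated() and torch.cuda.max_memory_allocated() are standard PyTorch APIs, and the tensor shapes mirror the repro above. If the allocated number climbs steadily across iterations, that is consistent with the leak described here.

# diagnostic sketch (assumes a CUDA device and MONAI installed)
import torch
from monai.losses.hausdorff_loss import HausdorffDTLoss
from monai.networks.utils import one_hot

B, C, H, W = 16, 5, 512, 512
loss_fn = HausdorffDTLoss(include_background=True, reduction='none', softmax=True)

for i in range(10):
    pred = torch.rand(B, C, H, W, device='cuda')
    target_idx = torch.randint(low=0, high=C - 1, size=(B, H, W), device='cuda').long()
    target = one_hot(target_idx[:, None, ...], num_classes=C)
    loss = loss_fn(pred, target)
    # report current and peak allocated CUDA memory in GiB
    print(f"iter {i}: allocated={torch.cuda.memory_allocated() / 1024**3:.2f} GiB, "
          f"peak={torch.cuda.max_memory_allocated() / 1024**3:.2f} GiB")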
