Idiomatic multi-gpu usage #37

@s1ddok

Description

Describe the question.

I'm trying to use this library in a multi-GPU PyTorch setting, and so far I'm having no luck: I get frequent "core dumped" errors that don't tell me much.

My current approach is this (LLM-generated):

    import cupy as cp
    import torch as _torch
    from nvidia import nvimgcodec

    # Pick the device PyTorch is currently using, fall back to GPU 0
    dev_idx = _torch.cuda.current_device() if _torch.cuda.is_available() else 0
    with cp.cuda.Device(dev_idx):
        encoder = nvimgcodec.Encoder()
        output_path = frame_dir / f"{to_stem(name)}.tiff"
        # Round-trip through host memory, then back to a CuPy array on dev_idx
        gpu_arr = cp.asarray(result.cpu())
        encoder.write(str(output_path), gpu_arr)
        # Release GPU memory retained by CuPy's pool to avoid OOM across concurrent saves
        try:
            cp.get_default_memory_pool().free_all_blocks()
        except Exception:
            pass

From nvidia-smi I see that most of the memory is being allocated on GPU 0, and it feels like the nvimgcodec.Encoder is not explicitly aware of which GPU it is expected to run on. Can you provide an example of how to use it while keeping tensors on the GPU?
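For reference, here is a sketch of the device-pinned variant I would expect to work, assuming `nvimgcodec.Encoder` accepts a `device_id` constructor argument (as the nvImageCodec Python docs suggest) and that `result` already lives on the current device, so it can be handed to CuPy via DLPack instead of round-tripping through host memory. The `tiff_path` helper and `save_tensor_as_tiff` names are mine, not from any library:

```python
from pathlib import Path


def tiff_path(frame_dir, name):
    # Build the output path for a frame (stands in for the to_stem() logic above).
    return Path(frame_dir) / f"{name}.tiff"


def save_tensor_as_tiff(result, frame_dir, name):
    # Heavy imports kept inside the function so the pure-path helper above
    # stays importable on machines without CUDA; cupy, torch, and nvimgcodec
    # are required at call time.
    import cupy as cp
    import torch
    from nvidia import nvimgcodec

    dev_idx = torch.cuda.current_device() if torch.cuda.is_available() else 0
    with cp.cuda.Device(dev_idx):
        # Assumption: device_id pins the encoder's work to dev_idx.
        encoder = nvimgcodec.Encoder(device_id=dev_idx)
        # Zero-copy handoff via DLPack; assumes result is on dev_idx.
        gpu_arr = cp.from_dlpack(result.detach())
        encoder.write(str(tiff_path(frame_dir, name)), gpu_arr)
```

This avoids the `.cpu()` round-trip entirely, which is where I suspect the GPU 0 allocations are coming from, but I haven't been able to verify the `device_id` behavior.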

Check for duplicates

  • I have searched the open bugs/issues and have found no duplicates for this bug report

Labels: question (Further information is requested)
