Skip to content

Fix potential memory leak in sparse_attn_decode_interface#174

Open
NJX-njx wants to merge 1 commit intodeepseek-ai:mainfrom
NJX-njx:fix/use-unique-ptr-for-decode-impl
Open

Fix potential memory leak in sparse_attn_decode_interface#174
NJX-njx wants to merge 1 commit intodeepseek-ai:mainfrom
NJX-njx:fix/use-unique-ptr-for-decode-impl

Conversation

@NJX-njx
Copy link
Copy Markdown

@NJX-njx NJX-njx commented Apr 14, 2026

Problem

In csrc/api/sparse_decode.h, the function sparse_attn_decode_interface allocates a DecodeImplBase object via raw new (line 362-381) and manually deletes it at the end of the function (line 492).

However, between allocation and deallocation, there are numerous operations that can throw exceptions:

  • Multiple TORCH_CHECK / KU_CHECK_* validation calls (lines 442-449)
  • Tensor allocation via torch::empty (lines 420-421, 456-457)
  • Kernel launches and CUDA operations

If any of these throw, the delete impl at line 492 is never reached, causing a memory leak. In a long-running inference service, repeated invalid requests would accumulate leaked DecodeImplBase objects.

Fix

Replace the raw new/delete pattern with std::unique_ptr<DecodeImplBase>, which guarantees cleanup via RAII regardless of the exit path (normal return or exception).

Changes

  • Add #include <memory>
  • DecodeImplBase* impl -> std::unique_ptr<DecodeImplBase> impl
  • new Decode_*_Impl() -> std::make_unique<Decode_*_Impl>()
  • Remove the manual delete impl;

After

The impl object is now automatically destroyed when it goes out of scope, whether the function returns normally or exits via an exception. No behavioral change to the normal (non-exception) code path.

Test plan

  • This is a mechanical refactor (raw ptr -> unique_ptr). No change in logic or kernel behavior.
  • The unique_ptr uses the same virtual destructor chain as the previous delete call.
  • Existing tests (test_flash_mla_sparse_decoding.py) should pass unchanged.

Replace raw `new`/`delete` of `DecodeImplBase` with `std::unique_ptr`
to ensure proper cleanup when an exception (e.g. from TORCH_CHECK) is
thrown between allocation and the manual `delete` at the end of the
function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 14, 2026 07:20
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors sparse_attn_decode_interface to use RAII for DecodeImplBase lifetime management, preventing leaks if exceptions occur before the manual cleanup path.

Changes:

  • Replaced raw new/delete of DecodeImplBase with std::unique_ptr + std::make_unique.
  • Added <memory> include and removed the manual delete at function end.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@NJX-njx
Copy link
Copy Markdown
Author

NJX-njx commented Apr 14, 2026

Local Test Results

Since the target GPU architectures (SM90a/SM100f) are not available locally, I wrote a standalone C++ test that reproduces the exact class hierarchy (DecodeImplBase + derived classes) and validates the exception-safety behavior of both the old (raw new/delete) and new (std::unique_ptr) code paths.

Test Code

// Minimal reproduction of the DecodeImplBase hierarchy
static int g_destructor_call_count = 0;

class DecodeImplBase {
public:
    virtual ~DecodeImplBase() { g_destructor_call_count++; }
    virtual void run() = 0;
};

// OLD code: leaks on exception
void old_interface_raw_ptr(bool should_throw) {
    DecodeImplBase* impl = new Decode_Sm90_Impl();
    FAKE_TORCH_CHECK(!should_throw, "...");  // throws
    impl->run();
    delete impl;  // never reached on throw
}

// NEW code: safe on exception
void new_interface_unique_ptr(bool should_throw) {
    std::unique_ptr<DecodeImplBase> impl = std::make_unique<Decode_Sm90_Impl>();
    FAKE_TORCH_CHECK(!should_throw, "...");  // throws
    impl->run();
    // RAII cleanup
}

Results (MSVC 19.44, C++20, Windows x64)

[PASS] Test 1a: raw ptr normal path - destructor called
[PASS] Test 1b: unique_ptr normal path - destructor called
[PASS] Test 2a: raw ptr exception path - destructor NOT called (confirms the leak)
[PASS] Test 2b: unique_ptr exception path - destructor called (leak fixed!)
[PASS] Test 3: polymorphic unique_ptr - both destructors called

=== Results: 5 passed, 0 failed ===
  • Test 2a confirms the leak exists in the original code: when an exception is thrown, the destructor is never called.
  • Test 2b confirms the fix works: unique_ptr ensures the destructor is called even on exception.
  • Test 3 confirms polymorphic unique_ptr<Base> correctly invokes derived destructors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants