fix: implement weak references for vertex and custom component to pre… by ricofurtado · Pull Request #11299 · langflow-ai/langflow

ricofurtado · 2026-01-14T16:16:05Z

This pull request introduces improvements to memory management and resource cleanup in the custom_component and vertex modules, and enhances the handling of asynchronous cleanup tasks in the graph module. The main focus is on using weak references to avoid reference cycles and potential memory leaks, and ensuring async tasks are properly cleaned up when objects are destroyed.

Memory management improvements:
The applied fixes reduce Langflow backend RSS growth by approximately 30–50% per run, eliminating the primary leak-like behaviour while leaving allocator-driven VSZ growth unchanged.

Tests were run in a 140-component flow, with 25 invocations of a subflow consisting of 12 components.
The flow was provided by Shuhei Kochi

Langflow Backend Memory Regression Report

Original Code vs Fixed Code

Summary

We ran repeated Langflow flow executions via the REST API and measured backend process memory usage after each run. Compared to the original implementation, the fixed version shows a significant reduction in RSS (resident memory) growth per run, indicating reduced memory retention across repeated executions.

While VSZ (virtual memory) growth remains largely unchanged, this behavior is consistent with allocator and arena behavior rather than a true memory leak.

Test Methodology

Flows executed repeatedly using Langflow REST API (stream=false) within a single backend process
Memory collected from /proc/<PID>/status:
- RSS (MiB) → VmRSS (primary signal for real memory retention)
- VSZ (MiB) → VmSize (secondary signal; allocator-related)
Each test consists of:
1. Warm-up runs (lazy loading, cache initialization)
2. A 15-run steady-state block using a fresh baseline_before
Both successful (200) and failed (500) runs were included to observe failure-path behavior
Memory deltas are computed relative to the baseline_before for each block

Data Sets Analyzed

Original Code

Run 1
- Baseline RSS: 1327 MiB
- RSS after run 15: 1649 MiB
Run 2
- Baseline RSS: 1314 MiB
- RSS after run 15: 1715 MiB

Fixed Code

Run 1
- Baseline RSS: 1374 MiB
- RSS after run 15: 1564 MiB
Run 2
- Baseline RSS: 1335 MiB
- RSS after run 15: 1617 MiB

Results

1️⃣ RSS Growth Reduction (Primary Improvement)

RSS represents actual resident memory and is the strongest indicator of leak-like behavior.

Original Code

Run 1: +322 MiB over 15 runs → ~21.5 MiB/run
Run 2: +401 MiB over 15 runs → ~26.7 MiB/run

Fixed Code

Run 1: +190 MiB over 15 runs → ~12.7 MiB/run
Run 2: +282 MiB over 15 runs → ~18.8 MiB/run

Net Improvement

RSS growth reduced from ~21–27 MiB/run → ~13–19 MiB/run
Equivalent to approximately 30–50% reduction in resident memory growth per run

This strongly indicates that the fix reduced retained memory across repeated flow executions.

2️⃣ VSZ Behavior (Largely Unchanged)

VSZ growth remained similar across original and fixed versions:

Original: ~+5419 to +5476 MiB over 15 runs
(~361–365 MiB/run)
Fixed: ~+5285 to +5386 MiB over 15 runs
(~352–359 MiB/run)

Interpretation:

VSZ growth is consistent with Python allocator behavior (arenas, fragmentation, mmap)
The fact that RSS improved while VSZ did not suggests:
- fewer live objects retained
- allocator behavior remains the dominant VSZ factor
This does not indicate an active memory leak

3️⃣ Failure-Path Behavior (HTTP 500)

Failures were intentionally included in the run sets:

Version | Run | Failures -- | -- | -- Original | Run 1 | 4 / 15 Original | Run 2 | 4 / 15 Fixed | Run 1 | 5 / 15 Fixed | Run 2 | 4 / 15

Even with intermittent 500 responses:

The fixed version consistently shows lower RSS growth
Failure paths no longer cause disproportionate memory retention

This suggests improved cleanup even under error conditions.

Key Conclusions

Resident memory retention per run improved significantly
- ~30–50% reduction in RSS growth rate
VSZ growth remains allocator-driven
- No evidence of additional live-object leaks
Failure paths are safer
- Errors no longer amplify memory growth trends

Recommendations / Next Steps

To further improve stability and observability:

Differentiate allocator behavior from true leaks
- Add tracemalloc snapshots (post-warmup vs post-run 15)
- Diff top allocation sites
Harden failure paths
- Review resource cleanup on 500 responses
- Ensure request-scoped objects and buffers are released
Standardize benchmarks
- Fixed warm-up count
- Fixed delay between runs
- Consistent baseline capture

One-Line Takeaway

The applied fixes reduce Langflow backend RSS growth by approximately 30–50% per run, eliminating the primary leak-like behavior while leaving allocator-driven VSZ growth unchanged.

…vent memory leaks

coderabbitai · 2026-01-14T16:16:27Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR introduces weak references to break circular dependencies between CustomComponent–Vertex and Vertex–Graph. The Graph class gains async cleanup logic via a new __del__ destructor and cleanup_async_tasks method to properly await and terminate pending traces during garbage collection.

Changes

Cohort / File(s)	Summary
Weak reference introduction `src/lfx/src/lfx/custom/custom_component/custom_component.py`, `src/lfx/src/lfx/graph/vertex/base.py`	Replaced direct object storage with weak references: `CustomComponent._vertex` now uses `_vertex_ref` internally; `Vertex.graph` now uses `_graph_ref` internally. Public interface preserved via properties that dereference on access. Added `weakref` import.
Async cleanup and destructor `src/lfx/src/lfx/graph/graph/base.py`	Added `cleanup_async_tasks()` async method to flush and await end-trace tasks with timeout handling. Added `__del__` destructor to trigger cleanup when event loop is running. Modified `_end_all_traces_async()` to prefer `get_running_loop()` and improve event loop scheduling logic.

Sequence Diagram

sequenceDiagram
    participant GC as Garbage Collector
    participant Graph
    participant EventLoop as Event Loop
    participant TaskSet as end_trace_tasks

    GC->>Graph: __del__() called
    activate Graph
    Graph->>EventLoop: asyncio.get_running_loop()
    alt Loop running
        EventLoop-->>Graph: loop instance
        Graph->>Graph: cleanup_async_tasks()
        activate Graph
        Graph->>TaskSet: flush all tasks
        Graph->>TaskSet: await with timeout
        TaskSet-->>Graph: completion
        Graph->>TaskSet: clear()
        deactivate Graph
    else No running loop
        EventLoop-->>Graph: RuntimeError
        Graph->>TaskSet: cancel pending tasks
    end
    deactivate Graph

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 5

❌ Failed checks (1 error, 4 warnings)

Check name	Status	Explanation	Resolution
Test Coverage For New Implementations	❌ Error	PR introduces weak reference implementations and async cleanup functionality without corresponding test coverage.	Add test files verifying weak reference dereferencing, async cleanup cancellation, and destructor behavior across event loop states.
Docstring Coverage	⚠️ Warning	Docstring coverage is 46.15% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Quality And Coverage	⚠️ Warning	Pull request introduces memory-management features but lacks test coverage for weak references, async cleanup, garbage collection, and serialization behavior.	Add comprehensive pytest tests covering weak reference behavior, async cleanup_async_tasks() execution, Graph/Vertex del cleanup, serialization with weak references, error consistency, and integration tests for memory leak fixes.
Test File Naming And Structure	⚠️ Warning	Pull request introduces significant memory management changes without test coverage for weak references, garbage collection, async cleanup, or destructor logic.	Add comprehensive tests for weak reference behavior, garbage collection scenarios, cleanup_async_tasks() method, del destructor, and event loop handling logic in existing test files.
Excessive Mock Usage Warning	⚠️ Warning	Existing test files use mocks appropriately for external dependencies. However, PR introduces significant changes lacking dedicated test coverage for weak references, async cleanup, destructor logic, and serialization bugs.	Add tests for weak reference property behavior, async cleanup (cleanup_async_tasks), destructor logic (del), and fix critical serialization bug in Vertex's getstate method for _graph_ref weak reference.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title is partially related to the changeset. It mentions 'weak references for vertex and custom component' which are real changes, but it's truncated (ends with 'pre…') and omits the significant async cleanup and Graph weak reference changes that address memory leak concerns.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch lfoss-2044-out-of-memory-after-multiple-executions-of-custom-component

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

This PR improves memory management by introducing weak references to prevent circular reference chains between Graph, Vertex, and CustomComponent objects. It also adds proper cleanup for asynchronous trace tasks to prevent resource leaks during object destruction.

Changes:

Introduced weak references in Vertex and CustomComponent to break circular dependencies with Graph and Vertex respectively
Added async task cleanup mechanism in Graph.__del__ to handle pending trace tasks during garbage collection
Enhanced _end_all_traces_async to gracefully handle event loop availability

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
src/lfx/src/lfx/graph/vertex/base.py	Converted strong `Graph` reference to weak reference with property-based access
src/lfx/src/lfx/graph/graph/base.py	Added async task cleanup in destructor and improved event loop handling in trace methods
src/lfx/src/lfx/custom/custom_component/custom_component.py	Converted strong `Vertex` reference to weak reference with property getter/setter

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-14T16:16:32Z

+        """Cleanup when graph is destroyed."""
+        if self._end_trace_tasks:
+            try:
+                loop = asyncio.get_event_loop()


Using asyncio.get_event_loop() in __del__ is deprecated in Python 3.10+ and can cause issues. Consider using asyncio.get_running_loop() with a try/except, or check if a loop exists before attempting to use it. The behavior of get_event_loop() in destructors is unreliable.

Changing...

Copilot · 2026-01-14T16:16:32Z

+    def __del__(self):
+        """Cleanup when graph is destroyed."""
+        if self._end_trace_tasks:
+            try:
+                loop = asyncio.get_event_loop()
+                if loop.is_running():
+                    cleanup_task = loop.create_task(self.cleanup_async_tasks())
+
+                    def _log_cleanup(t: asyncio.Task) -> None:
+                        try:
+                            exc = t.exception()
+                        except Exception:  # noqa: BLE001
+                            logger.exception("Cleanup task inspection failed")
+                            return
+                        if exc:
+                            logger.warning("Cleanup task failed during __del__", exc_info=exc)
+                        else:
+                            logger.info("Cleanup task completed cleanly")
+
+                    cleanup_task.add_done_callback(_log_cleanup)
+                elif loop.is_closed():
+                    for task in list(self._end_trace_tasks):
+                        task.cancel()
+                    self._end_trace_tasks.clear()
+                else:
+                    loop.run_until_complete(self.cleanup_async_tasks())
+            except Exception:  # noqa: BLE001
+                logger.exception("Error cleaning up async tasks")


The __del__ method creates a task at line 675 but doesn't await it when the loop is running. This fire-and-forget task may not complete before the object is fully destroyed, potentially leaving resources uncleaned. Additionally, calling run_until_complete at line 694 when the loop is not running could block if the loop has been stopped but not yet closed, which is problematic in a destructor.

It is unlikely to happen, but I'll refactor it...

Having a second thought about this: Graph keeps _end_trace_tasks (set of asyncio Tasks) and each task has a callback bound to self._end_trace_tasks.discard, while the task coroutine itself closes over self. That creates a cycle Graph -> set -> task -> callback -> Graph. Because Graph defines del, any such cycle is non-collectable, so a graph cached in ChatService.set_cache or kept alive by pending trace tasks will leak until the process exits.

Python’s garbage collection improvements suggest that, in the long term, we should avoid using del, as it can interfere with optimal GC behavior and potentially introduce subtle issues. I’m removing it because the performance gains were negligible.

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@src/lfx/src/lfx/custom/custom_component/custom_component.py`:
- Around line 118-125: The _vertex property currently returns None when the
weakref target was garbage-collected, causing downstream AttributeError in
callers like the graph property; change the _vertex getter to mirror
Vertex.graph by detecting when self._vertex_ref is set but self._vertex_ref() is
None and raise ValueError("Vertex has been garbage collected") instead of
returning None, leaving the setter (def _vertex) as-is so callers that
intentionally check `if self._vertex:` still work when no ref exists.

🧹 Nitpick comments (1)

src/lfx/src/lfx/graph/graph/base.py (1)

657-667: Consider reducing log verbosity for normal operation.

INFO-level logging for every graph cleanup may produce excessive logs in production. Consider DEBUG level, or make logging conditional on whether there were actually tasks to clean up.

♻️ Suggested adjustment

 async def cleanup_async_tasks(self) -> None:
     """Ensure all async tasks complete before cleanup."""
-    logger.info(f"Cleaning up async tasks for run_id: {self._run_id}")
+    logger.debug(f"Cleaning up async tasks for run_id: {self._run_id}")
     if self._end_trace_tasks:
-        logger.info(f"Waiting for {len(self._end_trace_tasks)} trace tasks to complete")
+        logger.debug(f"Waiting for {len(self._end_trace_tasks)} trace tasks to complete")
         try:
             await asyncio.wait_for(asyncio.gather(*self._end_trace_tasks, return_exceptions=True), timeout=30.0)
         except asyncio.TimeoutError:
             logger.warning("Some trace tasks did not complete within timeout")
         finally:
             self._end_trace_tasks.clear()

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20680cd and 2ac2e39.

📒 Files selected for processing (3)

src/lfx/src/lfx/custom/custom_component/custom_component.py
src/lfx/src/lfx/graph/graph/base.py
src/lfx/src/lfx/graph/vertex/base.py

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2025-11-24T19:46:09.104Z

Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/backend_development.mdc:0-0
Timestamp: 2025-11-24T19:46:09.104Z
Learning: Applies to src/backend/**/*.py : Use `asyncio.create_task()` for background tasks and implement proper cleanup with try/except for `asyncio.CancelledError`

Applied to files:

src/lfx/src/lfx/graph/graph/base.py

📚 Learning: 2025-11-24T19:47:28.997Z

Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-24T19:47:28.997Z
Learning: Applies to src/backend/tests/**/*.py : Use async fixtures with proper cleanup using try/finally blocks to ensure resources are properly released after tests complete

Applied to files:

src/lfx/src/lfx/graph/graph/base.py

🧬 Code graph analysis (2)

src/lfx/src/lfx/graph/graph/base.py (2)

src/backend/base/langflow/services/task/backends/anyio.py (1)

cleanup_task (108-117)

src/backend/base/langflow/server.py (1)

error (53-61)

src/lfx/src/lfx/custom/custom_component/custom_component.py (1)

src/lfx/src/lfx/graph/vertex/base.py (1)

Vertex (50-885)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Update Starter Projects

🔇 Additional comments (5)

src/lfx/src/lfx/graph/vertex/base.py (2)

7-7: LGTM: Weak reference import for breaking reference cycles.

The import aligns with the PR objective to prevent memory leaks by avoiding reference cycles between Vertex and Graph.

118-126: LGTM: Property correctly dereferences weak reference with explicit error handling.

The property pattern is well-implemented:

Safely dereferences the weak reference

Raises a clear ValueError if the Graph has been garbage collected

Maintains the same public interface

src/lfx/src/lfx/graph/graph/base.py (1)

698-707: LGTM: Correct pattern for async trace scheduling.

The implementation correctly:

Uses get_running_loop() to detect if we're in an async context

Falls back to synchronous execution via run_until_complete when no loop is running

Tracks tasks for later cleanup via _end_trace_tasks

Uses add_done_callback with discard for automatic cleanup

src/lfx/src/lfx/custom/custom_component/custom_component.py (2)

4-4: LGTM: Weak reference import for breaking reference cycles.

The import aligns with the PR objective to prevent memory leaks.

91-91: LGTM: Weak reference storage for vertex.

Correctly uses weakref.ReferenceType[Vertex] type annotation for the weak reference.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…graph destruction

fix: implement weak references for vertex and custom component to pre…

2ac2e39

…vent memory leaks

ricofurtado requested a review from Copilot January 14, 2026 16:16

github-actions Bot added the community Pull Request from an external contributor label Jan 14, 2026

Copilot AI reviewed Jan 14, 2026

View reviewed changes

github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Jan 14, 2026

coderabbitai Bot reviewed Jan 14, 2026

View reviewed changes

Comment thread src/lfx/src/lfx/custom/custom_component/custom_component.py

Comment thread src/lfx/src/lfx/graph/graph/base.py Outdated

Comment thread src/lfx/src/lfx/graph/vertex/base.py

ricofurtado requested review from Copilot and jordanrfrazier January 14, 2026 18:04

Copilot AI reviewed Jan 14, 2026

View reviewed changes

Comment thread src/lfx/src/lfx/custom/custom_component/custom_component.py

fix: remove __del__ method to prevent potential memory issues during …

7a9aab1

…graph destruction

github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Jan 19, 2026

ricofurtado requested a review from ogabrielluiz January 19, 2026 16:24

github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Jan 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: implement weak references for vertex and custom component to pre…#11299

fix: implement weak references for vertex and custom component to pre…#11299
ricofurtado wants to merge 2 commits into
mainfrom
lfoss-2044-out-of-memory-after-multiple-executions-of-custom-component

ricofurtado commented Jan 14, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jan 14, 2026 •

edited

Loading

Other AI code review bot(s) detected

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Jan 14, 2026

Uh oh!

ricofurtado Jan 14, 2026

Uh oh!

Copilot AI Jan 14, 2026

Uh oh!

ricofurtado Jan 14, 2026

Uh oh!

ricofurtado Jan 14, 2026

Uh oh!

ricofurtado Jan 19, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ricofurtado commented Jan 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Langflow Backend Memory Regression Report

Summary

Test Methodology

Data Sets Analyzed

Original Code

Fixed Code

Results

1️⃣ RSS Growth Reduction (Primary Improvement)

Original Code

Fixed Code

Net Improvement

2️⃣ VSZ Behavior (Largely Unchanged)

3️⃣ Failure-Path Behavior (HTTP 500)

Key Conclusions

Recommendations / Next Steps

One-Line Takeaway

Uh oh!

coderabbitai Bot commented Jan 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

ricofurtado Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

ricofurtado Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

ricofurtado Jan 14, 2026

Choose a reason for hiding this comment

Uh oh!

ricofurtado Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ricofurtado commented Jan 14, 2026 •

edited

Loading

coderabbitai Bot commented Jan 14, 2026 •

edited

Loading