Segmenter Shared Container Fix #6025
…Fusion, not the entire IrContainer.
!test
!test
Greptile Summary
This PR fixes a correctness bug in Key changes:
The
Confidence Score: 4/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["StatementGuard::~StatementGuard()"] --> B["fusion_->removeStatementsCreatedAfter\n(prev_num_exprs_, prev_num_vals_)"]
B --> C["Acquire unique_lock on mutex_"]
C --> D["ContainerMutator::removeStatementsCreatedAfter"]
D --> E{"sharing_fusions_.size() <= 1?"}
E -- "Yes (fast path)" --> F["LIFO pop-back from\nglobal deque tail"]
F --> G["while per_fusion_exprs_size > num_exprs_before:\n assert tail belongs to self\n setDefinition(nullptr) on outputs\n removeUse() on inputs\n erase from per_fusion_exprs_, exprs_\n pop_back exprs_up_"]
G --> H["while per_fusion_vals_size > num_vals_before:\n assert tail belongs to self\n nullOutShortcutIfNeeded()\n erase from per_fusion_vals_, vals_\n pop_back vals_up_"]
E -- "No (slow path)" --> I["std::erase_if on exprs_up_\n(deque may have interleaved entries)"]
I --> J["For each entry:\n skip if belongs to another Fusion\n keep if exprs_kept < num_exprs_before\n else: cleanup + erase + return true"]
J --> K["std::erase_if on vals_up_"]
K --> L["For each entry:\n skip if belongs to another Fusion\n keep if vals_kept < num_vals_before\n else: nullOutShortcutIfNeeded + erase + return true"]
Last reviewed commit: e2b2d34
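The two branches of the flowchart can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyFusion`, `ToyStmt`, `ToyContainer`, and `removeCreatedAfter` are hypothetical stand-ins, not the real nvFuser classes, and the per-fusion index maps and use/definition cleanup are omitted. The slow path is written with the classic erase/`std::remove_if` idiom, which is what `std::erase_if` on a deque amounts to in C++20.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>

// Hypothetical stand-ins for illustration only.
struct ToyFusion {};

struct ToyStmt {
  const ToyFusion* owner;  // which fusion created this statement
  int id;
};

struct ToyContainer {
  // Global deque in creation order, shared by all fusions.
  std::deque<std::unique_ptr<ToyStmt>> stmts_up;
  int num_sharing = 1;  // how many fusions share this container

  ToyStmt* add(const ToyFusion* f, int id) {
    stmts_up.push_back(std::make_unique<ToyStmt>(ToyStmt{f, id}));
    return stmts_up.back().get();
  }

  std::size_t countFor(const ToyFusion* f) const {
    std::size_t n = 0;
    for (const auto& s : stmts_up) {
      if (s->owner == f) ++n;
    }
    return n;
  }

  // Remove the statements `self` created after it owned `num_before`.
  void removeCreatedAfter(const ToyFusion* self, std::size_t num_before) {
    if (num_sharing <= 1) {
      // Fast path: a single owner, so new statements sit at the tail.
      while (stmts_up.size() > num_before) {
        assert(stmts_up.back()->owner == self);
        stmts_up.pop_back();
      }
      return;
    }
    // Slow path: entries of other fusions may be interleaved, so scan
    // forward, keep the first `num_before` of self's statements, and
    // erase the rest of self's statements.
    std::size_t kept = 0;
    stmts_up.erase(
        std::remove_if(
            stmts_up.begin(), stmts_up.end(),
            [&](const std::unique_ptr<ToyStmt>& s) {
              if (s->owner != self) return false;  // another fusion: skip
              if (kept < num_before) {
                ++kept;
                return false;  // old statement: keep
              }
              return true;  // created inside the guard scope: erase
            }),
        stmts_up.end());
  }
};
```

In this toy model, rolling back one fusion leaves the other fusion's interleaved statements untouched, which is the property the PR title describes.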
csrc/fusion.cpp
// Slow path: shared container — other Fusions' statements may be
// interleaved at the tail of the global deques. Use std::erase_if
// (C++20) to scan forward: skip the first num_before of self's
// statements (old, to keep), then erase the remainder (added during
// the guard scope). Only taken on the error/rollback path when
The comment states this path is "Only taken on the error/rollback path when segment compilation fails," but StatementGuard::~StatementGuard() unconditionally calls removeStatementsCreatedAfter — it does not distinguish success from failure. The slow path is entered whenever the container is shared (sharing_fusions_.size() > 1), regardless of whether it is a compilation error. On a success path where no new statements were added, the two erase_if calls complete trivially (finding nothing to remove), but the O(n) scan still occurs.
Consider updating the comment to something like:
// Slow path: shared container — other Fusions' statements may be
// interleaved at the tail of the global deques. Use std::erase_if
// (C++20) to scan forward: skip the first num_before of self's
// statements (old, to keep), then erase the remainder (added during
// the guard scope). O(total statements in container); in the common
// case where no statements were added the scan is a no-op.

// self's new expr — remove (clean up uses and index maps first)
for (Val* in : e->inputs()) {
  in->removeUse(e);
}
c->per_fusion_exprs_[self].erase(e);
c->exprs_.erase(e);
return true;
When the erase_if predicate returns true (line 282), the element is erased and its owning unique_ptr<Expr> is destroyed, freeing the expression's memory. The output Vals of that expression still have their definition_ pointer set (to the now-freed memory) until the subsequent vals_up_ erase_if pass removes them (lines 284–300). Nothing dereferences definition_ between the two erase_if calls, so this is not a live bug today, but the fast path has the same gap and no assertion or documentation establishes this invariant.
This is consistent with the fast-path behaviour, but consider adding out->setDefinition(nullptr) here (as removeExpr does) to make the rollback more defensively correct, or at least a comment noting that the transient dangling state is intentional.
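A minimal sketch of the defensive pattern suggested above, using hypothetical toy types (`ToyVal`, `ToyExpr`, and `eraseExprSafely` are assumptions for illustration and do not mirror the real class layout): the predicate clears each output's back-pointer before it reports the expression for erasure, so no value is ever left pointing at freed memory.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <memory>
#include <vector>

struct ToyExpr;

// Hypothetical value type holding a back-pointer to its defining expr.
struct ToyVal {
  ToyExpr* definition = nullptr;
};

// Hypothetical expression type listing the values it defines.
struct ToyExpr {
  std::vector<ToyVal*> outputs;
};

// Erase `target` from `exprs`, clearing the outputs' back-pointers
// before the owning unique_ptr is destroyed.
void eraseExprSafely(std::deque<std::unique_ptr<ToyExpr>>& exprs,
                     ToyExpr* target) {
  exprs.erase(
      std::remove_if(
          exprs.begin(), exprs.end(),
          [&](const std::unique_ptr<ToyExpr>& e) {
            if (e.get() != target) return false;
            for (ToyVal* out : e->outputs) {
              out->definition = nullptr;  // analogous to setDefinition(nullptr)
            }
            return true;  // element is erased after this point
          }),
      exprs.end());
}
```

With this ordering, code that runs between the expression pass and the value pass only ever sees a null definition, not a dangling one.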
…ullptr) Fix inaccurate comment claiming the slow path is only taken on the error/rollback path — it runs unconditionally whenever the container is shared. Add out->setDefinition(nullptr) for expr outputs before destruction in the slow path, matching removeExpr's behavior and eliminating the transient dangling definition_ pointer.
!test
Review updated until commit e2b2d34
Description
| Relevant files |
|---|
| Bug fix |
| Cleanup |
PR Reviewer Guide
Here are some key observations to aid the review process:
🧪 PR contains tests
⚡ Recommended focus areas for review
Potential performance regression
(exprs_up_, vals_up_) each time StatementGuard cleanup is needed in shared container scenarios. This is O(total statements in container) as noted in the code comments. For containers with many statements, this could be slow. Consider if there's a more efficient data structure that could track per-fusion statement ordering.
Apply the same setDefinition(nullptr) fix to the fast path for consistency with removeExpr and the slow path.
Additional Comments (1)
The The fast path behaves differently: Concrete example:
This diverges from the design intent stated at line 216–217: "These persist across StatementGuard scopes and must not be removed on rollback." Consider either:
At minimum, add a brief note in the slow-path comment acknowledging the intentional divergence to prevent future confusion.
StatementGuard recorded num_vals_before as numValsExcludingShortcuts(), which subtracts the count of non-null shortcut pointers. When a shortcut val (e.g. oneVal()) is lazily created inside a guard scope, the total count rises by 1 but non-null shortcuts also rises by 1, so numValsExcludingShortcuts stays flat — neither path detected the new shortcut via the count condition. The fast path was masked by LIFO ordering (new shortcuts sit at the deque tail and get popped while removing other new vals). The slow path's unconditional isShortcutVal keep made the bug real: shortcut vals created inside the guard were permanently retained. Fix: record num_vals_before as the total val count (numVals()), making counting uniform. Fast path switches to std::ssize(per_fusion_vals_[self]). Slow path drops the isShortcutVal skip and adds nullOutShortcutIfNeeded when removing, so shortcuts created inside the guard are rolled back and their cache pointers nulled, matching the fast path's behavior.
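The counting problem described in this commit message can be reproduced with a toy counter. This is a sketch only: `ToyVals`, `addRegular`, and `addShortcut` are hypothetical, while the method names `numVals` and `numValsExcludingShortcuts` are borrowed from the message above. It assumes, as the message states, that creating a shortcut val raises both the total count and the non-null shortcut count by one.

```cpp
#include <cassert>

// Hypothetical counter modelling one fusion's vals.
struct ToyVals {
  int total = 0;           // every val owned by the fusion
  int live_shortcuts = 0;  // non-null cached shortcut vals, e.g. oneVal()

  void addRegular() { ++total; }

  // Lazily creating a shortcut adds a val AND a non-null cache pointer.
  void addShortcut() {
    ++total;
    ++live_shortcuts;
  }

  int numVals() const { return total; }
  int numValsExcludingShortcuts() const { return total - live_shortcuts; }
};
```

A guard that snapshots `numValsExcludingShortcuts()` sees no change after `addShortcut()`, so a count-based rollback cannot tell that a val was created; a snapshot of `numVals()` does change, which is why the fix records the total count.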
Additional Comments (3)
After this PR, If the method is kept as part of a public API, the comment should be updated to reflect its actual purpose. If it has no remaining callers, it (and the
Additional Comments (1)
The grep confirms the symbol appears only in Or simply delete lines 187–199 entirely.
7b51485 to e2b2d34
!test
Merged commit b1873c8 into md/segmenter-container-sharing
Statements cleaned up by statement guard need to be popped from the specific fusion only, not the entire IrContainer.