Skip to content

okToRelayout always allow non-global tensors after scheduling.#5594

Merged
wujingyue merged 2 commits intomainfrom
wjy/alias
Dec 1, 2025
Merged

okToRelayout always allow non-global tensors after scheduling.#5594
wujingyue merged 2 commits intomainfrom
wjy/alias

Conversation

@wujingyue
Copy link
Collaborator

@wujingyue wujingyue commented Nov 26, 2025

FIXME: comments

@wujingyue
Copy link
Collaborator Author

!test

@github-actions
Copy link

github-actions bot commented Nov 26, 2025

Review updated until commit 4875590

Description

  • Allow non-global tensors to always be relayouted after scheduling in alias analysis

  • Set all tensor views with allocations to global memory type before segmentation

  • Improve handling of allocation domains for global vs local tensors

  • Add detailed comments explaining the memory type handling logic

Changes walkthrough

Relevant files
Enhancement
alias_analysis.cpp
Allow non-global tensor relayouting in alias analysis       

csrc/alias_analysis.cpp

  • Added early return for non-global tensors in okToRelayout function
  • Added comprehensive comments explaining alias analysis behavior for
    global vs local tensors
  • Simplified logic to always allow relayouting of non-global tensor
    views
  • +21/-0   
    mark_aliases_prepare.cpp
    Set global memory type for all TVs before segmentation     

    csrc/preseg_passes/mark_aliases_prepare.cpp

  • Added loop to set all tensor views with allocations to global memory
    type
  • Added explanatory comments about pre-segmentation memory type contract
  • Ensures all TVs are marked as global before alias analysis
  • +10/-0   

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Incomplete Documentation

    The PR description only contains "FIXME: comments" which is incomplete. The actual code changes have good inline comments explaining the logic, but the PR description should be updated to clearly describe the purpose, scope, and impact of this change.

    bool okToRelayout(
        const TensorView* tv,
        const Layout& new_layout,
        const EmptyAllocationAs empty_allocation_as) {
      // Alias analysis only cares about maintaining allocation of global TVs.
      // Consider the following pattern after scheduling:
      // ```
      //   tv0: global
      //       |
      //   [permute]
      //       |
      //   tv1: local
      //       |
      //   [permute]
      //       |
      //   tv2: global
      // ```
      // Regardless of `tv1`'s allocation domain, `tv2` can still be evaluated as an
      // alias of `tv0`. However, if `tv1` is global (especially when it's also a
      // fusion output), `tv1`'s allocation domain can't be ignored because it's
      // visible to the caller of the fusion.
      if (tv->getMemoryType() != MemoryType::Global) {
        return true;
      }
    Missing Tests

    This appears to be a significant change to alias analysis behavior that could affect tensor memory allocation and layout decisions. No test files are visible in the diff, suggesting comprehensive tests should be added to verify the correctness of the new logic, especially for the case where local tensors in computation chains don't prevent global tensor aliasing.

    for (TensorView* tv : fusion->allTvs()) {
      if (tv->hasAllocation()) {
        // Alternatively, we could hold as a contract that all TVs pre
        // segmentation are global. If you prefer this, I can try it in a separate
        // PR. This sounds natural to me because without scheduling every TV is
        // indeed stored globally.
        tv->setMemoryType(MemoryType::Global);
      }
    }

    @greptile-apps
    Copy link
    Contributor

    greptile-apps bot commented Nov 26, 2025

    Greptile Overview

    Greptile Summary

    This PR simplifies the alias analysis logic by ensuring all tensors with allocations are marked as MemoryType::Global before pre-segmentation alias analysis, and always allowing non-global tensors to be relayouted regardless of allocation state.

    Key changes:

    • okToRelayout now unconditionally returns true for non-global memory tensors, removing the previous conditional check on empty_allocation_as
    • MarkAliasesPreparePass now explicitly sets MemoryType::Global for all tensors with allocations before running alias analysis
    • Added comprehensive comment explaining the rationale: non-global tensors (like local intermediate values) don't need allocation domain tracking since they're not visible to the fusion caller

    Design rationale:
    The change builds on the previous commit (93b4811) by simplifying the logic. Instead of conditionally checking both empty_allocation_as and memory type, the PR separates concerns: pre-segmentation passes ensure all allocated tensors are global, then okToRelayout can unconditionally allow relayout for non-global tensors (which by definition shouldn't exist pre-segmentation if they have allocations).

    Confidence Score: 4/5

    • This PR is safe to merge with low risk, though the title description mentions "FIXME: comments"
    • The logic simplification is sound and well-commented. The changes are coherent and build on the previous commit. However, the PR description contains "FIXME: comments" which suggests the author may want to add more documentation or context before finalizing
    • No files require special attention. The PR description should be updated to remove "FIXME: comments" placeholder

    Important Files Changed

    File Analysis

    Filename Score Overview
    csrc/alias_analysis.cpp 5/5 Simplifies okToRelayout by always allowing non-global tensors to be relayouted, removes conditional check on empty_allocation_as
    csrc/preseg_passes/mark_aliases_prepare.cpp 4/5 Adds logic to set all tensors with allocations to MemoryType::Global before alias analysis runs pre-segmentation

    Sequence Diagram

    sequenceDiagram
        participant Pass as MarkAliasesPreparePass
        participant Fusion as Fusion
        participant TV as TensorView
        participant Analysis as AliasAnalysisResult
        participant Finder as AliasFinder
        
        Pass->>Fusion: runPass(fusion)
        Pass->>Fusion: allTvs()
        Fusion-->>Pass: all TensorViews
        loop For each TensorView with allocation
            Pass->>TV: hasAllocation()
            TV-->>Pass: true
            Pass->>TV: setMemoryType(Global)
        end
        
        Pass->>Analysis: findAliases(fusion, kUndetermined)
        Analysis->>Finder: create AliasFinder
        loop For each expr in fusion
            Finder->>Finder: dispatch(expr)
            Finder->>Finder: okToRelayout(tv, layout, kUndetermined)
            alt tv is not Global memory
                Finder-->>Finder: return true (always allow)
            else tv is Global memory
                alt tv has no allocation
                    Finder-->>Finder: return true (can relayout)
                else tv has allocation
                    Finder->>Finder: check layout compliance
                    Finder-->>Finder: return isCompliant
                end
            end
        end
        Analysis->>Analysis: finalize()
        Analysis-->>Pass: alias analysis result
    
    Loading

    Copy link
    Contributor

    @greptile-apps greptile-apps bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    1 file reviewed, no comments

    Edit Code Review Agent Settings | Greptile

    @wujingyue wujingyue marked this pull request as draft November 26, 2025 16:40
    @jjsjann123
    Copy link
    Collaborator

    :shipit:

    @wujingyue
    Copy link
    Collaborator Author

    !test

    @wujingyue wujingyue marked this pull request as ready for review November 29, 2025 05:34
    @wujingyue wujingyue requested a review from jjsjann123 November 29, 2025 05:34
    Copy link
    Contributor

    @greptile-apps greptile-apps bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    2 files reviewed, no comments

    Edit Code Review Agent Settings | Greptile

    // Alternatively, we could hold as a contract that all TVs pre
    // segmentation are global. If you prefer this, I can try it in a separate
    // PR. This sounds natural to me because without scheduling every TV is
    // indeed stored globally.
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    I think I agree with you, since we have clearMemorySpace that is used by every scheduler:

    Fuser/csrc/scheduler/utils.cpp

    Lines 1280 to 1289 in 5ce52e8

    // Reset inputs and outputs to global memory, everything else to local.
    void clearMemorySpace(Fusion* fusion) {
    for (auto tv : fusion->allTvs()) {
    if (tv->isFusionInput() || tv->isFusionOutput()) {
    tv->setMemoryType(MemoryType::Global);
    } else {
    tv->setMemoryType(MemoryType::Local);
    }
    }
    }

    @wujingyue wujingyue merged commit c8cec57 into main Dec 1, 2025
    62 checks passed
    @wujingyue wujingyue deleted the wjy/alias branch December 1, 2025 17:14
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    2 participants