Add ResizeHeuristic #3674

Merged
naoyam merged 8 commits into main from resize_scheduler_minor_update_heuristic
Jan 14, 2025

naoyam commented Jan 6, 2025

The heuristic currently has only one parameter. This PR also makes some minor tweaks:

  • gridDim.x was previously static; it is now symbolic.
  • Transpose-like patterns are rejected for now, since they would need scheduling similar to what the transpose scheduler does.
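
As a rough illustration of what a symbolic gridDim.x implies, the block count can be derived from the element count at launch time instead of being fixed when the kernel is scheduled. The sketch below assumes a plain ceiling division by the block size; `ceilDiv` and `computeGridDimX` are hypothetical names used only for illustration, not the functions touched by this PR, and the default of 128 simply mirrors the `bdimx` value that appears in the reviewed diff. The actual heuristic may account for other factors.

```cpp
#include <cstdint>

// Integer ceiling division; assumes a >= 0 and b > 0.
constexpr int64_t ceilDiv(int64_t a, int64_t b) {
  return (a + b - 1) / b;
}

// Hypothetical launch-time helper (an assumption, not nvFuser API):
// number of blocks along x needed to cover num_elms elements
// when each block runs bdimx threads.
constexpr int64_t computeGridDimX(int64_t num_elms, int64_t bdimx = 128) {
  return ceilDiv(num_elms, bdimx);
}
```

With a static grid, a value like this would be baked in for one problem size; with a symbolic grid, the same computation can be re-evaluated for each input size at launch.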

naoyam commented Jan 6, 2025

!test

naoyam added the rope label Jan 6, 2025
naoyam requested a review from jjsjann123 January 7, 2025 00:01

naoyam commented Jan 7, 2025

!test

naoyam commented Jan 7, 2025

!test

naoyam commented Jan 8, 2025

!test

naoyam commented Jan 10, 2025

!test

const int64_t bdimx = 128;

const auto& [largest_output, max_num_elms] =
    getLargestTensor(fusion->outputs(), runtime_info);

Collaborator commented:

Looks like we are not using largest_output. Is there another use of the function getLargestTensor where we need a pointer to the tensor?

Collaborator Author replied:

It will be used in follow-up PRs.
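
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a helper that returns both a pointer to the largest tensor and its element count, so a caller can unpack either value with structured bindings. `ToyTensor`, `numElements`, and `findLargestTensor` are hypothetical stand-ins; the real `getLargestTensor` takes fusion values and runtime info, and its signature is not shown in this thread.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a fusion tensor; NOT the nvFuser TensorView type.
struct ToyTensor {
  std::string name;
  std::vector<int64_t> shape;
};

// Product of all extents (1 for a zero-dimensional tensor).
int64_t numElements(const ToyTensor& t) {
  int64_t n = 1;
  for (int64_t extent : t.shape) {
    n *= extent;
  }
  return n;
}

// Hypothetical analogue of the helper in the diff: returns
// {pointer to the largest tensor, its element count},
// or {nullptr, -1} when the input list is empty.
std::pair<const ToyTensor*, int64_t> findLargestTensor(
    const std::vector<ToyTensor>& tensors) {
  const ToyTensor* largest = nullptr;
  int64_t max_num_elms = -1;
  for (const ToyTensor& t : tensors) {
    const int64_t n = numElements(t);
    if (n > max_num_elms) {
      largest = &t;
      max_num_elms = n;
    }
  }
  return {largest, max_num_elms};
}
```

A caller that only needs the count today can still write `const auto [largest, n] = findLargestTensor(tensors);` and start using `largest` later without changing the helper.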

naoyam commented Jan 14, 2025

!build

naoyam merged commit 5797300 into main Jan 14, 2025
6 checks passed
naoyam deleted the resize_scheduler_minor_update_heuristic branch January 14, 2025 21:28
naoyam added a commit that referenced this pull request Jan 14, 2025
naoyam added a commit that referenced this pull request Jan 15, 2025
Depends on #3674, #3675, #3679

Reorder tensors to align with the largest input. This should improve
memory accesses by minimizing strides. Store throughput may be lower,
but it should generally be more important to optimize load accesses.

I do not have actual performance results for this change. I just remember
that it was effective in some cases when I was manually trying out
different optimization strategies. We may eventually need a heuristic to
enable or disable this reordering.
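
To illustrate why aligning the iteration order with an input can reduce load strides, the sketch below computes the stride that the innermost loop sees when reading a contiguous row-major tensor. It is a toy model with hypothetical helper names (`contiguousStrides`, `innermostLoadStride`), not the reordering implemented in the referenced commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major (contiguous) strides, in elements, for the given shape.
std::vector<int64_t> contiguousStrides(const std::vector<int64_t>& shape) {
  std::vector<int64_t> strides(shape.size(), 1);
  for (int64_t i = static_cast<int64_t>(shape.size()) - 2; i >= 0; --i) {
    strides[i] = strides[i + 1] * shape[i + 1];
  }
  return strides;
}

// Stride between consecutive innermost-loop iterations when the tensor
// is traversed in loop_order (outermost dimension first).
// Assumes loop_order is a non-empty permutation of the dimension indices.
int64_t innermostLoadStride(
    const std::vector<int64_t>& shape,
    const std::vector<int64_t>& loop_order) {
  const std::vector<int64_t> strides = contiguousStrides(shape);
  return strides.at(static_cast<std::size_t>(loop_order.back()));
}
```

For a 4x8 input, traversing in the input's own order `{0, 1}` gives an innermost stride of 1, while a transposed order `{1, 0}` gives a stride of 8, which is the kind of strided load the reordering aims to avoid.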