Merged
21b327e to 3216c2b
wujingyue reviewed Apr 21, 2025
ec0e3d1 to 7b2a9b0
wujingyue approved these changes Apr 22, 2025
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
Co-authored-by: samnordmann <snordmann@nvidia.com>
23286e1 to 615a825
Member
Author
!test
wujingyue added a commit that referenced this pull request Apr 23, 2025
This reverts commit 515e65e.
jjsjann123 pushed a commit that referenced this pull request Apr 23, 2025
Reverts #4286 because of http://nv/eF0
nsarka added a commit that referenced this pull request May 5, 2025
Follow-up PR to #4286. This PR inserts Allocate ops for every LaunchKernel output, and also inserts a Deallocate right after the last use of every input expr in the Hostir container. It also adds a test that checks that the number of Deallocate ops and the max memory usage are correct for an example fusion.

Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
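The "Deallocate right after the last use" placement described in this commit message can be sketched as a small last-use pass over a linear list of expressions. This is only an illustration under stated assumptions: `ToyExpr`, `insertDeallocations`, and `countOps` are hypothetical names and do not correspond to the nvFuser host IR classes, and the sketch only covers the Deallocate placement, not the insertion of Allocate ops for LaunchKernel outputs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-in for a host IR expression.
struct ToyExpr {
  std::string op;                    // e.g. "LaunchKernel" or "Deallocate"
  std::vector<std::string> inputs;   // names of buffers the expression reads
  std::vector<std::string> outputs;  // names of buffers the expression writes
};

// Returns a copy of `exprs` with a Deallocate inserted right after the
// last expression that reads each buffer.
std::vector<ToyExpr> insertDeallocations(const std::vector<ToyExpr>& exprs) {
  // First pass: record the index of the last reader of every buffer.
  std::unordered_map<std::string, std::size_t> last_use;
  for (std::size_t i = 0; i < exprs.size(); ++i) {
    for (const std::string& in : exprs[i].inputs) {
      last_use[in] = i;
    }
  }

  // Second pass: copy expressions and append a Deallocate after last uses.
  std::vector<ToyExpr> result;
  for (std::size_t i = 0; i < exprs.size(); ++i) {
    result.push_back(exprs[i]);
    for (const std::string& in : exprs[i].inputs) {
      auto it = last_use.find(in);
      if (it != last_use.end() && it->second == i) {
        result.push_back(ToyExpr{"Deallocate", {in}, {}});
        // Erase so a buffer listed twice in one expression is freed once.
        last_use.erase(it);
      }
    }
  }
  return result;
}

// Counts expressions with the given op name, similar in spirit to a test
// that checks the number of Deallocate ops.
std::size_t countOps(const std::vector<ToyExpr>& exprs, const std::string& op) {
  std::size_t n = 0;
  for (const ToyExpr& e : exprs) {
    if (e.op == op) {
      ++n;
    }
  }
  return n;
}
```

In this toy version every buffer that is read gets freed, including ones a caller might still own. A real pass would presumably need to treat such buffers specially, which this sketch does not attempt.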
This PR adds the Deallocate HostIr, which erases an Allocation from the expression evaluator. It also modifies LaunchKernel to take preallocated output arguments. Lastly, it adds a gtest that allocates and deallocates a buffer in a loop, then checks that the memory in use is 0 bytes.
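The allocate/deallocate loop that the gtest exercises can be illustrated with a toy model. This is a minimal sketch, assuming a hypothetical `ToyEvaluator` that tracks named buffers and their sizes. It is not the nvFuser expression evaluator or the actual test, and it tracks byte counts only rather than real device memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an evaluator that binds buffer names to
// allocations. Only sizes are tracked; no real memory is allocated.
class ToyEvaluator {
 public:
  // Binds `name` to an allocation of `bytes` bytes.
  void allocate(const std::string& name, std::size_t bytes) {
    assert(buffers_.count(name) == 0 && "buffer is already allocated");
    buffers_[name] = bytes;
    bytes_in_use_ += bytes;
    peak_bytes_ = std::max(peak_bytes_, bytes_in_use_);
  }

  // Erases the allocation bound to `name`, mirroring what a Deallocate
  // op is described as doing.
  void deallocate(const std::string& name) {
    auto it = buffers_.find(name);
    assert(it != buffers_.end() && "buffer was never allocated");
    bytes_in_use_ -= it->second;
    buffers_.erase(it);
  }

  std::size_t bytesInUse() const { return bytes_in_use_; }
  std::size_t peakBytes() const { return peak_bytes_; }

 private:
  std::unordered_map<std::string, std::size_t> buffers_;
  std::size_t bytes_in_use_ = 0;
  std::size_t peak_bytes_ = 0;
};

// Allocates and frees one buffer per iteration, then returns the
// evaluator so the caller can inspect its memory counters.
ToyEvaluator runAllocateDeallocateLoop(int iterations, std::size_t bytes) {
  ToyEvaluator evaluator;
  for (int i = 0; i < iterations; ++i) {
    evaluator.allocate("buffer", bytes);
    evaluator.deallocate("buffer");
  }
  return evaluator;
}
```

Because each iteration frees its buffer before the next one allocates, the bytes in use return to zero and the peak never exceeds one buffer, which is the kind of property the described test checks.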