Accommodate name change of printed kernel files #3778
Merged
Generated CUDA files were previously named like `__tmp_kernel_32.cu` and `compare_codegen.sh` matched that pattern when copying kernels. This PR changes that pattern to `__tmp_nvfuser_*.cu` instead, since these filenames were changed in #3468. This should fix the problems seen recently in codediff CI jobs.
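The mismatch described above can be illustrated with a minimal sketch. It is not the actual logic of `compare_codegen.sh`: the old glob `__tmp_kernel_*.cu` is an assumption inferred from the example filename, and the file list and the `select_kernels` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Example file names in the style quoted in this PR; the listing itself is hypothetical.
generated = [
    "__tmp_nvfuser_none_f0_c0_r0_g0.cu",
    "__tmp_nvfuser_1.cu",
    "__tmp_nvfuser_1.ptx",
]

# Assumed old glob (inferred from `__tmp_kernel_32.cu`) and the new one from this PR.
OLD_PATTERN = "__tmp_kernel_*.cu"
NEW_PATTERN = "__tmp_nvfuser_*.cu"


def select_kernels(names, pattern):
    """Return the names that match a shell-style glob, preserving order."""
    return [name for name in names if fnmatchcase(name, pattern)]


# The old pattern finds nothing to copy once the files are renamed.
old_matches = select_kernels(generated, OLD_PATTERN)
# The new pattern picks up the renamed .cu files and skips the .ptx file.
new_matches = select_kernels(generated, NEW_PATTERN)
```

With the renamed files, the old pattern selects no kernels at all, which is consistent with the codediff jobs having nothing to compare.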
Collaborator
Author
!test --diff
Collaborator
Author
!test --diff
Collaborator
Author
I believe CI failed in this instance because the commit on main still produces the mismatched CUDA/PTX filenames. I am reasonably confident that this will fix the CI codediff issues.
naoyam approved these changes Jan 29, 2025
Collaborator
This is a bit concerning. Will see if it persists.
jacobhinkle added a commit that referenced this pull request Jan 29, 2025
This uses the function name as the filename. Since #3778, we have been generating filenames like `__tmp_.ptx` because `kernel_name` is not yet filled in when the cubin and PTX files are output. Now we will get `__tmp_nvfuser_none_f0_c0_r0_g0.{cu,cubin,ptx}` or, if `NVFUSER_ENABLE=static_fusion_count` is provided, `__tmp_nvfuser_1.{cu,cubin,ptx}`.
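As a rough sketch of the naming scheme in this commit message, the snippet below builds the three dump filenames from one shared base name. The `dump_filenames` helper is hypothetical and is not nvFuser code. It also assumes that the function name is the part of the filename after `__tmp_`, for example `nvfuser_1`.

```python
def dump_filenames(function_name, extensions=("cu", "cubin", "ptx")):
    """Build one dump filename per extension from a shared base name.

    Hypothetical helper: it only mirrors the `__tmp_<name>.<ext>` shape
    quoted in the commit message.
    """
    return [f"__tmp_{function_name}.{ext}" for ext in extensions]


# An empty name reproduces the broken `__tmp_.ptx` style of filename.
broken = dump_filenames("")
# Using the function name gives all three files the same base name.
fixed = dump_filenames("nvfuser_1")
```

Deriving all three names from one string keeps the `.cu`, `.cubin`, and `.ptx` outputs aligned, so a single glob on the base name can collect them together.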
naoyam pushed a commit that referenced this pull request Jan 29, 2025
Generated CUDA files were previously named like `__tmp_kernel_32.cu` and `compare_codegen.sh` matched that pattern when copying kernels. That broke codediff since these filenames were changed in #3468. This PR fixes `compare_codegen.sh` to match `__tmp_*.cu` instead. It also fixes the outputs of printed PTX and cubin files so that they use the same base filenames: currently that is the previous naming scheme `__tmp_kernel_32.cu` (if using `NVFUSER_ENABLE=static_fusion_count`). This should fix the problems seen recently in codediff CI jobs.