Fix non-determinism in mma_utils::getTensorsRoles#947
Merged
jacobhinkle merged 5 commits intomainfrom Sep 26, 2023
Merged
Conversation
Collaborator
Author
|
There should be only one A, B, or D tensor; see |
jacobhinkle
added a commit
that referenced
this pull request
Oct 4, 2023
I have been chasing down codegen changes in #840 and #947 and have needed to dig through a lot of spurious diffs. I decided to extend the codegen diff tool to output HTML, and to also modify the diffing a bit. This PR: - Changes `tools/compare_codegen.sh` to output env information as well as add `ptxas_verbose` dump option. - Changes the diffs performed by that tool to ignore both the kernel name and the preamble. The preamble is estimated by skipping the typedef of `nvfuser_index_t`. If preambles between two runs differ, we report that with a warning and show the diff in the output. - Adds an `--html` option to `tools/diff_codegen_nvfuser_tests.py` which will write a self-contained HTML file holding all the differing kernels and diffs. To use this option you must have previously run `pip install jinja2`. - Adds a `--json` option to `tools/diff_codegen_nvfuser_tests.py` which writes a JSON file containing all the information contained in the HTML file in an easier-to-parse format. - Changes the default to not printing the diffs to STDOUT. This can be re-enabled with the `--show-diffs` argument. This lets us communicate code differences easily by sharing these files, which could be generated by our CI. An example output is attached. Github doesn't support uploading html so I have uploaded a zipped example: [codediff_f7786819_feda1e1e_binary_tests.html.zip](https://github.com/NVIDIA/Fuser/files/12793721/codediff_f7786819_feda1e1e_binary_tests.html.zip) Note that this file is probably typical for a medium sized change: it results in a zipped file size of 184KB and unzipped it is 2.1MB. Some ideas left out of this PR that might be nice in the future: - Handle not just `nvfuser_tests` output but also `nvfuser_bench` and `pytest` output. We could also fall back to arbitrary command output where we just group everything to one big "test" if we can't associate each kernel with a specific test/benchmark. - Show multiple commands in one HTML file. Especially if the first bullet is addressed, then we could have a single summary for our whole suite. - Include benchmark results. This could be done in another hidden div with a "benchmarks" button. It might be tricky especially if the number of benchmark items associated to each kernel is changed between commits, but it might also be handy to refer to benchmark regressions and have the codegen output one click away. Fixes #1007
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This sorts the output of
mma_utils::getTensorsRolesso that the matmul scheduler is repeatable. This should fix the false positives in codegen diff CI jobs. 🤞Fixes #799.