Output HTML and JSON from codegen diff tool #996
|
Thanks for adding the HTML output. That looks great! Does the python script [...]? Nvfuser_bench is different from all other test suites (nvfuser_tests, python tests) because it prints the test name (benchmark name) after the kernel dump file (PRINTING: __tmp_kernel ...), while all other test suites print the test name before the kernel dump file. |
Yes it compares based on test name and it's specific to nvfuser_tests. We could update it pretty easily to also handle pytest since that prints the test name before the kernels
We could probably enable this too if we're clever, we just need to accept some regexes and plumb them around. |
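The regex idea above could be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function name, the `name_first` switch, and both default patterns are assumptions (the test-name pattern follows the usual GoogleTest `[ RUN      ] Suite.Test` line, and the kernel pattern follows the `PRINTING:` prefix quoted above).

```python
import re
from collections import defaultdict

# Assumed log formats; the real tool may match something different.
GTEST_RUN = re.compile(r"^\[ RUN\s+\] (?P<name>\S+)")
KERNEL_DUMP = re.compile(r"^PRINTING: (?P<file>\S+)")


def group_kernels_by_test(lines, test_re=GTEST_RUN, kernel_re=KERNEL_DUMP,
                          name_first=True):
    """Map each test name to the kernel dump files associated with it.

    name_first=True:  a test name applies to the kernels printed after it
                      (the nvfuser_tests / pytest ordering described above).
    name_first=False: a test name applies to the kernels printed before it
                      (the nvfuser_bench ordering described above).
    Kernels that cannot be tied to a test land under "<unmatched>".
    """
    groups = defaultdict(list)
    current = None   # most recent test name (name_first mode)
    pending = []     # kernels waiting for their test name (name-last mode)
    for line in lines:
        m = test_re.match(line)
        if m:
            if name_first:
                current = m.group("name")
            else:
                groups[m.group("name")].extend(pending)
                pending = []
            continue
        m = kernel_re.match(line)
        if m:
            if name_first:
                groups[current or "<unmatched>"].append(m.group("file"))
            else:
                pending.append(m.group("file"))
    if pending:
        groups["<unmatched>"].extend(pending)
    return dict(groups)
```

Passing a different `test_re` and `name_first=False` would be one way to cover the benchmark ordering without touching the grouping logic; the `<unmatched>` bucket doubles as a crude fallback when no test name can be found.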
|
Can we add an option to dump only a limited number (say 200) of kernel comparisons? We can add a note at the end of that page saying something like "Only dumped 200 of 10086 total mismatches. To dump all the mismatches, please do xxx". This keeps the generated file from getting too large. |
Good idea. There are a few other things I'd like to add before merging too so I'll add this to the list. |
Btw one of those is an option to exclude the preamble, which can decrease file size a bit. Still, if a change impacts tons of tests we could easily wind up with hundreds of diffs, which is probably not ideal for a CI artifact. At least they seem to compress well. |
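The cap suggested above might look something like this. It is a minimal sketch with a hypothetical helper name and default; it deliberately leaves out the "to dump all the mismatches" hint, since the option for that is not named in this thread.

```python
def cap_diffs(diffs, limit=200):
    """Keep at most `limit` diffs and describe what was dropped.

    Returns (kept, notice); notice is None when nothing was truncated.
    `limit=None` disables the cap. Both the name and the default are
    hypothetical, chosen to mirror the numbers in the comment above.
    """
    total = len(diffs)
    if limit is None or total <= limit:
        return list(diffs), None
    notice = f"Only dumped {limit} of {total} total mismatches."
    return list(diffs[:limit]), notice
```

The notice string could then be rendered at the bottom of the generated page whenever it is not None.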
I will use this but not show it directly. Instead, I'll parse it and show the info on each kernel line, along with possible index type change and number of lines added/removed.
This reduces file size considerably. The original 11MB uncompressed file is now 2.0MB.
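For the "number of lines added/removed" part, one possible way to get those counts is Python's standard difflib. The sketch below is an assumption about how it could be done, not a copy of what the script does, and `diff_stats` is a made-up name.

```python
import difflib


def diff_stats(old_code, new_code):
    """Count lines added and removed between two kernel sources.

    A replaced region counts as both removals (old side) and
    additions (new side), the same way a unified diff would show it.
    """
    old = old_code.splitlines()
    new = new_code.splitlines()
    added = removed = 0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, removed
```

Working from opcodes rather than scanning diff text for leading "+" and "-" avoids miscounting source lines that themselves start with those characters.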
xwang233
left a comment
The generated webpage looks really cool! Thanks for adding this.
It was trivial, and might be helpful for CI?
|
@xwang233 I am not sure what impact this will have on CI when merged. Note that we no longer print diffs to screen by default. We can do that if needed with |
|
!build |
|
!build |
|
Should the python script be called after the bash script? The python script hasn't been integrated into the CI yet. I'll check those later today. |
Yes, |
The --show-diffs arg actually had no effect (oops). Fixed that also.
|
!build |
|
Thanks for that. The failure in nvfuser-ci with reference numbers doesn't necessarily mean it's the codegen-diff job failure. It could be something else flaky in network. Don't worry about that. |
|
Ah OK. If CI fails again I'll just merge without a |
This will show FAILED -> FAILED as well. The only hidden case is now SUCCESS -> SUCCESS
I have been chasing down codegen changes in #840 and #947 and have needed to dig through a lot of spurious diffs. I decided to extend the codegen diff tool to output HTML, and to also modify the diffing a bit. This PR:
- Updates tools/compare_codegen.sh to output env information and to add a ptxas_verbose dump option.
- [...] nvfuser_index_t. If preambles between two runs differ, we report that with a warning and show the diff in the output.
- Adds a --html option to tools/diff_codegen_nvfuser_tests.py, which will write a self-contained HTML file holding all the differing kernels and diffs. To use this option you must have previously run pip install jinja2.
- Adds a --json option to tools/diff_codegen_nvfuser_tests.py, which writes a JSON file containing all the information in the HTML file in an easier-to-parse format.
- Adds a --show-diffs argument.

This lets us communicate code differences easily by sharing these files, which could be generated by our CI. An example output is attached.
GitHub doesn't support uploading HTML, so I have uploaded a zipped example:
codediff_f7786819_feda1e1e_binary_tests.html.zip
Note that this file is probably typical for a medium-sized change: it is 184KB zipped and 2.1MB unzipped.
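To give a feel for what an easier-to-parse JSON report of differing kernels might contain, here is a rough sketch. The record layout (`kernel`, `diff`) and the function name are hypothetical and are not the schema the script actually writes.

```python
import difflib
import json


def build_report(old_kernels, new_kernels):
    """Build a JSON-serializable list of differing kernels.

    old_kernels / new_kernels: dict mapping kernel name -> source text.
    Only kernels present in both runs and whose text differs are reported.
    The field names below are illustrative, not the tool's real schema.
    """
    report = []
    for name in sorted(old_kernels.keys() & new_kernels.keys()):
        old, new = old_kernels[name], new_kernels[name]
        if old == new:
            continue
        diff = "\n".join(
            difflib.unified_diff(
                old.splitlines(),
                new.splitlines(),
                fromfile=f"old/{name}",
                tofile=f"new/{name}",
                lineterm="",
            )
        )
        report.append({"kernel": name, "diff": diff})
    return report


def write_report(report, path):
    """Write the report as indented JSON so it stays readable in review."""
    with open(path, "w") as f:
        json.dump(report, f, indent=2)
```

A file like this can be loaded with `json.load` by other scripts, which is the main advantage over scraping the HTML page.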
Some ideas left out of this PR that might be nice in the future:
- Support not only nvfuser_tests output but also nvfuser_bench and pytest output. We could also fall back to arbitrary command output where we just group everything into one big "test" if we can't associate each kernel with a specific test/benchmark.

Fixes #1007