Currently, it checks the entire generated string, which contains both the helper functions and the __global__ kernel. I think we should only check the kernel itself. Otherwise, any PR that adds a new helper function (such as #995) will fail all codegen_diff checks (see this example pipeline: https://github.com/NVIDIA/Fuser/actions/runs/6365268796/job/17282142084).
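
To illustrate the idea, here is a minimal sketch of how the kernel could be isolated from the generated string before comparing. This is not the actual implementation in this PR: the function name `extract_kernel` is hypothetical. It assumes the generated source contains a single `__global__` kernel and that braces do not appear unbalanced inside comments or string literals.

```python
def extract_kernel(source: str) -> str:
    """Return only the __global__ kernel from a generated CUDA source string.

    Hypothetical helper for illustration. Assumes one kernel per string and
    no unbalanced braces inside comments or string literals.
    """
    start = source.find("__global__")
    if start == -1:
        raise ValueError("no __global__ kernel found")

    # Locate the opening brace of the kernel body.
    open_idx = source.find("{", start)
    if open_idx == -1:
        raise ValueError("kernel has no body")

    # Walk forward, tracking brace depth, until the body closes.
    depth = 0
    for i in range(open_idx, len(source)):
        ch = source[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return source[start : i + 1]

    raise ValueError("unbalanced braces in kernel body")
```

With something like this, two generated strings that differ only in their helper functions would yield the same extracted kernel, so adding a helper would no longer show up as a diff.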