For example, they show already-compiled PTX assembly for CUDA kernels instead of Stmt IR, because those kernels have already been offloaded by the time the output is written. As more and more of codegen creeps into lowering, this problem will get worse. We need to identify a point in lowering at which the Stmt should be preserved for stmt and stmt_html output. I propose just after the custom passes and before Hexagon kernel offload.
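To make the proposal concrete, here is a rough, self-contained sketch of the idea. It does not use Halide's real types or pass list: `ToyStmt`, `Pass`, `LoweringResult`, `lower`, and the pass names are all hypothetical stand-ins. It only illustrates snapshotting the statement at a named point so that later offload passes can't affect what gets printed.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an IR statement. This is NOT Halide's Stmt.
struct ToyStmt {
    std::string text;
};

// A named lowering pass (hypothetical; real lowering is not structured like this).
struct Pass {
    std::string name;
    std::function<ToyStmt(const ToyStmt &)> run;
};

struct LoweringResult {
    ToyStmt final_stmt;       // what codegen would consume
    ToyStmt stmt_for_output;  // what stmt / stmt_html would print
};

// Run every pass in order, keeping a copy of the statement as it was
// immediately after the pass named `snapshot_after`. If no pass has that
// name, stmt_for_output is left as the un-lowered input.
LoweringResult lower(const ToyStmt &input,
                     const std::vector<Pass> &passes,
                     const std::string &snapshot_after) {
    LoweringResult result{input, input};
    ToyStmt s = input;
    for (const Pass &p : passes) {
        s = p.run(s);
        if (p.name == snapshot_after) {
            result.stmt_for_output = s;
        }
    }
    result.final_stmt = s;
    return result;
}

// Example pipeline: the offload pass replaces readable IR with an opaque
// blob, but the snapshot taken after the custom passes is unaffected.
LoweringResult run_demo() {
    std::vector<Pass> passes = {
        {"simplify", [](const ToyStmt &s) { return ToyStmt{s.text + " [simplified]"}; }},
        {"custom_passes", [](const ToyStmt &s) { return ToyStmt{s.text + " [custom]"}; }},
        {"offload_kernels", [](const ToyStmt &) { return ToyStmt{"<opaque device code>"}; }},
    };
    return lower(ToyStmt{"for x: f(x) = x"}, passes, "custom_passes");
}
```

In this toy version, `run_demo().stmt_for_output` still holds readable IR while `final_stmt` holds the offloaded blob. Where exactly the snapshot goes in the real lowering is the open question.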
See #7507