
[Issue]: Paged Attention v1 Kernel Compilation Issue #637

@tjtanaa

Problem Description

AITER commit: ea160c767739ebc437e928938ce4e91a4b78fbd9 (Jul 9, 2025)

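When running vLLM, AITER triggers removal of /root/.aiter. The removal fails with "Directory not empty" errors, and compiling the paged attention v1 kernel then raises a shutil.Error while copying the include files: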

rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck': Directory not empty
rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck/tensor_operation/gpu/block': Directory not empty
rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck/tensor_operation/gpu/block': Directory not empty
rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck/tensor_operation/gpu/device/impl': Directory not empty
rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck/tensor_operation/gpu/block': Directory not empty
rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck/tensor_operation/gpu/device/impl': Directory not empty
rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck/tensor_operation/gpu/grid': Directory not empty
rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck/tensor_operation/gpu/device/impl': Directory not empty
rm: cannot remove '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/ck/tensor_operation/gpu/grid': Directory not empty
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522] WorkerProc hit an exception.
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522] Traceback (most recent call last):
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/v1/executor/multiproc_executor.py", line 517, in worker_busy_loop
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     output = func(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return func(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/v1/worker/gpu_worker.py", line 317, in execute_model
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     output = self.model_runner.execute_model(scheduler_output,
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return func(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/v1/worker/gpu_model_runner.py", line 1374, in execute_model
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     model_output = self.model(
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return forward_call(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/model_executor/models/mllama4.py", line 850, in forward
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self.language_model(input_ids, positions, intermediate_tensors,
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return forward_call(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/model_executor/models/llama.py", line 584, in forward
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     model_output = self.model(input_ids, positions, intermediate_tensors,
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/compilation/decorators.py", line 246, in __call__
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     model_output = self.forward(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/model_executor/models/llama.py", line 368, in forward
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     def forward(
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return forward_call(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 838, in _fn
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return fn(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 830, in call_wrapped
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self._wrapped_call(self, *args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 406, in __call__
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     raise e
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 393, in __call__
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return forward_call(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "<eval_with_key>.98", line 350, in forward
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     submod_1 = self.submod_1(getitem, s0, getitem_1, getitem_2, getitem_3);  getitem = getitem_1 = getitem_2 = submod_1 = None
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 830, in call_wrapped
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self._wrapped_call(self, *args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 406, in __call__
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     raise e
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 393, in __call__
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return forward_call(*args, **kwargs)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "<eval_with_key>.2", line 5, in forward
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     unified_attention_with_output = torch.ops.vllm.unified_attention_with_output(query_2, key_2, value, output_1, 'language_model.model.layers.0.self_attn.attn');  query_2 = key_2 = value = output_1 = unified_attention_with_output = None
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 1158, in __call__
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return self._op(*args, **(kwargs or {}))
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/attention/layer.py", line 452, in unified_attention_with_output
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     self.impl.forward(self,
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/upgrade-aiter/vllm/v1/attention/backends/rocm_aiter_fa.py", line 586, in forward
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     aiter.paged_attention_v1(
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/aiter/aiter/ops/attention.py", line 139, in paged_attention_v1
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     paged_attention_v1_core(
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/aiter/csrc/cpp_itfs/pa/pa_v1.py", line 126, in paged_attention_v1
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     func = compile(
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/aiter/csrc/cpp_itfs/pa/pa_v1.py", line 27, in compile
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return compile_template_op(
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/aiter/csrc/cpp_itfs/utils.py", line 213, in compile_template_op
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     compile_lib(src_file, folder, includes, sources, cxxflags)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/app/upgradeupstreamaiter/aiter/csrc/cpp_itfs/utils.py", line 108, in compile_lib
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     shutil.copytree(include, include_dir, dirs_exist_ok=True)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/lib/python3.10/shutil.py", line 559, in copytree
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]   File "/usr/lib/python3.10/shutil.py", line 513, in _copytree
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522]     raise Error(errors)
(VllmWorker rank=1 pid=145676) ERROR 07-10 09:03:26 [multiproc_executor.py:522] shutil.Error: [('/app/upgradeupstreamaiter/aiter/csrc/include/attention_asm_mla.h', '/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/attention_asm_mla.h', '[Errno 2] No such file or directory')]
hipcc -fPIC -DUSE_ROCM -DENABLE_FP8 -O3 -std=c++17 -DLEGACY_HIPBLAS_DIRECT -DUSE_PROF_API=1 -D__HIP_PLATFORM_HCC__=1 -D__HIP_PLATFORM_AMD__=1 -U__HIP_NO_HALF_CONVERSIONS__ -U__HIP_NO_HALF_OPERATORS__ -mllvm --amdgpu-kernarg-preload-count=16 -Wno-unused-result -Wno-switch-bool -Wno-vla-cxx-extension -Wno-undefined-func-template -fgpu-flush-denormals-to-zero -fno-offload-uniform-block -mllvm -enable-post-misched=0 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false --offload-arch=gfx942 -I/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include -c pa_v1_c805a96fb3d3023acaa0925ad6c4e1de.cpp -o pa_v1_c805a96fb3d3023acaa0925ad6c4e1de.o
In file included from pa_v1_c805a96fb3d3023acaa0925ad6c4e1de.cpp:1:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_v1.cuh:19:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_kernels.cuh:3:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_common.cuh:9:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/quant_utils.cuh:19:
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:78:1: warning: non-void function does not return a value [-Wreturn-type]
   78 | }
      | ^
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:87:1: warning: non-void function does not return a value [-Wreturn-type]
   87 | }
      | ^
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:96:1: warning: non-void function does not return a value [-Wreturn-type]
   96 | }
      | ^
In file included from pa_v1_c805a96fb3d3023acaa0925ad6c4e1de.cpp:1:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_v1.cuh:19:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_kernels.cuh:3:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_common.cuh:9:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/quant_utils.cuh:19:
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:78:1: warning: non-void function does not return a value [-Wreturn-type]
   78 | }
      | ^
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:87:1: warning: non-void function does not return a value [-Wreturn-type]
   87 | }
      | ^
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:96:1: warning: non-void function does not return a value [-Wreturn-type]
   96 | }
      | ^
3 warnings generated when compiling for gfx942.
3 warnings generated when compiling for gfx942.
In file included from pa_v1_c805a96fb3d3023acaa0925ad6c4e1de.cpp:1:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_v1.cuh:19:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_kernels.cuh:3:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_common.cuh:9:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/quant_utils.cuh:19:
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:78:1: warning: non-void function does not return a value [-Wreturn-type]
   78 | }
      | ^
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:87:1: warning: non-void function does not return a value [-Wreturn-type]
   87 | }
      | ^
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:96:1: warning: non-void function does not return a value [-Wreturn-type]
   96 | }
      | ^
3 warnings generated when compiling for host.
hipcc -shared pa_v1_c805a96fb3d3023acaa0925ad6c4e1de.o -o lib.so
In file included from pa_v1_c805a96fb3d3023acaa0925ad6c4e1de.cpp:1:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_v1.cuh:19:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_kernels.cuh:3:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/pa_common.cuh:9:
In file included from /root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/quant_utils.cuh:19:
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:78:1: warning: non-void function does not return a value [-Wreturn-type]
   78 | }
      | ^
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:87:1: warning: non-void function does not return a value [-Wreturn-type]
   87 | }
      | ^
/root/.aiter/build/pa_v1_c805a96fb3d3023acaa0925ad6c4e1de/include/vec_convert.h:96:1: warning: non-void function does not return a value [-Wreturn-type]
   96 | }
      | ^
3 warnings generated when compiling for host.
hipcc -shared pa_v1_c805a96fb3d3023acaa0925ad6c4e1de.o -o lib.so
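
The logs above are consistent with, though they do not prove, several workers building the same pa_v1 folder at the same time: one process removes the build directory while another is still inside shutil.copytree, and the hipcc output appears more than once. A minimal sketch of one possible mitigation, assuming that diagnosis is right, is to serialize the copy step with an inter-process file lock. The names build_lock and copy_includes_locked and the lock path are hypothetical and are not part of AITER. The sketch uses fcntl, so it is POSIX-only.

```python
import fcntl
import os
import shutil
from contextlib import contextmanager


@contextmanager
def build_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path while the block runs."""
    os.makedirs(os.path.dirname(lock_path) or ".", exist_ok=True)
    with open(lock_path, "w") as lock_file:
        # Blocks until every other process holding the lock has released it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def copy_includes_locked(include, include_dir, lock_path):
    """Hypothetical wrapper (not AITER code): copy headers under the lock."""
    with build_lock(lock_path):
        # Same call as in the traceback, but only one process runs it at a time.
        shutil.copytree(include, include_dir, dirs_exist_ok=True)
```

For this to help, whatever step removes the build folder would have to take the same lock, and the lock file would need to live outside the folder being removed.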

Operating System

Ubuntu 22.04.4 LTS (Jammy Jellyfish)

CPU

AMD EPYC 9474F 48-Core Processor

GPU

AMD Instinct MI300X

ROCm Version

ROCm 6.4

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
