
Feature Request: Transition to the Windows HIP (ROCm) SDK #4

@neilopet

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Optimization Brief: AMD Strix Halo (RDNA 3.5) Hardware Enhancements

Now that the Vulkan Sparse Allocator is stable and the upstream Qwen3.5 caching bug is resolved, we have an unconstrained dual-model engine. The next phase is optimizing llama.cpp to fully exploit the RDNA 3.5 Unified Memory Architecture (UMA) for our agent harness.

Please review these four hardware-level enhancements, ranked by implementation effort, and prepare to integrate them.


4. Transition to the Windows HIP (ROCm) SDK

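The Target: The Vulkan backend runs compute shaders compiled to SPIR-V through a general-purpose graphics and compute API. AMD’s native compute stack is HIP. Building llama.cpp on Windows with the HIP SDK links the engine against hipBLAS/rocBLAS, which ship hand-tuned, assembly-level matrix kernels for specific GPU targets, so matrix-heavy work can use kernels tuned for the architecture rather than generic shaders. This only pays off if the installed SDK release actually includes tuned kernels for gfx1151 (RDNA 3.5), which should be confirmed first.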

Motivation

Effort: Very High (9/10)
Expected Improvement: an estimated 15-25% uplift in token-generation throughput (target: ~46-50 t/s).
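The two figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the baseline is derived from the stated target and uplift rather than taken from a measurement. Both ends of the range work out to a baseline of roughly 40 t/s.

```python
def implied_baseline(target_tps, uplift):
    """Baseline throughput implied by a target rate and a fractional uplift."""
    return target_tps / (1.0 + uplift)


def meets_target(measured_tps, target_low=46.0):
    """True when a measured rate reaches the low end of the stated target."""
    return measured_tps >= target_low


if __name__ == "__main__":
    # 46 t/s at +15% and 50 t/s at +25% are the figures from the issue text.
    low = implied_baseline(46.0, 0.15)
    high = implied_baseline(50.0, 0.25)
    print(f"implied baseline: {low:.1f} to {high:.1f} t/s")
```

A HIP build would count as a success under this brief only if a benchmark on the same model and settings reports at least the low end of the target.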

Possible Implementation

The Implementation:
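This requires swapping the build toolchain entirely. Configure a separate build directory with the HIP backend enabled and the Vulkan backend options (for example -DGGML_VULKAN=ON) left out, so the resulting binaries are compiled for HIP only.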

Prerequisites for the environment:

  1. Install AMD HIP SDK for Windows.
  2. Install Strawberry Perl (required for the hipcc compiler wrapper).
  3. Ensure MSVC and Ninja are in the PATH.
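A quick pre-flight check can catch a missing prerequisite before a long configure step. This is a minimal sketch, not part of llama.cpp: the function name is hypothetical, and it assumes the HIP SDK installer has set the HIP_PATH environment variable and that MSVC, Ninja, and Perl are reachable as `cl`, `ninja`, and `perl`.

```python
import os
import shutil

# Executable names assumed for the prerequisites listed above:
# cl = MSVC compiler driver, ninja = Ninja, perl = Strawberry Perl.
REQUIRED_TOOLS = ("cl", "ninja", "perl")


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of problems; an empty list means the checks passed."""
    env = os.environ if env is None else env
    problems = []

    # The HIP SDK location is assumed to be published through HIP_PATH.
    if not env.get("HIP_PATH"):
        problems.append("HIP_PATH is not set (is the HIP SDK installed?)")

    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("MISSING:", problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.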

The Build Script:

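# Point the build at the HIP SDK's LLVM toolchain.
# The 5.7 directory is an example; it must match the installed SDK release,
# and that release must support the gfx1151 target.
$env:HIP_PATH = "C:\Program Files\AMD\ROCm\5.7"
$env:CC = "$env:HIP_PATH\bin\clang.exe"
$env:CXX = "$env:HIP_PATH\bin\clang++.exe"
$env:PATH = "$env:HIP_PATH\bin;$env:PATH"

# Note: gfx1151 is the specific microarchitecture target for Strix Halo / RDNA 3.5.
# GGML_HIP replaces the deprecated LLAMA_HIPBLAS option in current llama.cpp.
# CMAKE_BUILD_TYPE is ignored by multi-config generators, so the configuration
# is selected at build time with --config instead.
# The compiler arguments are quoted because the paths contain spaces.
cmake -G "Ninja Multi-Config" -S . -B build_hip `
    -DGGML_HIP=ON `
    -DAMDGPU_TARGETS="gfx1151" `
    "-DCMAKE_C_COMPILER=$env:CC" `
    "-DCMAKE_CXX_COMPILER=$env:CXX"

cmake --build build_hip --config Release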
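After configuring, it can be worth confirming that the build directory really is HIP-only. The sketch below reads the `KEY:TYPE=VALUE` entries that CMake writes to CMakeCache.txt. The helper names are hypothetical, and the option names it looks for (GGML_HIP and its older spellings, GGML_VULKAN and LLAMA_VULKAN) are assumptions about the llama.cpp tree being built.

```python
import os

# Option names are assumptions; llama.cpp has renamed these flags over time.
HIP_OPTIONS = ("GGML_HIP", "GGML_HIPBLAS", "LLAMA_HIPBLAS")
VULKAN_OPTIONS = ("GGML_VULKAN", "LLAMA_VULKAN")
TRUTHY = {"ON", "1", "TRUE", "YES", "Y"}


def parse_cmake_cache(text):
    """Parse CMakeCache.txt content (KEY:TYPE=VALUE lines) into a dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and the two comment styles CMake writes to the cache.
        if not line or line.startswith(("#", "//")):
            continue
        key_and_type, sep, value = line.partition("=")
        if not sep:
            continue
        entries[key_and_type.split(":", 1)[0]] = value
    return entries


def is_hip_only(entries):
    """True when a HIP option is enabled and no Vulkan option is."""
    def enabled(name):
        return entries.get(name, "").upper() in TRUTHY
    hip = any(enabled(name) for name in HIP_OPTIONS)
    vulkan = any(enabled(name) for name in VULKAN_OPTIONS)
    return hip and not vulkan


if __name__ == "__main__":
    cache_path = os.path.join("build_hip", "CMakeCache.txt")
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as handle:
            cache = parse_cmake_cache(handle.read())
        print("HIP-only build:", is_hip_only(cache))
        print("GPU targets:", cache.get("AMDGPU_TARGETS", "(not set)"))
```

Run it from the repository root so that the relative `build_hip` path resolves to the directory used in the build script.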
