Prerequisites
Feature Description
Optimization Brief: AMD Strix Halo (RDNA 3.5) Hardware Enhancements
Now that the Vulkan Sparse Allocator is stable and the upstream Qwen3.5 caching bug is resolved, we have an unconstrained dual-model engine. The next phase is optimizing llama.cpp to fully exploit the RDNA 3.5 Unified Memory Architecture (UMA) for our agent harness.
Please review these four hardware-level enhancements, ranked by implementation effort, and prepare to integrate them.
4. Transition to the Windows HIP (ROCm) SDK
The Target: Vulkan is a graphics API running compute shaders translated through SPIR-V. AMD’s native compute language is HIP. Compiling llama.cpp on Windows with the HIP SDK links the engine directly to hipBLAS/rocBLAS, whose hand-tuned kernels are compiled natively for the RDNA 3.5 ISA, giving far tighter register allocation and scheduling than the translated shader path.
Motivation
Effort: Very High (9/10)
Expected Improvement: 15% - 25% overall token generation throughput uplift (target: ~46-50 t/s).
Possible Implementation
The Implementation:
This requires swapping the build toolchain entirely: omit the Vulkan CMake options and configure the build exclusively for the HIP backend.
Prerequisites for the environment (a quick PATH sanity check follows this list):
- Install AMD HIP SDK for Windows.
- Install Strawberry Perl (required for the hipcc compiler wrapper).
- Ensure MSVC and Ninja are in the PATH.
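Before configuring, a short PowerShell sanity check can confirm these pieces are reachable. This is a sketch: it assumes the default HIP SDK 5.7 install path used in the build script below.
# Confirm the HIP SDK clang toolchain exists at the path the build script expects
Test-Path "C:\Program Files\AMD\ROCm\5.7\bin\clang.exe"
Test-Path "C:\Program Files\AMD\ROCm\5.7\bin\clang++.exe"
# Confirm Strawberry Perl, MSVC (cl), and Ninja all resolve on PATH
Get-Command perl, cl, ninja | Format-Table Name, Source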
The Build Script:
# Set compiler paths explicitly to the HIP SDK LLVM toolchain
$env:CC  = "C:\Program Files\AMD\ROCm\5.7\bin\clang.exe"
$env:CXX = "C:\Program Files\AMD\ROCm\5.7\bin\clang++.exe"
$env:HIP_PATH = "C:\Program Files\AMD\ROCm\5.7"

# Note: gfx1151 is the specific microarchitecture target for Strix Halo / RDNA 3.5.
# LLAMA_HIPBLAS is the HIP switch on this tree; newer llama.cpp trees rename it to GGML_HIP.
# With the Ninja Multi-Config generator the configuration is chosen at build time via
# --config, so no CMAKE_BUILD_TYPE is needed here.
cmake -G "Ninja Multi-Config" -S . -B build_hip `
    -DLLAMA_HIPBLAS=ON `
    -DAMDGPU_TARGETS="gfx1151" `
    -DCMAKE_C_COMPILER="$env:CC" `
    -DCMAKE_CXX_COMPILER="$env:CXX"

cmake --build build_hip --config Release
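To verify the resulting binary actually engages the HIP backend rather than silently falling back to CPU, a quick llama-bench run can be used. This is a sketch: the binary path assumes the Ninja Multi-Config output layout, and the model path is a placeholder to adjust for your tree.
# Ninja Multi-Config places binaries under bin\<Config>; adjust if your layout differs.
# -ngl 99 offloads all layers to the GPU; the backend column in the output should
# report ROCm/HIP rather than Vulkan or CPU.
.\build_hip\bin\Release\llama-bench.exe -m .\models\your-model.gguf -ngl 99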