
Misc. bug: b8143 produces garbage on Mac x86 Vulkan with AMD GPU #20029

@deniskokarev

Description


Name and Version

I have a Mac x86 with a Radeon 6900 XT (RDNA2) GPU and self-built llama.cpp with Vulkan support. Not super fast, but usable since at least b6431, until things broke recently in b8143.

b8142 works, and mistral-11b-omnimix-bf16.Q8_0.gguf produces sensible output.

Operating systems

Mac

Which llama.cpp modules do you know to be affected?

llama-server, llama-cli

Command line

Both llama-cli and llama-server produce garbage:

% ./build/bin/llama-cli --ctx-size 8192 -m ~/shared/models/mistral-11b-omnimix-bf16.Q8_0.gguf -dev Vulkan0 -p "what is population of Russian capital"
ggml_metal_device_init: tensor API disabled for pre-M5 and pre-A19 devices
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: loaded in 0.028 sec
ggml_metal_rsets_init: creating a residency set collection (keep_alive = 180 s)
ggml_metal_device_init: GPU name:   MTL0
ggml_metal_device_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_device_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_device_init: simdgroup reduction   = true
ggml_metal_device_init: simdgroup matrix mul. = false
ggml_metal_device_init: has unified memory    = false
ggml_metal_device_init: has bfloat            = true
ggml_metal_device_init: has tensor            = false
ggml_metal_device_init: use residency sets    = true
ggml_metal_device_init: use shared buffers    = false
ggml_metal_device_init: recommendedMaxWorkingSetSize  = 17163.09 MB
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6900 XT (MoltenVK) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: none

Loading model...  


▄▄ ▄▄
██ ██
██ ██  ▀▀█▄ ███▄███▄  ▀▀█▄    ▄████ ████▄ ████▄
██ ██ ▄█▀██ ██ ██ ██ ▄█▀██    ██    ██ ██ ██ ██
██ ██ ▀█▄██ ██ ██ ██ ▀█▄██ ██ ▀████ ████▀ ████▀
                                    ██    ██
                                    ▀▀    ▀▀

build      : b8143-aa6f918c1
model      : mistral-11b-omnimix-bf16.Q8_0.gguf
modalities : text

available commands:
  /exit or Ctrl+C     stop or exit
  /regen              regenerate the last response
  /clear              clear the chat history
  /read               add a text file


> what is population of Russian capital

the capital of dévelopment ofץ

[ Prompt: 15.3 t/s | Generation: 41.0 t/s ]

Problem description & steps to reproduce

Running any question more complex than just "Hi" produces garbage:

./build/bin/llama-cli --ctx-size 8192 -m ~/shared/models/mistral-11b-omnimix-bf16.Q8_0.gguf -dev Vulkan0 -p "what is population of Russian capital"

First Bad Commit

b8143 is broken; b8142 still works fine.
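Since the regression window is a single build, the offending change can be narrowed down with the release tags. A minimal sketch, assuming a llama.cpp clone where the release tags b8142 (good) and b8143 (bad) are present:

```shell
cd llama.cpp

# List the commit(s) that landed between the two consecutive builds;
# for adjacent build tags this may already name a single suspect.
git log --oneline b8142..b8143

# For a wider range, a standard bisect: mark bad first, then good.
git bisect start b8143 b8142

# At each bisect step: rebuild with Vulkan and re-run the failing prompt.
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j
./build/bin/llama-cli --ctx-size 8192 \
  -m ~/shared/models/mistral-11b-omnimix-bf16.Q8_0.gguf \
  -dev Vulkan0 -p "what is population of Russian capital"

# Mark the result and repeat until git names the first bad commit.
git bisect good   # or: git bisect bad
git bisect reset  # when finished
```

The build string `b8143-aa6f918c1` in the log above suggests the bad build corresponds to commit aa6f918c1, so the `git log` step may make a full bisect unnecessary.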

Relevant log output

Logs
> what is population of Russian capital

the capital of dévelopment ofץ


    Labels

    AMD GPU (Issues specific to AMD GPUs)
    Vulkan (Issues specific to the Vulkan backend)
    bug-unconfirmed
    macos (Issues specific to macOS)
