Clarify default MMQ for CUDA and LLAMA_CUDA_FORCE_MMQ flag by isaac-mcfadyen · Pull Request #8115 · ggml-org/llama.cpp

isaac-mcfadyen · 2024-06-25T16:58:55Z

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

Summary

In #8075 ("CUDA: use MMQ instead of cuBLAS by default"), MMQ was enabled by default on GPUs with int8 tensor core support.
A short description of the LLAMA_CUDA_FORCE_MMQ flag was added to the README. As it currently stands, though, the wording makes it seem like MMQ will not be used unless it is enabled with the flag.
The message says "flag forces MMQ to be enabled on GPUs without int8 support" but doesn't really say that it will be enabled by default on GPUs with int8 support.
This PR just adds a short blurb stating that MMQ is enabled by default on GPUs with int8 tensor core support, and that the flag forces it for all GPUs.
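
To make the intended reading concrete, here is a minimal sketch of the rule as this summary describes it. The function name and its boolean parameters are hypothetical and are not taken from the llama.cpp source; the real kernel selection may depend on further factors that this PR does not discuss.

```python
def mmq_selected(has_int8_tensor_cores: bool, force_mmq: bool) -> bool:
    """Model of the README wording only (hypothetical helper, not llama.cpp code).

    force_mmq stands in for building with LLAMA_CUDA_FORCE_MMQ.
    """
    # The flag forces MMQ for all GPUs; without it, MMQ is the default
    # only on GPUs with int8 tensor core support.
    return force_mmq or has_int8_tensor_cores


if __name__ == "__main__":
    # Print the four combinations so the default and forced cases are visible.
    for int8 in (True, False):
        for forced in (True, False):
            print(f"int8 tensor cores={int8}, flag set={forced} -> MMQ={mmq_selected(int8, forced)}")
```

The point of the clarification is the first case: on a GPU with int8 tensor core support, MMQ is selected even when the flag is not set.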

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

…8115)

* Add message about int8 support
* Add suggestions from review

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

Add message about int8 support

b6cd699

mofosyne added the "Review Complexity : Low" label (trivial changes to code that most beginner devs, or those who want a break, can tackle, e.g. a UI fix) on Jun 25, 2024

slaren approved these changes Jun 25, 2024

slaren requested a review from JohannesGaessler June 25, 2024 20:13

JohannesGaessler requested changes Jun 25, 2024

Comment thread README.md Outdated

Add suggestions from review

37ff709

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

JohannesGaessler approved these changes Jun 26, 2024

JohannesGaessler merged commit 8854044 into ggml-org:master Jun 26, 2024

JohannesGaessler merged 2 commits into ggml-org:master from isaac-mcfadyen:mmq-readme-update