metal : move mm_id indices to shared mem #5982

Merged
ggerganov merged 1 commit into master from gg/metal-mm-id-shared on Mar 10, 2024

Conversation

@ggerganov
Member

fix #5070

MoE models now support batch sizes of up to 4096 with Metal.

@ggerganov ggerganov merged commit bb6d00b into master Mar 10, 2024
@ggerganov ggerganov deleted the gg/metal-mm-id-shared branch March 10, 2024 21:12
NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026

Development

Successfully merging this pull request may close these issues.

ggml : support bs > 512 for Metal ggml_mul_mat_id
