
llama : fix compatibility with old 2 expert models #6735

Merged
ggerganov merged 1 commit into master from sl/moe-extra-tensors-fix on Apr 18, 2024

Conversation

@slaren (Member) commented on Apr 18, 2024

The correct number of extra tensors needed is 3 per layer (the merged gate, down and up expert tensors), not n_expert, so the loader would not allocate enough tensors for models with 2 experts.

Fixes the second issue reported in #6379
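
For context, a minimal sketch of the sizing bug being fixed (the overhead constant and function names here are illustrative assumptions, not the actual llama.cpp diff): old MoE checkpoints store per-expert tensors that the loader merges into 3 stacked tensors per layer, so the context must reserve overhead for 3 extra tensors per layer regardless of n_expert. Reserving n_expert slots per layer happens to over-allocate for 3 or more experts, but under-allocates for 2.

```cpp
// Illustrative sketch only, not the exact llama.cpp change.
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for per-tensor bookkeeping size
// (ggml_tensor_overhead() in ggml); the exact value varies.
constexpr size_t tensor_overhead = 368;

// Old (buggy): reserved n_expert extra tensor slots per layer.
// For n_expert == 2 this is fewer than the 3 merged tensors created.
size_t extra_ctx_size_old(size_t n_layer, size_t n_expert) {
    return tensor_overhead * n_expert * n_layer;
}

// Fixed: the loader always creates exactly 3 merged tensors per layer
// (e.g. ffn_gate_exps, ffn_down_exps, ffn_up_exps), independent of n_expert.
size_t extra_ctx_size_fixed(size_t n_layer) {
    return tensor_overhead * 3 * n_layer;
}

int main() {
    // With 2 experts, the old formula under-reserves: 2 slots/layer vs 3 needed.
    const size_t n_layer = 32, n_expert = 2;
    std::printf("old: %zu bytes, fixed: %zu bytes\n",
                extra_ctx_size_old(n_layer, n_expert),
                extra_ctx_size_fixed(n_layer));
    return 0;
}
```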

@ggerganov merged commit c71bfd7 into master on Apr 18, 2024
@slaren deleted the sl/moe-extra-tensors-fix branch on Apr 18, 2024 at 11:08
