Skip to content

llama : default n_swa for phi-3#8931

Merged
ngxson merged 3 commits intoggml-org:masterfrom
ngxson:xsn/phi-3-default-swa
Aug 10, 2024
Merged

llama : default n_swa for phi-3#8931
ngxson merged 3 commits intoggml-org:masterfrom
ngxson:xsn/phi-3-default-swa

Conversation

@ngxson
Copy link
Copy Markdown
Contributor

@ngxson ngxson commented Aug 8, 2024

Related to #8627

phi3.attention.sliding_window is marked as required, this break many existing models as some users can't re-convert the new version (ref: ngxson/wllama#106)

This PR propose default value for certain models:

model n_layers n_ctx_train n_swa
Phi-3-mini-4k-instruct 32 4096 2047
Phi-3-medium-4k-instruct 40 4096 2047
Phi-3-mini-128k-instruct 32 (n_head_kv = 32) 131072 262144
Phi-3-medium-128k-instruct 40 131072 131072
Phi-3-small-8k-instruct 32 (n_head_kv = 8) 8192 none
Phi-3-small-128k-instruct 32 (n_head_kv = 8) 131072 none

@ngxson ngxson marked this pull request as ready for review August 8, 2024 12:22
@ngxson ngxson requested a review from ggerganov August 8, 2024 12:22
@ngxson ngxson changed the title default n_swa for phi-3 llama : default n_swa for phi-3 Aug 8, 2024
Comment thread src/llama.cpp Outdated
// default value for Phi-3-medium-128k-instruct
hparams.n_swa = 131072;
}
ml.get_key(LLM_KV_ATTENTION_SLIDING_WINDOW, hparams.n_swa, false);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to handle get_key returning false and hparams.n_swa remaining uninitialized to 0 by throwing an error, though probably never going to happen

Copy link
Copy Markdown
Contributor Author

@ngxson ngxson Aug 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially thought that we can have the case you mentioned with a phi-3-small model, but turns out it's not supported yet. (ref: #7439)

So I agree with what you said. The check is added in 2913e5f

@mofosyne mofosyne added the Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix label Aug 9, 2024
@ngxson ngxson added the merge ready A maintainer can use this label to indicate that they consider the changes final and ready to merge. label Aug 10, 2024
@ngxson ngxson merged commit 7eb2384 into ggml-org:master Aug 10, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
* default n_swa for phi-3

* fix

* double check swa
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
* default n_swa for phi-3

* fix

* double check swa
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* default n_swa for phi-3

* fix

* double check swa
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge ready A maintainer can use this label to indicate that they consider the changes final and ready to merge. Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants