infill : add download instructions for model #6626
This commit adds instructions on how to download a CodeLlama model using the `hf.sh` script. This will download the model and place it in the `models` directory, which is the same model used later by the infill example. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
### Example

Download a CodeLlama model:
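For reference, a download command of the kind this PR describes might look like the following; the exact flags of `hf.sh` (`--repo`, `--file`, `--outdir`) and the specific GGUF file chosen here are assumptions for illustration, not taken from the diff:

```shell
# Hypothetical invocation of llama.cpp's hf.sh helper script.
# The --repo/--file/--outdir flags and the model file name are illustrative.
./scripts/hf.sh --repo TheBloke/CodeLlama-13B-GGUF \
                --file codellama-13b.Q5_K_M.gguf \
                --outdir models
```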
Clarify that we download CodeLlama because it has infill support.
We can improve this by using the latest CodeGemma models - I think they are smaller and also have infill support. It would also be a good exercise to verify that we support them.
I've updated the text to clarify the usage of CodeLlama now 👍
Regarding CodeGemma, I tried out a smaller model, `codegemma-2b-f16.gguf`, which did not produce a good result. I'll try a larger model, `codegemma-7b-it-f16.gguf`, and see if it works (if I can run it). One thing I've noticed is that these models are gated, so I was not able to use the `hf.sh` script to download them but had to download them manually.
Actually, looking into this a little closer, it looks like CodeGemma is using different ids for its special tokens, whereas the ones that are currently specified in llama.cpp are:

```cpp
id special_prefix_id = 32007;
id special_middle_id = 32009;
id special_suffix_id = 32008;
id special_eot_id    = 32010;
```

For CodeGemma models perhaps this should be something like:

```cpp
id special_prefix_id = 67;
id special_middle_id = 68;
id special_suffix_id = 69;
id special_eot_id    = 70;
```

Should there be a check for the CodeGemma model name in `llm_load_vocab` to handle this perhaps, or what would be a good way to address this issue?
Ah yes, this should be fixed - likely we need to add these special tokens in the GGUF meta data and start using them. Could you verify that using the correct token ids (by hardcoding them) produces correct results?
> Could you verify that using the correct token ids (by hardcoding them) produces correct results?
The following is the output when using the special token ids from CodeGemma:
```console
./infill -t 10 -ngl 15 -m models/codegemma-7b-it-f16.gguf -c 4096 --temp 0 --repeat_penalty 1.0 -n 20 --in-prefix "def helloworld():\n print(\"hell" --in-suffix "\n print(\"goodbye world\")\n "
...
##### Infill mode #####
<|fim_prefix|> def helloworld():\n print("hell<|fim_suffix|> \n print("goodbye world")\n <|fim_middle|>
o world")\n print("goodbye world")\n print("hello world")\<|file_separator|>
```
This looks better I think, and more in line with what the CodeLlama model outputs.
I'd be happy to take a stab at adding these special tokens to the GGUF meta data, if that is alright?
Need to add these in the GGUF meta and then use inside llama.cpp.
Btw, the example that you posted seems kind of ok, but I would have expected it, after finishing the hello world print, to create a new function `def goodbyeworld():`. There might be some other issues still
> Need to add these in the GGUF meta and then use inside llama.cpp.
Could you give me a few pointers as to where this metadata is added? Is it added in `gguf.py`, like in `constants.py`, similar to the other special tokens like `BOS_ID`, `EOS_ID`, `UNK_ID`, etc.?
I think I've figured this out and will dig in a little more and open a pull request.
> There might be some other issues still
I'll take a closer look at this as part of adding the metadata and see if I can figure out what is wrong.
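As a sketch of what the added metadata might look like: the key names below follow the existing `tokenizer.ggml.*` convention used for `bos_token_id` and friends, but the exact names llama.cpp ended up using are not taken from this thread and should be treated as assumptions:

```python
# Hypothetical GGUF key/value entries carrying the CodeGemma FIM token ids,
# so llama.cpp could read them at load time instead of hardcoding them.
FIM_METADATA = {
    "tokenizer.ggml.prefix_token_id": 67,
    "tokenizer.ggml.middle_token_id": 68,
    "tokenizer.ggml.suffix_token_id": 69,
    "tokenizer.ggml.eot_token_id":    70,
}
```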
I just realized that I downloaded the instruction following (it) model which is not the code completion model 😞
Using the code completion CodeGemma I get the following output:
```console
$ ./infill -t 10 -ngl 0 -m models/codegemma-7b-f16.gguf -c 4096 --temp 0.7 --repeat_penalty 1.1 -n 20 --in-prefix "def helloworld():\n print(\"hell" --in-suffix "\n print(\"goodbye world\")\n "
...
##### Infill mode #####
<|fim_prefix|> def helloworld():\n print("hell<|fim_suffix|> \n print("goodbye world")\n <|fim_middle|>o,world!")<|file_separator|>
```

Clarify the reason for using CodeLlama. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
* infill : add download instructions for model This commit adds instructions on how to download a CodeLlama model using the `hf.sh` script. This will download the model and place it in the `models` directory which is the same model use later by the infill example. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com> * squash! infill : add download instructions for model Clarify the reason for using CodeLlama. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com> --------- Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>