
Add support for Chameleon #8543

Merged

ggerganov merged 24 commits into ggml-org:master from nopperl:chameleon on Sep 28, 2024

Conversation

@nopperl (Contributor) commented Jul 17, 2024

This PR adds support for the Chameleon model. For now, the implementation only supports text->text inference and serves as a base for the (more interesting) image->text, text->image, and interleaved pipelines. However, those will probably require changes to the CLI and internal architecture, so I suggest doing them in a separate PR.

Chameleon is based on the Llama-2 architecture with the following changes:

  • different (pre-)tokenizer
  • qk-norm
  • swin-norm
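For illustration, qk-norm means applying an RMS normalization to the query and key projections per head before RoPE. The following is a minimal NumPy sketch under assumed shapes and epsilon; it is not the exact llama.cpp implementation, and `rms_norm`/`head_dim` are illustrative names:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-5):
    # Normalize over the last (head) dimension, then scale by a learned weight.
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps) * weight

head_dim = 4
q = np.random.randn(2, head_dim)           # (n_tokens, head_dim) for one head
q_normed = rms_norm(q, np.ones(head_dim))  # with unit weights, each row has ~unit RMS
```

The point of qk-norm is to keep attention scores numerically stable by bounding the magnitude of each query/key vector before the dot product.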

Note 1: to enable text->text inference, the image token logits are suppressed, similar to the HF implementation. This needs to be removed when support for images is added.
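Suppressing token logits amounts to masking them to negative infinity before sampling, so those tokens get zero probability after softmax. A minimal sketch (the function name and token ids are illustrative, not taken from the PR):

```python
import numpy as np

def suppress_tokens(logits, token_ids):
    # -inf logits become exactly zero probability after softmax.
    masked = logits.copy()
    masked[token_ids] = -np.inf
    return masked

logits = np.array([1.0, 2.0, 3.0, 4.0])
masked = suppress_tokens(logits, [2, 3])  # pretend ids 2 and 3 are image tokens
probs = np.exp(masked - masked.max())
probs /= probs.sum()
```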

Note 2: I implemented swin-norm, but I haven't tested it yet, since it is only used by Chameleon-30B.
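As I understand it, swin-norm moves the normalization from the sublayer input to the sublayer output (res-post-norm, as in Swin Transformer), versus the pre-norm used by Llama. A rough sketch of the difference, with an illustrative `block` helper and a linear stand-in for the attention/FFN sublayer:

```python
import numpy as np

def rms_norm(x, eps=1e-5):
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

def block(x, sublayer, swin_norm):
    if swin_norm:
        # Swin-norm: run the sublayer first, normalize its output, then add residual.
        return x + rms_norm(sublayer(x))
    # Pre-norm (Llama default): normalize the input to the sublayer.
    return x + sublayer(rms_norm(x))

x = np.array([[1.0, 2.0, 3.0, 4.0]])
sub = lambda v: v * 2.0  # stand-in for the attention/FFN sublayer
pre = block(x, sub, swin_norm=False)
swin = block(x, sub, swin_norm=True)
```

Even with this trivial sublayer the two orderings produce different activations, which is why the flag has to be stored in the GGUF metadata rather than inferred.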

To test it:

```sh
git clone https://huggingface.co/facebook/chameleon-7b
./convert-hf-to-gguf.py chameleon-7b
build/bin/llama-cli -m chameleon-7b/ggml-model-f16.gguf --temp 0.8 -s 1000 -n 50 -p "Language modeling is " -ngl 33
```

Output:

Language modeling is “the task of predicting the next word in a sequence of text, given the previous words.”

To implement a language model, we can use a neural network with a bidirectional LSTM layer and a softmax output layer.

Reference (requires transformers>=4.43.0.dev0):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

set_seed(1000)
model = AutoModelForCausalLM.from_pretrained("facebook/chameleon-7b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("facebook/chameleon-7b")
prompt = "Language modeling is "
inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=40)
print(tokenizer.decode(out[0]))
```

Reference output:

Language modeling is “the task of predicting the next word in a sequence of text given the previous words.”

In other words, it's a machine learning model that takes a sequence of text as input

Partially addresses #7995.

@github-actions github-actions Bot added the python python script changes label Jul 17, 2024
@nopperl (Contributor, Author) commented Jul 17, 2024

I have uploaded GGUFs for testing this PR here.

@mofosyne mofosyne added the Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level label Jul 19, 2024
Comment thread gguf-py/gguf/gguf_writer.py Outdated
Comment thread src/llama.cpp Outdated
Comment thread src/llama.cpp Outdated
Comment thread src/llama.cpp Outdated
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Aug 15, 2024
Co-Authored-By: nopperl <54780682+nopperl@users.noreply.github.com>
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Aug 16, 2024
Co-Authored-By: nopperl <54780682+nopperl@users.noreply.github.com>
@nate-lrt commented

will this ever get added :(

@nopperl (Contributor, Author) commented Sep 26, 2024

I think it would still be a good addition. I've resolved all conflicts with master now, so it should be ready to merge.

@ggerganov ggerganov merged commit 9a91311 into ggml-org:master Sep 28, 2024
@arch-btw (Contributor) commented

Thank you @nopperl looks like it got merged!

dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
* convert chameleon hf to gguf

* add chameleon tokenizer tests

* fix lint

* implement chameleon graph

* add swin norm param

* return qk norm weights and biases to original format

* implement swin norm

* suppress image token output

* rem tabs

* add comment to conversion

* fix ci

* check for k norm separately

* adapt to new lora implementation

* fix layer input for swin norm

* move swin_norm in gguf writer

* add comment regarding special token regex in chameleon pre-tokenizer

* Update src/llama.cpp

Co-authored-by: compilade <git@compilade.net>

* fix punctuation regex in chameleon pre-tokenizer (@compilade)

Co-authored-by: compilade <git@compilade.net>

* fix lint

* trigger ci

---------

Co-authored-by: compilade <git@compilade.net>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
@MasterScrat commented

@nopperl any plans to tackle image->text and text->image?

@nopperl (Contributor, Author) commented Dec 19, 2024

@MasterScrat currently no plans, sorry for the late reply. AFAIK multimodal support would require a refactor of llama.cpp (#8010 (comment)). I'd love to work on it, but don't have the time right now.

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026