Fix perplexity computation, MQA/GQA models & models requiring position_ids #129

Open
fxmarty wants to merge 3 commits into main from fix-perplexity

Conversation


@fxmarty fxmarty commented Apr 10, 2024

As per title


fxmarty commented Apr 10, 2024

@Giuseppe5, with this change I get better eval results than Brevitas' `model_eval`, running `CUDA_VISIBLE_DEVICES=3 python quantize_llm.py --fuse-sequences` (for Brevitas, simply using `ppl = model_eval(model, validation_dataset, args.seqlen)`):

Computing perplexity...: 100%|████████| 128/128 [00:02<00:00, 52.39it/s]
Perplexity (original model): 68.6707534790039
100%|████████| 128/128 [00:01<00:00, 72.13it/s]
brevitas ppl: tensor(80.3409, device='cuda:0')

This makes sense, as optimum-amd enforces a minimum context length during evaluation.
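For context, a minimal sketch of the sequence-fusing idea behind `--fuse-sequences`: concatenate all tokenized samples and split the result into fixed-length windows, so no evaluated window is shorter than the target context length, then compute perplexity as the exponential of the token-weighted mean negative log-likelihood. Function names here are hypothetical illustrations, not the actual optimum-amd or Brevitas implementation.

```python
import math

def fuse_and_chunk(token_ids_per_sample, seqlen):
    """Concatenate tokenized samples and split into fixed-length chunks.

    Every returned chunk has exactly `seqlen` tokens, so short samples
    never yield short evaluation windows; the trailing remainder is dropped.
    (Hypothetical helper, for illustration only.)
    """
    fused = [tok for sample in token_ids_per_sample for tok in sample]
    n_chunks = len(fused) // seqlen
    return [fused[i * seqlen:(i + 1) * seqlen] for i in range(n_chunks)]

def perplexity(mean_nlls, token_counts):
    """Perplexity as exp of the token-weighted mean negative log-likelihood.

    `mean_nlls[i]` is the mean NLL over chunk i, `token_counts[i]` the
    number of scored tokens in that chunk.
    """
    total_nll = sum(nll * count for nll, count in zip(mean_nlls, token_counts))
    return math.exp(total_nll / sum(token_counts))
```

Without fusing, a sample shorter than `seqlen` would be scored with less context, inflating its NLL and hence the reported perplexity, which matches the gap observed above.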

@fxmarty fxmarty requested a review from Giuseppe5 April 10, 2024 12:31
@fxmarty fxmarty changed the title Fix perplexity computation Fix perplexity computation, MQA/GQA models & models requiring position_ids Apr 10, 2024
