ci: bench: add more ftype, fix triggers and bot comment by phymbert · Pull Request #6466 · ggml-org/llama.cpp

phymbert · 2024-04-03T19:53:18Z

Motivation

The benchmark bot comment appears on every PR, even those unrelated to speed, which is a little distracting.
Benchmark results vary between runs, probably because no fixed seed is set.
Add more model ftypes.
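
To illustrate why a missing seed can make runs diverge, here is a minimal Python sketch. It is not the k6 script.js from this PR: `pick_prompts` and the prompt list are hypothetical, and whether the PR's seed is applied to prompt selection or forwarded to the server is not stated here. The sketch only shows that the same seed yields the same sample on every run.

```python
import random


def pick_prompts(prompts, n, seed=None):
    """Sample n prompts with a dedicated RNG (hypothetical helper).

    With seed=None the sample changes from run to run; with a fixed
    seed the same prompts are selected every time.
    """
    rng = random.Random(seed)  # isolated generator, global state untouched
    return rng.sample(prompts, n)


# Hypothetical dataset standing in for the benchmark prompts.
prompts = [f"prompt-{i}" for i in range(100)]

run_a = pick_prompts(prompts, 5, seed=42)
run_b = pick_prompts(prompts, 5, seed=42)
assert run_a == run_b  # identical workload across runs
```

With identical inputs across runs, remaining variance in the metrics is more likely to come from the code under test or the runner than from the workload itself.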

Proposition

Restrict the workflow path trigger to llama.cpp, ggml.c and the CUDA files.
Add a seed parameter to the k6 script.js.
Reduce the comment to its first line so it takes up less space in the PR, and add a warning notice.
Add q8_0 and f16 phi-2 quants.
Add more metrics to the commit status so that performance history can be shown later.
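
The effect of the narrower path trigger can be sketched in Python as follows. This is an illustration only, not the workflow file: the pattern list is an assumption (the exact CUDA file patterns are not given here), `should_run_bench` is a hypothetical helper, and `fnmatchcase` globbing is close to but not the same as GitHub Actions path-filter matching.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns standing in for "llama.cpp, ggml.c and cuda files".
BENCH_PATHS = ["llama.cpp", "ggml.c", "*.cu", "*.cuh"]


def should_run_bench(changed_files, patterns=BENCH_PATHS):
    """Return True if any changed file matches a benchmark-relevant pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


# A docs-only PR would not start the benchmark job under these patterns.
docs_only = should_run_bench(["README.md", "docs/build.md"])
# A PR touching ggml.c would.
core_change = should_run_bench(["README.md", "ggml.c"])
```

Under these assumed patterns, `docs_only` is `False` and `core_change` is `True`, which is the behaviour the first item in the list aims for: unrelated PRs no longer start the benchmark or receive its comment.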

Tests

Tested here on a self-hosted GCP L4 (sic):

References

server: bench: continuous performance testing #6233 (comment)

@ggerganov please restart the GitHub runner manager on the Azure T4 node with the changes from ci#3 ("ci: start-github-runner-manager.sh add model download phi f16").


phymbert · 2024-04-03T19:55:29Z

@ggerganov Georgi, as more and more models are MoE-based, I suggest adding mixtral8x7b later on. Thoughts?

ggerganov

Yes, we can continue to scale with more models in the future. For now, let's give it some more time to see how the existing benchmarks perform and gather feedback about what's useful.

phymbert · 2024-04-04T09:49:21Z

Understood, please merge once you have restarted the GitHub runner manager.

ggerganov · 2024-04-04T09:57:17Z

Updated to master of https://github.com/ggml-org/ci and restarted

* ci: bench: change trigger path to not spawn on each PR
* ci: bench: add more file type for phi-2: q8_0 and f16.
  - do not show the comment by default
* ci: bench: add seed parameter in k6 script
* ci: bench: artefact name perf job
* Add iteration in the commit status, reduce again the autocomment
* ci: bench: add per slot metric in the commit status
* Fix trailing spaces

phymbert added 6 commits April 3, 2024 18:42

ci: bench: change trigger path to not spawn on each PR

0fb7bfa

ci: bench: add more file type for phi-2: q8_0 and f16.

22597a4

- do not show the comment by default

ci: bench: add seed parameter in k6 script

b996b00

ci: bench: artefact name perf job

a380b95

Add iteration in the commit status, reduce again the autocomment

04e1ce3

ci: bench: add per slot metric in the commit status

64c7534

phymbert added the performance (Speed related topics) and server/webui labels Apr 3, 2024

phymbert requested a review from ggerganov April 3, 2024 19:53

Fix trailing spaces

8685c2c


ggerganov approved these changes Apr 4, 2024


ggerganov merged commit 7a2c926 into ggml-org:master Apr 4, 2024

phymbert deleted the hp/server/bench/add-quants branch April 4, 2024 09:58

ggerganov merged 7 commits into ggml-org:master from phymbert:hp/server/bench/add-quants
