add wait() to make code stable #5895

Merged
NeoZhangJianyu merged 1 commit into ggml-org:master from NeoZhangJianyu:fix_ci_unstable
Mar 6, 2024
@NeoZhangJianyu NeoZhangJianyu commented Mar 6, 2024

  1. Add wait() to make the code stable.
  2. Use fp32 in the oneMKL gemm_batch call for better performance.
  3. Add a debug function.
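
As an illustration of item 1 only, and not the code changed in this PR: the sketch below uses plain standard C++ (`std::async` and `std::future`) as a stand-in for an asynchronous device queue, since the actual SYCL calls touched by the commit are not shown in this conversation. The names `submit_fill` and `fill_and_sum` are hypothetical. The point is that the result buffer is read only after `wait()` returns; reading it earlier would race with the work still in flight, which is the kind of intermittent failure an explicit wait is meant to remove.

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an asynchronous kernel launch: fills `out` with
// 1.0f on another thread and returns a handle the caller can wait on.
std::future<void> submit_fill(std::vector<float>& out) {
    return std::async(std::launch::async, [&out] {
        for (float& v : out) {
            v = 1.0f;
        }
    });
}

// Submits the work, waits for completion, and only then reads the buffer.
float fill_and_sum(std::size_t n) {
    std::vector<float> buf(n, 0.0f);
    std::future<void> done = submit_fill(buf);
    done.wait();  // without this wait, the read below would race with the writer
    return std::accumulate(buf.begin(), buf.end(), 0.0f);
}
```

Calling `fill_and_sum(1000)` sums a buffer that is guaranteed to be fully written. The cost is that the caller blocks until the work finishes, which is why a change like this is usually paired with a performance check.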

Current performance reference:

GPU: 1 Arc 770
OS: ubuntu 22.04
Param: -mg 0 -sm none
Model: llama-2-7b.Q4_0.gguf

Avg: 30.66 tokens per second

@NeoZhangJianyu NeoZhangJianyu requested a review from airMeng March 6, 2024 03:33
@airMeng airMeng left a comment

Better to leave some performance data for future reference.

@NeoZhangJianyu

> Better to leave some performance data for future reference.

Yes, I updated it in the first comment.

@airMeng airMeng commented Mar 6, 2024

> > Better to leave some performance data for future reference.
>
> Yes, I updated it in the first comment.

Comparisons before and after?

@NeoZhangJianyu NeoZhangJianyu merged commit 8ced9f7 into ggml-org:master Mar 6, 2024
hazelnutcloud pushed a commit to hazelnutcloud/llama.cpp that referenced this pull request Mar 10, 2024
NeoZhangJianyu added a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026